25th October 2018 / Reverse Engineering

Reversing ESP8266 Firmware (Part 6)

At this point we’re actually reversing ESP8266 firmware to understand the functionality, specifically, we’d like to understand what the loop function does, which is the main entry point once booted.

Reversing the loop function

I’ve analysed and commented the assembly below to detail guessed ports, functions and hostnames:

.code_seg_0:402073D0 loop:                                   ; CODE XREF: .code_seg_0:loc_40209F9Dp
.code_seg_0:402073D0                 addi            a1, a1, 0xA0
.code_seg_0:402073D3                 s32i            a15, a1, 0x4C
.code_seg_0:402073D6                 l32r            a15, p_5000
.code_seg_0:402073D9                 s32i            a0, a1, 0x5C
.code_seg_0:402073DC                 mov.n           a2, a15 ; wait 5 seconds
.code_seg_0:402073DE                 s32i            a12, a1, 0x58
.code_seg_0:402073E1                 s32i            a13, a1, 0x54
.code_seg_0:402073E4                 s32i            a14, a1, 0x50
.code_seg_0:402073E7                 call0           delay
.code_seg_0:402073EA                 l32r            a2, dword_40207390
.code_seg_0:402073ED                 l32r            a14, dword_402072B8
.code_seg_0:402073F0                 l32i.n          a3, a2, 0
.code_seg_0:402073F2                 mov.n           a13, a14
.code_seg_0:402073F4                 addi.n          a3, a3, 1
.code_seg_0:402073F6                 s32i.n          a3, a2, 0
.code_seg_0:402073F8                 l32r            a3, off_402072C0
.code_seg_0:402073FB                 mov.n           a2, a14
.code_seg_0:402073FD                 call0           _ZN5Print5printEPKc ; Print::print(char const*)
.code_seg_0:40207400                 l32r            a12, hostname
.code_seg_0:40207403                 mov.n           a2, a14
.code_seg_0:40207405                 l32i.n          a3, a12, 0
.code_seg_0:40207407                 call0           _ZN5Print7printlnEPKc ; Print::println(char const*)
.code_seg_0:4020740A                 mov.n           a2, a1
.code_seg_0:4020740C                 call0           sub_40208454
.code_seg_0:4020740F                 l32i.n          a3, a12, 0
.code_seg_0:40207411                 movi            a4, 1337 ; connect to port 1337
.code_seg_0:40207414                 mov.n           a2, a1
.code_seg_0:40207416                 call0           guessed_connect
.code_seg_0:40207419                 movi            a2, 1000
.code_seg_0:4020741C                 call0           delay   ; wait 1000ms before next connection attempt
.code_seg_0:4020741F                 l32i.n          a3, a12, 0
.code_seg_0:40207421                 l32r            a4, port_8000
.code_seg_0:40207424                 mov.n           a2, a1
.code_seg_0:40207426                 call0           guessed_connect ; connect to port 8000
.code_seg_0:40207429                 movi            a2, 1000
.code_seg_0:4020742C                 call0           delay
.code_seg_0:4020742F                 l32i.n          a3, a12, 0
.code_seg_0:40207431                 l32r            a4, port_3306
.code_seg_0:40207434                 mov.n           a2, a1
.code_seg_0:40207436                 call0           guessed_connect ; connect to port 3306
.code_seg_0:40207439                 movi            a2, 1000
.code_seg_0:4020743C                 call0           delay
.code_seg_0:4020743F                 l32i.n          a3, a12, 0
.code_seg_0:40207441                 l32r            a4, port_4545
.code_seg_0:40207444                 mov.n           a2, a1
.code_seg_0:40207446                 call0           guessed_connect ; connect to port 4545
.code_seg_0:40207449                 movi            a2, 1000
.code_seg_0:4020744C                 call0           delay
.code_seg_0:4020744F                 l32i.n          a3, a12, 0
.code_seg_0:40207451                 mov.n           a2, a1
.code_seg_0:40207453                 movi            a4, 445 ; our final port!
.code_seg_0:40207456                 call0           guessed_connect
.code_seg_0:40207459                 bnez.n          a2, loc_40207469

From the above, we’ve determined that:

.code_seg_0:40207400                 l32r            a12, hostname

Is loading a pointer to the hostname variable into the a12 register. This is followed by loading of what looks like a port number into various other registers, again followed by a call0 instruction. This behaviour led me to guess this is likely our connect() function.

From this analysis, we’ve determined our port knocking sequence to be as follows:

1337
8000
3306
4545

With the application connecting predictably, to services.ioteeth.com on port 445.

With that, we’ve effectively solved the challenge! All that’s left is to get the secrets!

Getting the secrets!

In order to obtain the secrets, we need to knock on the now known ports in the correct order. We can do this in various ways, using nmap or even netcat, but I prefer to use the knock binary, as it’s purpose built (and is part of the knockd package).

josh@ioteeth:/tmp$ knock -v services.ioteeth.com 1337 8000 3306 4545
hitting tcp 192.168.1.69:1337
hitting tcp 192.168.1.69:8000
hitting tcp 192.168.1.69:3306
hitting tcp 192.168.1.69:4545

josh@ioteeth:/tmp$ nmap -n -PN -F -v services.ioteeth.com -oN services.ioteeth.com.out
Warning: The -PN option is deprecated. Please use -Pn

Starting Nmap 7.40 ( https://nmap.org ) at 2018-05-25 11:55 BST
Initiating Connect Scan at 11:55
Scanning services.ioteeth.com (192.168.1.69) [100 ports]
Discovered open port 445/tcp on 192.168.1.69
Discovered open port 22/tcp on 192.168.1.69
Completed Connect Scan at 11:55, 0.00s elapsed (100 total ports)
Nmap scan report for services.ioteeth.com (192.168.1.69)
Host is up (0.0012s latency).
Not shown: 98 closed ports
PORT    STATE SERVICE
22/tcp  open  ssh
445/tcp open  microsoft-ds

Read data files from: /usr/bin/../share/nmap
Nmap done: 1 IP address (1 host up) scanned in 0.03 seconds

Accessing the service, we receive the following:

josh@ioteeth:/tmp$ curl http://services.ioteeth.com:445/get_secrets
Well done! We hope you had fun with this challenge and learned a lot!

flag{esp8266_reversing_is_awes0me}

Conclusion

In this post, we set out to understand how a particular firmware image communicated with external services to apparently obtain secrets. We knew nothing about the firmware initially and wanted to describe a methodology for analysing unknown formats.

Ultimately we’ve taken the following steps:

Analysed the file using common Linux utilities file, binwalk, strings and hexdump
Made note that our firmware image is based on the ESP8266 and is likely performing a form of port knocking, prior to accessing secrets, based on the strings within.
Performed research, as well as reversed open source tools, to understand the hardware on which the firmware image runs, its processor, boot process and the memory layout, as well the firmware image format itself.
Equipped our tools with the appropriate additions to understand the Xtensa processor.
Written a loader for IDA that’s capable of loading future firmware images of this format.
Came to understand the format of compiled code prior to being exported as a firmware image.
Written and compiled our own code for the ESP8266 to obtain debugging symbols.
Patched and made use of FireEye’s IDB2PAT IDA plugin, to generate FLIRT signatures from our debug build.
Applied our FLIRT signatures across our target firmware image, to recognise library functions.
Observed the use of vtable’s to call library functions and used this to classify other unknown library functions.
Used references to functions of known and likely libraries to locate the firmware image’s main processing loop.
Reverse engineered the main loop function to understand our port knocking sequence.
Made use of the knock client to perform our port knocking and reap all of the secrets!

I’d like to think that this methodology can be applied more generally when analysing unknown binaries or firmware images. In this case, we were fortunate in that most of the internals had been documented already and as documented here, our job was to put the pieces together. I’d encourage the reader to look at other firmware images, such as router firmware for example.

References

Special thanks to the author’s of the following for their insight:

https://github.com/esp8266/Arduino/blob/master/libraries/ESP8266WiFi/examples/WiFiClient/WiFiClient.ino – ESP8266 Wifi Connect Example
https://richard.burtons.org/2015/05/17/decompiling-the-esp8266-boot-loader-v1-3b3/ – Decompiling the boot loader
https://github.com/nodemcu/nodemcu-firmware/tree/c8037568571edb5c568c2f8231e4f8ce0683b883 – NodeMCU tools
https://github.com/espressif/esptool – used to understand the firmware format
http://developers-club.com/posts/255135/ – describes memory layout and format
http://developers-club.com/posts/255153/ – describes memory layout and format
https://github.com/fireeye/flare-ida – FireEye’s IDB2PAT plugin!
https://github.com/esp8266/esp8266-wiki/wiki/Memory-Map – segment memory map!
https://en.wikipedia.org/wiki/ESP8266 – Description of the device
http://wiki.linux-xtensa.org/index.php/ABI_Interface – Xtensa calling convention
http://cholla.mmto.org/esp8266/xtensa.html – Xtensa instruction set
https://en.wikipedia.org/wiki/ESP8266 – ESP8266 Wiki page

Tools

IDA 6.8.
IDA FLAIR Utils 6.8.
Xtensa IDA processor plugin.
Linux utils: file, strings and hexdump.
Binwalk.
FireEye’s IDB2PAT.
Knock of the Knockd package.
Nmap.
cURL.

Feedback

I’m always keen to hear feedback, be it corrections or comments more generally. Drop me a tweet and feel free to share this post, as well as your own experiences reverse engineering firmware.

In future posts, I’ll be taking apart common, cheap ‘smart’ products such as doorbells and other things I’d like to use at home.