Reversing ESP8266 Firmware (Part 6)
At this point we’re ready to reverse the ESP8266 firmware itself to understand its functionality. Specifically, we’d like to understand what the loop function does, since it is the main entry point once the device has booted.
Reversing the loop function
I’ve analysed and commented the assembly below to detail guessed ports, functions and hostnames:
.code_seg_0:402073D0 loop: ; CODE XREF: .code_seg_0:loc_40209F9Dp
.code_seg_0:402073D0 addi a1, a1, 0xA0
.code_seg_0:402073D3 s32i a15, a1, 0x4C
.code_seg_0:402073D6 l32r a15, p_5000
.code_seg_0:402073D9 s32i a0, a1, 0x5C
.code_seg_0:402073DC mov.n a2, a15 ; wait 5 seconds
.code_seg_0:402073DE s32i a12, a1, 0x58
.code_seg_0:402073E1 s32i a13, a1, 0x54
.code_seg_0:402073E4 s32i a14, a1, 0x50
.code_seg_0:402073E7 call0 delay
.code_seg_0:402073EA l32r a2, dword_40207390
.code_seg_0:402073ED l32r a14, dword_402072B8
.code_seg_0:402073F0 l32i.n a3, a2, 0
.code_seg_0:402073F2 mov.n a13, a14
.code_seg_0:402073F4 addi.n a3, a3, 1
.code_seg_0:402073F6 s32i.n a3, a2, 0
.code_seg_0:402073F8 l32r a3, off_402072C0
.code_seg_0:402073FB mov.n a2, a14
.code_seg_0:402073FD call0 _ZN5Print5printEPKc ; Print::print(char const*)
.code_seg_0:40207400 l32r a12, hostname
.code_seg_0:40207403 mov.n a2, a14
.code_seg_0:40207405 l32i.n a3, a12, 0
.code_seg_0:40207407 call0 _ZN5Print7printlnEPKc ; Print::println(char const*)
.code_seg_0:4020740A mov.n a2, a1
.code_seg_0:4020740C call0 sub_40208454
.code_seg_0:4020740F l32i.n a3, a12, 0
.code_seg_0:40207411 movi a4, 1337 ; connect to port 1337
.code_seg_0:40207414 mov.n a2, a1
.code_seg_0:40207416 call0 guessed_connect
.code_seg_0:40207419 movi a2, 1000
.code_seg_0:4020741C call0 delay ; wait 1000ms before next connection attempt
.code_seg_0:4020741F l32i.n a3, a12, 0
.code_seg_0:40207421 l32r a4, port_8000
.code_seg_0:40207424 mov.n a2, a1
.code_seg_0:40207426 call0 guessed_connect ; connect to port 8000
.code_seg_0:40207429 movi a2, 1000
.code_seg_0:4020742C call0 delay
.code_seg_0:4020742F l32i.n a3, a12, 0
.code_seg_0:40207431 l32r a4, port_3306
.code_seg_0:40207434 mov.n a2, a1
.code_seg_0:40207436 call0 guessed_connect ; connect to port 3306
.code_seg_0:40207439 movi a2, 1000
.code_seg_0:4020743C call0 delay
.code_seg_0:4020743F l32i.n a3, a12, 0
.code_seg_0:40207441 l32r a4, port_4545
.code_seg_0:40207444 mov.n a2, a1
.code_seg_0:40207446 call0 guessed_connect ; connect to port 4545
.code_seg_0:40207449 movi a2, 1000
.code_seg_0:4020744C call0 delay
.code_seg_0:4020744F l32i.n a3, a12, 0
.code_seg_0:40207451 mov.n a2, a1
.code_seg_0:40207453 movi a4, 445 ; our final port!
.code_seg_0:40207456 call0 guessed_connect
.code_seg_0:40207459 bnez.n a2, loc_40207469
From the above, we’ve determined that:
.code_seg_0:40207400 l32r a12, hostname
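is loading a pointer to the hostname variable into the a12 register. Before each subsequent call0 instruction, the hostname is read through a12 into a3 and what looks like a port number is loaded into a4, while a2 is set to a1, which is probably a stack-allocated client object. Under the call0 convention, arguments are passed in registers starting at a2, so this pattern of object, hostname and port led me to guess that the callee is likely our connect() function.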
From this analysis, we’ve determined our port knocking sequence to be as follows:
- 1337
- 8000
- 3306
- 4545
The application then makes its final connection, predictably, on port 445.
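Read as high-level logic, the annotated listing and the sequence above amount to roughly the following. This is only a Python model of the control flow, not the original Arduino source: the names `firmware_loop`, `connect` and `delay_ms` are hypothetical, the counter increment and the two Print calls are left out, and what happens on either side of the final `bnez.n a2` branch is not visible in the listing, so the result of the last connect is simply returned.

```python
from typing import Callable, Tuple

# Constants recovered from the commented disassembly.
KNOCK_PORTS: Tuple[int, ...] = (1337, 8000, 3306, 4545)
FINAL_PORT = 445


def firmware_loop(
    host: str,
    connect: Callable[[str, int], bool],  # stand-in for guessed_connect (hypothetical)
    delay_ms: Callable[[int], None],      # stand-in for delay (hypothetical)
) -> bool:
    """Model one pass of the reversed loop function."""
    delay_ms(5000)  # mov.n a2, a15 / call0 delay: wait 5 seconds

    for port in KNOCK_PORTS:
        # a2 = client object, a3 = hostname, a4 = port; the return value
        # of these four calls is never tested in the listing.
        connect(host, port)
        delay_ms(1000)

    # Only the final connect is checked (bnez.n a2, loc_40207469).
    return bool(connect(host, FINAL_PORT))
```

Passing in `connect` and `delay_ms` keeps the model free of real network or timing side effects, so the ordering can be checked in isolation.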
With that, we’ve effectively solved the challenge! All that’s left is to get the secrets!
Getting the secrets!
In order to obtain the secrets, we need to knock on the now-known ports in the correct order. We can do this in various ways, using nmap or even netcat, but I prefer the knock binary, as it’s purpose-built and ships as part of the knockd package.
josh@ioteeth:/tmp$ knock -v 1337 8000 3306 4545
hitting tcp
hitting tcp
hitting tcp
hitting tcp
josh@ioteeth:/tmp$ nmap -n -PN -F -v -oN
Warning: The -PN option is deprecated. Please use -Pn
Starting Nmap 7.40 ( ) at 2018-05-25 11:55 BST
Initiating Connect Scan at 11:55
Scanning ( [100 ports]
Discovered open port 445/tcp on
Discovered open port 22/tcp on
Completed Connect Scan at 11:55, 0.00s elapsed (100 total ports)
Nmap scan report for (
Host is up (0.0012s latency).
Not shown: 98 closed ports
PORT STATE SERVICE
22/tcp open ssh
445/tcp open microsoft-ds
Read data files from: /usr/bin/../share/nmap
Nmap done: 1 IP address (1 host up) scanned in 0.03 seconds
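If knock isn't to hand, the same sequence can be approximated with a few lines of Python. This is a minimal sketch that assumes TCP knocks, as in the output above; the function name `knock`, the timeout and the pause between knocks are illustrative choices rather than anything taken from the knock client itself.

```python
import socket
import time
from typing import Iterable, List, Tuple


def knock(
    host: str,
    ports: Iterable[int],
    timeout: float = 0.5,
    pause: float = 0.2,
) -> List[Tuple[int, bool]]:
    """Attempt one TCP connection per port, in order.

    Returns (port, connected) pairs. A failed connection is expected for
    a knock port: the listener only needs to see the attempt arrive.
    """
    results = []
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results.append((port, True))
        except OSError:
            # Refused or timed out: the port is closed or filtered.
            results.append((port, False))
        time.sleep(pause)
    return results


# Example (placeholder address, substitute the target host):
#   knock("192.0.2.10", [1337, 8000, 3306, 4545])
```

The knocks must arrive in order, which is why the ports are tried sequentially rather than in parallel.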
Accessing the service, we receive the following:
josh@ioteeth:/tmp$ curl
Well done! We hope you had fun with this challenge and learned a lot!
flag{esp8266_reversing_is_awes0me}
In this post, we set out to understand how a particular firmware image communicated with external services to apparently obtain secrets. We knew nothing about the firmware initially and wanted to describe a methodology for analysing unknown formats.
Ultimately we’ve taken the following steps:
- Analysed the file using common Linux utilities: file, binwalk, strings and hexdump.
- Made note that our firmware image is based on the ESP8266 and is likely performing a form of port knocking, prior to accessing secrets, based on the strings within.
- Performed research, as well as reversed open source tools, to understand the hardware on which the firmware image runs, its processor, boot process and memory layout, as well as the firmware image format itself.
- Equipped our tools with the appropriate additions to understand the Xtensa processor.
- Written a loader for IDA that’s capable of loading future firmware images of this format.
- Came to understand the format of compiled code prior to being exported as a firmware image.
- Written and compiled our own code for the ESP8266 to obtain debugging symbols.
- Patched and made use of FireEye’s IDB2PAT IDA plugin, to generate FLIRT signatures from our debug build.
- Applied our FLIRT signatures across our target firmware image, to recognise library functions.
- Observed the use of vtables to call library functions and used this to classify other unknown library functions.
- Used references to functions of known and likely libraries to locate the firmware image’s main processing loop.
- Reverse engineered the main loop function to understand our port knocking sequence.
- Made use of the knock client to perform our port knocking and reap all of the secrets!
I’d like to think that this methodology can be applied more generally when analysing unknown binaries or firmware images. In this case we were fortunate that most of the internals had already been documented, so our job was largely to put the pieces together. I’d encourage the reader to look at other firmware images, such as router firmware.
Special thanks to the authors of the following for their insight:
- – ESP8266 Wifi Connect Example
- – Decompiling the boot loader
- – NodeMCU tools
- – used to understand the firmware format
- – describes memory layout and format
- – describes memory layout and format
- – FireEye’s IDB2PAT plugin!
- – segment memory map!
- – Description of the device
- – Xtensa calling convention
- – Xtensa instruction set
- – ESP8266 Wiki page
Tools used:
- IDA 6.8.
- IDA FLAIR Utils 6.8.
- Xtensa IDA processor plugin.
- Linux utils: file, strings and hexdump.
- Binwalk.
- FireEye’s IDB2PAT.
- knock, from the knockd package.
- Nmap.
- cURL.
I’m always keen to hear feedback, be it corrections or comments more generally. Drop me a tweet and feel free to share this post, as well as your own experiences reverse engineering firmware.
In future posts, I’ll be taking apart common, cheap ‘smart’ products such as doorbells and other things I’d like to use at home.