Reversing ESP8266 Firmware (Part 3)
What is it?
So, what is the ESP8266? Wikipedia describes it as follows:
The ESP8266 is a low-cost Wi-Fi microchip with full TCP/IP stack and microcontroller capability produced by Shanghai-based Chinese manufacturer, Espressif Systems.
Moreover, Wikipedia alludes to the processor specifics:
Processor: L106 32-bit RISC microprocessor core based on the Tensilica Xtensa Diamond Standard 106Micro running at 80 MHz
At present, my version of IDA does not recognise this processor, but looking up “IDA Xtensa” unveils a processor module to support the instruction set, which is described as follows:
This is a processor plugin for IDA, to support the Xtensa core found in Espressif ESP8266.
With the above information, we’ve also answered our second question: “What is the processor?”
Understanding the firmware format
Now that IDA can understand the processor’s instruction set, it’s time to learn how firmware images are composed in terms of format, data and code. Indeed, what is the format of our firmware image? To help answer this question, my first port of call was to analyse the existing open source tools published by Espressif for working with the ESP8266.
This leads us to ESPTool, an application written in Python capable of displaying some information about binary firmware images, amongst other things.
The manual for this tool also gives away some important information:
The elf2image command converts an ELF file (from compiler/linker output) into the binary executable images which can be flashed and then booted into.
From this, we can determine that compiled images, prior to their transformation into firmware, exist in the ELF-32 Xtensa format. This will be useful later on.
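If you ever come across such a pre-conversion ELF file, a quick sanity check is possible from the first few bytes. The following is a minimal sketch, assuming a little-endian 32-bit file; the function name is hypothetical, and the machine value 94 is the ELF `e_machine` identifier for Tensilica Xtensa.

```python
import struct

EM_XTENSA = 94  # ELF e_machine value for Tensilica Xtensa


def is_xtensa_elf32(header: bytes) -> bool:
    """Return True if the bytes start a little-endian 32-bit Xtensa ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return False
    ei_class, ei_data = header[4], header[5]
    if ei_class != 1 or ei_data != 1:  # 1 = ELFCLASS32, 1 = ELFDATA2LSB
        return False
    # e_type and e_machine sit directly after the 16-byte e_ident array.
    _e_type, e_machine = struct.unpack_from("<HH", header, 16)
    return e_machine == EM_XTENSA
```

Passing it the first 20 bytes of a file is enough, since only the identification bytes and the machine field are inspected.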
Moving back to the other features of ESPTool, we see it’s indeed able to present information about our firmware image:
    josh@ioteeth:/tmp/reversing$ ~/esptool/esptool.py image_info recovered_file
    esptool.py v2.4.0-dev
    Image version: 1
    Entry point: 4010f29c
    1 segments

    Segment 1: len 0x00568 load 0x4010f000 file_offs 0x00000008
    Checksum: 2d (valid)
Clearly, this application understands the format of an image, so let’s take it apart and see how it works.
Browsing through the code, we come across:
    # Memory addresses
    IROM_MAP_START = 0x40200000
    IROM_MAP_END = 0x40300000

    [...]

    class ESPFirmwareImage(BaseFirmwareImage):
        """ 'Version 1' firmware image, segments loaded directly by the ROM bootloader. """

        ROM_LOADER = ESP8266ROM

        def __init__(self, load_file=None):
            super(ESPFirmwareImage, self).__init__()
            self.flash_mode = 0
            self.flash_size_freq = 0
            self.version = 1

            if load_file is not None:
                segments = self.load_common_header(load_file, ESPLoader.ESP_IMAGE_MAGIC)

                for _ in range(segments):
                    self.load_segment(load_file)
                self.checksum = self.read_checksum(load_file)

    [...]

    class BaseFirmwareImage(object):
        SEG_HEADER_LEN = 8

        """ Base class with common firmware image functions """
        def __init__(self):
            self.segments = []
            self.entrypoint = 0

        def load_common_header(self, load_file, expected_magic):
            (magic, segments, self.flash_mode, self.flash_size_freq, self.entrypoint) = struct.unpack('<BBBBI', load_file.read(8))

            if magic != expected_magic or segments > 16:
                raise FatalError('Invalid firmware image magic=%d segments=%d' % (magic, segments))
            return segments

        def load_segment(self, f, is_irom_segment=False):
            """ Load the next segment from the image file """
            file_offs = f.tell()
            (offset, size) = struct.unpack('<II', f.read(8))
            self.warn_if_unusual_segment(offset, size, is_irom_segment)
            segment_data = f.read(size)
            if len(segment_data) < size:
                raise FatalError('End of file reading segment 0x%x, length %d (actual length %d)' % (offset, size, len(segment_data)))
            segment = ImageSegment(offset, segment_data, file_offs)
            self.segments.append(segment)
            return segment
All of the above code is notable. It allows us to discern the structure of the firmware image.
The function load_common_header() details the following format:
(magic, segments, self.flash_mode, self.flash_size_freq, self.entrypoint) = struct.unpack('<BBBBI', load_file.read(8))
Which represented as a structure would look like this:
    typedef struct {
        uint8 magic;
        uint8 sect_count;
        uint8 flash_mode;
        uint8 flash_size_freq;
        uint32 entry_addr;
    } rom_header;
We can see from the function load_segment() that following our image header are the image segment headers, followed immediately by the segment data itself, for each segment.
The following code parses a segment header:
(offset, size) = struct.unpack('<II', f.read(8))
Which again, represented as a structure would be as follows:
    typedef struct {
        uint32 seg_addr;
        uint32 seg_size;
    } segment_header;
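Putting the two structures together, a stand-alone parser needs very little code. The following is a minimal sketch rather than a replacement for ESPTool: `parse_image` and its `base` parameter are hypothetical names, the `struct` format strings are the ones ESPTool uses above, and checksums are ignored.

```python
import io
import struct

ESP_IMAGE_MAGIC = 0xE9  # value of ESPTool's ESPLoader.ESP_IMAGE_MAGIC


def parse_image(data: bytes, base: int = 0):
    """Parse the 8-byte image header at `base` and every segment header after it.

    Returns (entry_addr, [(seg_addr, seg_size, file_offs), ...]).
    """
    f = io.BytesIO(data)
    f.seek(base)
    magic, seg_count, _flash_mode, _flash_size_freq, entry_addr = struct.unpack(
        "<BBBBI", f.read(8)
    )
    if magic != ESP_IMAGE_MAGIC or seg_count > 16:
        raise ValueError("not a version 1 ESP8266 image")

    segments = []
    for _ in range(seg_count):
        file_offs = f.tell()  # offset of this segment's header
        seg_addr, seg_size = struct.unpack("<II", f.read(8))
        payload = f.read(seg_size)  # segment data follows its header directly
        if len(payload) < seg_size:
            raise ValueError("segment runs past end of file")
        segments.append((seg_addr, seg_size, file_offs))
    return entry_addr, segments
```

The `base` argument exists so that the same function can be pointed at an image that does not start at offset 0 of the dump.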
This is helpful: we now know both the format of the firmware image and a number of the tools available to process such images. Note that we haven’t considered elements such as checksums, but these aren’t important to us as we don’t intend to patch the firmware image.
As a brief tangent: in this case the format is documented and tools exist to parse it, but often this is not so. In such cases, I’d advise obtaining as many firmware images as you can from your target devices. A good starting point is then to find commonalities between them, which can indicate what certain bytes mean within the format. It is also useful to understand how an image is booted, as the bootloader may act differently depending on certain values at fixed offsets.
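One cheap way to look for such commonalities is to compare the first few dozen bytes of every image and note which offsets never change. The sketch below assumes the images are already loaded as `bytes`; `constant_offsets` and its `limit` parameter are hypothetical names chosen for illustration.

```python
def constant_offsets(images, limit=64):
    """Return {offset: byte} for each offset where all images hold the same byte.

    Only the first `limit` bytes (or the shortest image) are compared.
    """
    if len(images) < 2:
        raise ValueError("need at least two images to compare")
    span = min(limit, min(len(img) for img in images))
    common = {}
    for offset in range(span):
        values = {img[offset] for img in images}
        if len(values) == 1:  # identical across every image
            common[offset] = values.pop()
    return common
```

Offsets that stay constant are candidates for magic values or fixed flags, while offsets that vary with image size or content hint at lengths, addresses or checksums.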
Understanding the boot process
So, onto our next question: what is the boot process of the device? Understanding this is important as it will help to clarify our understanding of the image. Richard A Burton has very helpfully reverse engineered the boot loader and described the following key point:
It finds the flash address of the rom to boot. Rom 1 is always at 0x1000 (next sector after boot loader). Rom 2 is half the chip size + 0x1000 (unless the chip is above a 1mb when it’s position it kept down to to 0x81000).
Checking the 0x1000 offset within our firmware image, there is indeed a second image, as denoted by the presence of the image magic signature (0xE9):
    josh@ioteeth:/tmp/reversing$ hexdump -s 0x1000 -v -C recovered_file | head
    00001000  e9 04 00 00 30 64 10 40  10 10 20 40 c0 ed 03 00  |....0d.@.. @....|
    00001010  43 03 ab 83 1c 00 00 60  00 00 00 60 1c 0f 00 60  |C......`...`...`|
    00001020  00 0f 00 60 41 fc ff 20  20 74 c0 20 00 32 24 00  |...`A..  t. .2$.|
    00001030  30 30 75 56 33 ff 31 f8  ff 66 92 08 42 a0 0d c0  |00uV3.1..f..B...|
    00001040  20 00 42 63 00 51 f5 ff  c0 20 00 29 03 42 a0 7d  | .Bc.Q... .).B.}|
    00001050  c0 20 00 38 05 30 30 75  37 34 f4 31 f1 ff 66 92  |. .8.00u74.1..f.|
    00001060  06 0c d4 c0 20 00 49 03  c0 20 00 29 03 0d f0 00  |.... .I.. .)....|
    00001070  b0 ff ff 3f 24 10 20 40  00 ed fe 3f 80 6e 10 40  |...?$. @...?.n.@|
    00001080  04 ed fe 3f 79 6e 10 40  fc ec fe 3f f8 ec fe 3f  |...?yn.@...?...?|
    00001090  6b 6e 10 40 61 6e 10 40  f6 ec fe 3f 52 6e 10 40  |kn.@an.@...?Rn.@|
This second firmware image sits almost immediately after the padding bytes we observed earlier. Based on the format, we can see from the second byte (0x04) that this ROM has 4 segments. It is likely to be user or custom ROM code, with the first ROM image potentially being the bootloader of the device, responsible for bootstrapping.
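The offset rule quoted above and the magic-byte check are easy to automate. This is a rough sketch under two assumptions: the flash size is known (the length of a dump may not match the chip size), and the presence of 0xE9 alone is treated as enough to flag a candidate image. The function names are hypothetical.

```python
SECTOR_SIZE = 0x1000
ESP_IMAGE_MAGIC = 0xE9


def rom_offsets(flash_size: int):
    """Flash offsets of ROM 1 and ROM 2, following the rule quoted above."""
    rom1 = SECTOR_SIZE  # always the sector after the boot loader
    if flash_size > 0x100000:  # chips above 1 MB: ROM 2 is kept at 0x81000
        rom2 = 0x81000
    else:
        rom2 = flash_size // 2 + SECTOR_SIZE
    return rom1, rom2


def image_at(data: bytes, offset: int):
    """Return the segment count if an image header starts at `offset`, else None."""
    if offset + 2 > len(data) or data[offset] != ESP_IMAGE_MAGIC:
        return None
    return data[offset + 1]  # second header byte is the segment count
```

Calling `image_at` on each offset returned by `rom_offsets` gives a quick indication of which ROM slots are populated before any deeper parsing.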
Whilst there are a lot of nuances to the boot process, the above is all we really need to be aware of at this time.
Understanding the physical memory layout
Next, understanding the physical memory layout will help us to differentiate between data and code segments, assuming consistency between images. Whilst not entirely accurate, the physical memory layout of the ESP8266 has been documented.
From the information within, we can conclude the following:
- 0x40100000 – Instruction RAM. Used by bootloader to load SPI Flash <40000h.
- 0x3FFE8000 – User data RAM. Available to applications.
- 0x3FFFFFFF – Anything below this address appears to be data, not code
- 0x40100000 – Anything above this address appears to be code, not data
Anything that doesn’t match an address exactly, we’ll mark as unknown and classify as either code or data based on the rules above.
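These rules can be written down as a small lookup. The sketch below is an illustration only: the segment names `.iram` and `.dram` are hypothetical labels, the boundary addresses themselves are treated as inclusive, and addresses that fall between the two boundaries are left unclassified because the rules above do not cover them.

```python
# Exact load addresses we recognise; the names are hypothetical labels.
KNOWN_SEGMENTS = {
    0x40100000: ".iram",  # instruction RAM
    0x3FFE8000: ".dram",  # user data RAM
}

DATA_LIMIT = 0x3FFFFFFF  # at or below this address: treat as data
CODE_START = 0x40100000  # at or above this address: treat as code


def classify(addr: int):
    """Return (name, kind) for a segment load address.

    kind is "CODE", "DATA", or None when neither rule applies.
    """
    name = KNOWN_SEGMENTS.get(addr, "unknown")
    if addr <= DATA_LIMIT:
        kind = "DATA"
    elif addr >= CODE_START:
        kind = "CODE"
    else:
        kind = None
    return name, kind
```

For instance, the load address 0x4010f000 reported by ESPTool earlier does not match a known address exactly, so it would be named `unknown` but still classified as code.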
It should be noted that simply loading the file as ‘binary’ within IDA, having set the appropriate processor, yields only a limited understanding and doesn’t display any xrefs to strings that could guide our efforts.
With this in mind, we can write a simple loader for IDA to identify the firmware image and load the segments accordingly, which should yield better results. We’ll use the memory map above as a guide to name the segments and mark them as code or data accordingly.