25th October 2018 / Reverse Engineering

Reversing ESP8266 Firmware (Part 3)

What is it?

So, what is the ESP8266? Wikipedia describes it as follows:

The ESP8266 is a low-cost Wi-Fi microchip with full TCP/IP stack and microcontroller capability produced by Shanghai-based Chinese manufacturer, Espressif Systems.

Moreover, Wikipedia alludes to the processor specifics:

Processor: L106 32-bit RISC microprocessor core based on the Tensilica Xtensa Diamond Standard 106Micro running at 80 MHz

At present, my version of IDA does not recognise this processor, but looking up “IDA Xtensa” unveils a processor module to support the instruction set, which is described as follows:

This is a processor plugin for IDA, to support the Xtensa core found in Espressif ESP8266.

With the above information, we’ve also answered our second question: “What is the processor?”

Understanding the firmware format

Now that IDA can understand the instruction set of the processor, it’s time to learn how firmware images are structured in terms of format, data and code. Indeed, what is the format of our firmware image? To help answer this question, my first port of call was to analyse the existing open source tools published by Espressif for working with the ESP8266.

This leads us to ESPTool, an application written in Python capable of displaying some information about binary firmware images, amongst other things.

The manual for this tool also gives away some important information:

The elf2image command converts an ELF file (from compiler/linker output) into the binary executable images which can be flashed and then booted into.

From this, we can determine that compiled images, prior to their transformation into firmware, exist in the ELF-32 Xtensa format. This will be useful later on.

Moving back to the other features of ESPTool, we see it’s indeed able to present information about our firmware image:

josh@ioteeth:/tmp/reversing$ ~/esptool/esptool.py image_info recovered_file
esptool.py v2.4.0-dev
Image version: 1
Entry point: 4010f29c
1 segments
Segment 1: len 0x00568 load 0x4010f000 file_offs 0x00000008
Checksum: 2d (valid)

Clearly, this application understands the format of an image, so let’s take it apart and see how it works.

Browsing through the code, we come across:

# Memory addresses
IROM_MAP_START = 0x40200000
IROM_MAP_END = 0x40300000

[...]

class ESPFirmwareImage(BaseFirmwareImage):
    """ 'Version 1' firmware image, segments loaded directly by the ROM bootloader. """

    ROM_LOADER = ESP8266ROM

    def __init__(self, load_file=None):
        super(ESPFirmwareImage, self).__init__()
        self.flash_mode = 0
        self.flash_size_freq = 0
        self.version = 1

        if load_file is not None:
            segments = self.load_common_header(load_file, ESPLoader.ESP_IMAGE_MAGIC)

            for _ in range(segments):
                self.load_segment(load_file)
            self.checksum = self.read_checksum(load_file)

[...]

class BaseFirmwareImage(object):
    SEG_HEADER_LEN = 8

    """ Base class with common firmware image functions """
    def __init__(self):
        self.segments = []
        self.entrypoint = 0

    def load_common_header(self, load_file, expected_magic):
            (magic, segments, self.flash_mode, self.flash_size_freq, self.entrypoint) = struct.unpack('<BBBBI', load_file.read(8))

            if magic != expected_magic or segments > 16:
                raise FatalError('Invalid firmware image magic=%d segments=%d' % (magic, segments))
            return segments

    def load_segment(self, f, is_irom_segment=False):
        """ Load the next segment from the image file """
        file_offs = f.tell()
        (offset, size) = struct.unpack('<II', f.read(8))
        self.warn_if_unusual_segment(offset, size, is_irom_segment)
        segment_data = f.read(size)
        if len(segment_data) < size:
            raise FatalError('End of file reading segment 0x%x, length %d (actual length %d)' % (offset, size, len(segment_data)))
        segment = ImageSegment(offset, segment_data, file_offs)
        self.segments.append(segment)
        return segment

All of the above code is notable. It allows us to discern the structure of the firmware image.

The function load_common_header() details the following format:

(magic, segments, self.flash_mode, self.flash_size_freq, self.entrypoint) = struct.unpack('<BBBBI', load_file.read(8))

Represented as a structure, this would look like the following:

typedef struct {
    uint8 magic;
    uint8 sect_count;
    uint8 flash_mode;
    uint8 flash_size_freq;
    uint32 entry_addr;
} rom_header;

We can see from the function load_segment() that the image header is followed by the segments themselves, each consisting of a segment header followed immediately by that segment’s data.

The following code parses a segment header:

(offset, size) = struct.unpack('<II', f.read(8))

Again, represented as a structure, this would be as follows:

typedef struct {
    uint32 seg_addr;
    uint32 seg_size;
} segment_header;
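
To make the two structures concrete, here is a minimal sketch of a parser that walks an image in the same order as the esptool code quoted above. It is not esptool’s implementation: the names parse_image, read_exact and Segment are made up for illustration, and the sample bytes are synthetic, built only to resemble the header values reported by image_info earlier.

```python
import io
import struct
from collections import namedtuple

ESP_IMAGE_MAGIC = 0xE9  # magic byte found at the start of each image

# Hypothetical container for one parsed segment (not an esptool type).
Segment = namedtuple("Segment", "load_addr size file_offs data")


def read_exact(f, count):
    """Read exactly `count` bytes or raise if the file ends early."""
    data = f.read(count)
    if len(data) < count:
        raise ValueError("unexpected end of file")
    return data


def parse_image(f):
    """Parse a 'version 1' image: rom_header, then segment_header + data per segment."""
    magic, seg_count, flash_mode, flash_size_freq, entry = struct.unpack(
        "<BBBBI", read_exact(f, 8))
    if magic != ESP_IMAGE_MAGIC or seg_count > 16:
        raise ValueError("invalid image magic=0x%02x segments=%d" % (magic, seg_count))

    segments = []
    for _ in range(seg_count):
        file_offs = f.tell()  # offset of the segment header within the file
        load_addr, size = struct.unpack("<II", read_exact(f, 8))
        segments.append(Segment(load_addr, size, file_offs, read_exact(f, size)))
    return entry, segments


# Synthetic one-segment image, for illustration only.
image = (struct.pack("<BBBBI", 0xE9, 1, 0, 0, 0x4010F29C)
         + struct.pack("<II", 0x4010F000, 4)
         + b"\x00" * 4)
entry, segments = parse_image(io.BytesIO(image))
print("entry=0x%08x" % entry)
for seg in segments:
    print("load=0x%08x size=%d file_offs=%d" % (seg.load_addr, seg.size, seg.file_offs))
```

The checksum byte that follows the last segment is deliberately ignored here.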

This is helpful: we now know both the format of the firmware image and a number of the tools available to process such images. We haven’t considered elements such as checksums, but these aren’t important to us as we don’t intend to patch the firmware image.

As an aside, in this case the format has been documented and tools exist to parse it, but often that is not so. In such cases, I’d advise obtaining as many firmware images as you can from your target devices. A good starting point is to look for commonalities between them, which could indicate what certain bytes mean within the format. It is also useful to understand how an image is booted, as the bootloader may act differently depending on certain values at fixed offsets.
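
As a rough illustration of that commonality search, the sketch below reports which offsets near the start of a set of images hold the same byte in every image. The function name constant_offsets and the two sample images are hypothetical; a real analysis would load the files from disk and would likely look at far more than a few bytes.

```python
def constant_offsets(images, length=16):
    """Return {offset: byte} for offsets whose value is identical in every image.

    Only the first `length` bytes are compared, truncated to the shortest image.
    """
    if not images:
        return {}
    length = min([length] + [len(img) for img in images])
    common = {}
    for off in range(length):
        values = {img[off] for img in images}
        if len(values) == 1:  # every image agrees on this byte
            common[off] = values.pop()
    return common


# Two made-up image prefixes that differ in their second and fourth bytes.
samples = [bytes([0xE9, 0x01, 0x00, 0x00]), bytes([0xE9, 0x04, 0x00, 0x20])]
result = constant_offsets(samples)
print({off: hex(val) for off, val in result.items()})
```

A byte that never changes across images is a candidate for a magic value, while one that varies in step with something observable (such as the number of segments) hints at a count or length field.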

Understanding the boot process

So, onto our next question: what is the boot process of the device? Understanding this is important as it will help to clarify our understanding of the image. Richard A Burton has very helpfully reverse engineered the bootloader and described the following key point:

It finds the flash address of the rom to boot. Rom 1 is always at 0x1000 (next sector after boot loader). Rom 2 is half the chip size + 0x1000 (unless the chip is above a 1mb when it’s position it kept down to to 0x81000).
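
The quoted rule can be written down as a small calculation. This is only a sketch of the description above, not the bootloader’s actual code: rom_flash_offsets is a made-up name, and treating “1mb” as 0x100000 bytes is an assumption.

```python
ONE_MB = 0x100000  # assumption: "1mb" in the description means 1 MiB of flash


def rom_flash_offsets(flash_size):
    """Return the flash offsets of (rom 1, rom 2) for a flash chip of `flash_size` bytes."""
    rom1 = 0x1000  # always the sector after the bootloader
    if flash_size > ONE_MB:
        rom2 = 0x81000  # position is capped for larger chips
    else:
        rom2 = flash_size // 2 + 0x1000
    return rom1, rom2


for size in (0x80000, 0x100000, 0x400000):
    rom1, rom2 = rom_flash_offsets(size)
    print("flash=0x%06x rom1=0x%05x rom2=0x%05x" % (size, rom1, rom2))
```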

Checking the 0x1000 offset within our firmware image, there is indeed a second image, as denoted by the presence of the image magic signature (0xE9):

josh@ioteeth:/tmp/reversing$ hexdump -s 0x1000 -v -C recovered_file | head 
00001000  e9 04 00 00 30 64 10 40  10 10 20 40 c0 ed 03 00  |....0d.@.. @....|
00001010  43 03 ab 83 1c 00 00 60  00 00 00 60 1c 0f 00 60  |C......`...`...`|
00001020  00 0f 00 60 41 fc ff 20  20 74 c0 20 00 32 24 00  |...`A..  t. .2$.|
00001030  30 30 75 56 33 ff 31 f8  ff 66 92 08 42 a0 0d c0  |00uV3.1..f..B...|
00001040  20 00 42 63 00 51 f5 ff  c0 20 00 29 03 42 a0 7d  | .Bc.Q... .).B.}|
00001050  c0 20 00 38 05 30 30 75  37 34 f4 31 f1 ff 66 92  |. .8.00u74.1..f.|
00001060  06 0c d4 c0 20 00 49 03  c0 20 00 29 03 0d f0 00  |.... .I.. .)....|
00001070  b0 ff ff 3f 24 10 20 40  00 ed fe 3f 80 6e 10 40  |...?$. @...?.n.@|
00001080  04 ed fe 3f 79 6e 10 40  fc ec fe 3f f8 ec fe 3f  |...?yn.@...?...?|
00001090  6b 6e 10 40 61 6e 10 40  f6 ec fe 3f 52 6e 10 40  |kn.@an.@...?Rn.@|

This second firmware image sits almost immediately after the padding bytes we observed earlier. Based on the format, we can see from the second byte (0x04) that this ROM has 4 segments. It is likely to be the user or custom ROM code, with the first ROM image potentially being the bootloader of the device, responsible for bootstrapping.
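
As a worked example, the first 16 bytes of the hexdump above can be decoded with the same struct formats that esptool uses. The bytes are copied from the dump; the variable names are my own.

```python
import struct

# First 16 bytes at file offset 0x1000, copied from the hexdump.
raw = bytes.fromhex("e9 04 00 00 30 64 10 40 10 10 20 40 c0 ed 03 00")

# Image header: magic, segment count, flash mode, flash size/freq, entry point.
magic, seg_count, flash_mode, flash_size_freq, entry = struct.unpack_from("<BBBBI", raw, 0)
# Header of the first segment: load address and size.
seg_addr, seg_size = struct.unpack_from("<II", raw, 8)

print("magic=0x%02x segments=%d entry=0x%08x" % (magic, seg_count, entry))
# → magic=0xe9 segments=4 entry=0x40106430
print("segment 1: load=0x%08x size=0x%05x" % (seg_addr, seg_size))
# → segment 1: load=0x40201010 size=0x3edc0

# Range constants taken from the esptool excerpt earlier in the post.
IROM_MAP_START = 0x40200000
IROM_MAP_END = 0x40300000
in_irom = IROM_MAP_START <= seg_addr < IROM_MAP_END
print("first segment inside IROM map range:", in_irom)
```

The load address of the first segment falls numerically inside the IROM_MAP_START to IROM_MAP_END range from the esptool source, which is at least consistent with this being application code rather than the small bootloader image.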

Whilst there are a lot of nuances to the boot process, the above is all we really need to be aware of at this time.

Understanding the physical memory layout

Next, understanding the physical memory layout will help us to differentiate between data and code segments, assuming consistency between images. Whilst not entirely accurate, the physical memory layout of the ESP8266 has been documented.

From the information within, we can conclude the following:

0x40100000 – Instruction RAM. Used by bootloader to load SPI Flash <40000h.
0x3FFE8000 – User data RAM. Available to applications.
0x3FFFFFFF – Anything below this address appears to be data, not code
0x40100000 – Anything above this address appears to be code, not data

Anything that doesn’t match an address exactly, we’ll mark as unknown and classify as either code or data based on the rules above.
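
These rules are simple enough to express as a lookup. The sketch below is one possible reading of them, with assumptions of my own: the names KNOWN_REGIONS and classify are hypothetical, the boundary addresses themselves are treated as inclusive, and addresses that fall between the two thresholds are left fully unknown because the rules above say nothing about them.

```python
# Exact segment load addresses we can name: address -> (name, kind).
KNOWN_REGIONS = {
    0x40100000: ("iram", "code"),       # instruction RAM
    0x3FFE8000: ("user_dram", "data"),  # user data RAM
}

DATA_LIMIT = 0x3FFFFFFF  # at or below this: treated as data (inclusive by assumption)
CODE_START = 0x40100000  # at or above this: treated as code (inclusive by assumption)


def classify(load_addr):
    """Return (name, kind) for a segment load address using the rules above."""
    if load_addr in KNOWN_REGIONS:
        return KNOWN_REGIONS[load_addr]
    if load_addr <= DATA_LIMIT:
        return ("unknown", "data")
    if load_addr >= CODE_START:
        return ("unknown", "code")
    return ("unknown", "unknown")  # gap between the two thresholds


for addr in (0x4010F000, 0x40201010, 0x3FFE8000, 0x40000000):
    print("0x%08x -> %s" % (addr, classify(addr)))
```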

It should be noted that simply loading the file as ‘binary’ within IDA, having set the appropriate processor, gives only a limited understanding and doesn’t display any xrefs to strings that could guide our efforts.

With this in mind, we can write a simple loader for IDA to identify the firmware image and load the segments accordingly, which should yield better results. We’ll use the memory map above as a guide to name the segments and mark them as code or data accordingly.
