Reversing ESP8266 Firmware (Part 5)
Recognising VTABLEs
After analysing our firmware image to some degree, it becomes clear that vtables are in use.
.int _ZN24BufferedStreamDataSourceI13ProgmemStreamE14release_bufferEPKhj ; BufferedStreamDataSource<ProgmemStream>::release_buffer(uchar const*,uint)
.code_seg_0:4020B1E4 .int 0, 0, 0
.code_seg_0:4020B1F0 off_4020B1F0 .int _ZN10WiFiClient5writeEh
.code_seg_0:4020B1F0 ; DATA XREF: .code_seg_0:off_4020804Co
.code_seg_0:4020B1F0 ; WiFiClient::write(uchar)
.code_seg_0:4020B1F4 .int _ZN10WiFiClient5writeEPKhj ; WiFiClient::write(uchar const*,uint)
.code_seg_0:4020B1F8 .int loc_402082EF+1
.code_seg_0:4020B1FC .int sub_40207AA0
.code_seg_0:4020B200 .int _ZN10WiFiClient4readEv ; WiFiClient::read(void)
.code_seg_0:4020B204 .int loc_4020AF27+1
.code_seg_0:4020B208 .int _ZN6Stream9readBytesEPcj ; Stream::readBytes(char *,uint)
.code_seg_0:4020B20C .int _ZN6Stream9readBytesEPhj ; Stream::readBytes(uchar *,uint)
.code_seg_0:4020B210 .int dword_40207F28+0x18
.code_seg_0:4020B214 .int sub_40207A58
.code_seg_0:4020B218 .int _ZN10WiFiClient4readEPhj ; WiFiClient::read(uchar *,uint)
.code_seg_0:4020B21C .int _ZN10WiFiClient4stopEv ; WiFiClient::stop(void)
.code_seg_0:4020B220 .int _ZN10WiFiClient9connectedEv ; WiFiClient::connected(void)
.code_seg_0:4020B224 .int _ZN10WiFiClientcvbEv ; WiFiClient::operator bool(void)
.code_seg_0:4020B228 .int _ZN10WiFiClientD2Ev ; WiFiClient::~WiFiClient()
.code_seg_0:4020B22C .int sub_40208098
.code_seg_0:4020B230 .int _ZN10WiFiClient7connectE6Stringt ; WiFiClient::connect(String,ushort)
.code_seg_0:4020B234 .int _ZN10WiFiClient7write_PEPKcj ; WiFiClient::write_P(char const*,uint)
.code_seg_0:4020B238 .int sub_40207AD0
.code_seg_0:4020B23C .int 0, 0, 0
.code_seg_0:4020B248 .int _ZN6SdFile5writeEh ; SdFile::write(uchar)
.code_seg_0:4020B24C .int _ZN5Print5writeEPKhj ; Print::write(uchar const*,uint)
.code_seg_0:4020B250 .int sub_4020AFC4
.code_seg_0:4020B254 .int 0, 0, 0
.code_seg_0:4020B260 off_4020B260 .int loc_40209714 ; DATA XREF: .code_seg_0:off_402096C8o
.code_seg_0:4020B264 .int loc_4020972C
.code_seg_0:4020B268 .int dword_40209764+4
.code_seg_0:4020B26C .int loc_40209740
.code_seg_0:4020B270 .int loc_40209700
.code_seg_0:4020B274 .int loc_402096EC
.code_seg_0:4020B278 .int _ZN6Stream9readBytesEPcj ; Stream::readBytes(char *,uint)
.code_seg_0:4020B27C .int _ZN6Stream9readBytesEPhj ; Stream::readBytes(uchar *,uint)
A VTABLE in this context is essentially a collection of function pointers for each module of the application’s libraries. We can see that each library’s function pointers are delimited by three null words (the ".int 0, 0, 0" entries, twelve zero bytes in total), as in the excerpt below:
[...]
.code_seg_0:4020B234 .int _ZN10WiFiClient7write_PEPKcj ; WiFiClient::write_P(char const*,uint)
.code_seg_0:4020B238 .int sub_40207AD0
.code_seg_0:4020B23C .int 0, 0, 0
.code_seg_0:4020B248 .int _ZN6SdFile5writeEh ; SdFile::write(uchar)
[...]
Here we can observe two functions of the WiFiClient module, followed by three null words (a delimiter), and finally the function pointers of the next module, in this case SdFile.
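The delimiter rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name split_vtable_groups is hypothetical, the table is modelled as a list of operand strings copied from the listing, and a null word is represented by the string "0". A run of three or more nulls is treated as a boundary, and shorter runs are kept inside the current group.

```python
from typing import List


def split_vtable_groups(entries: List[str], min_null_run: int = 3) -> List[List[str]]:
    """Split a flat list of table entries into groups separated by runs of null words.

    `entries` holds the operand text of each `.int` word; "0" marks a null word.
    """
    groups: List[List[str]] = []
    current: List[str] = []
    nulls = 0  # length of the null run seen since the last non-null entry
    for entry in entries:
        if entry == "0":
            nulls += 1
            continue
        if nulls >= min_null_run:
            # A long enough null run closes the previous group.
            if current:
                groups.append(current)
            current = []
        else:
            # Short null runs are ordinary data and stay inside the group.
            current.extend(["0"] * nulls)
        nulls = 0
        current.append(entry)
    if current:
        groups.append(current)
    return groups


# Operands taken from the listing between 4020B234 and 4020B254.
entries = [
    "_ZN10WiFiClient7write_PEPKcj",
    "sub_40207AD0",
    "0", "0", "0",
    "_ZN6SdFile5writeEh",
    "_ZN5Print5writeEPKhj",
    "sub_4020AFC4",
    "0", "0", "0",
]

groups = split_vtable_groups(entries)
for group in groups:
    print(group)
```

Run on the excerpt, the sketch yields one group ending in sub_40207AD0 and a second group starting at SdFile::write, so the delimiter is what lets the table be cut into per-module groups.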
This is an important observation, as it will allow us to recognise if an unknown function belongs to a particular library, based on its presence within the VTABLEs amongst the other libraries. Given the below for example:
.code_seg_0:4020B234 .int _ZN10WiFiClient7write_PEPKcj ; WiFiClient::write_P(char const*,uint)
.code_seg_0:4020B238 .int sub_40207AD0
.code_seg_0:4020B23C .int 0, 0, 0
We can infer that sub_40207AD0, whilst unnamed and unknown in terms of functionality, does in fact belong to the WiFiClient library, which hints at its purpose.
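The same inference can be automated as a rough heuristic. The sketch below assumes the symbols follow the usual GCC mangling seen in the listing, where a name such as _ZN10WiFiClient5writeEh starts with _ZN, a decimal length, and then the class name. The helper names owner_class and guess_group_owner are hypothetical. Inherited entries such as Stream::readBytes also appear in the WiFiClient group, so a majority vote is a guess and not proof.

```python
import re
from collections import Counter
from typing import List, Optional


def owner_class(symbol: str) -> Optional[str]:
    """Return the first name component of a `_ZN<len><name>...` symbol, else None."""
    match = re.match(r"_ZN(\d+)", symbol)
    if match is None:
        return None  # unnamed entries such as sub_40207AD0 carry no class name
    length = int(match.group(1))
    start = match.end()
    return symbol[start:start + length]


def guess_group_owner(group: List[str]) -> Optional[str]:
    """Guess which class a group belongs to by majority vote over its named entries."""
    counts = Counter(name for name in map(owner_class, group) if name)
    return counts.most_common(1)[0][0] if counts else None


# A subset of the group between 4020B1F0 and 4020B238 in the listing.
group = [
    "_ZN10WiFiClient5writeEh",
    "_ZN10WiFiClient5writeEPKhj",
    "loc_402082EF+1",
    "sub_40207AA0",
    "_ZN10WiFiClient4readEv",
    "_ZN6Stream9readBytesEPcj",
    "sub_40207A58",
    "_ZN10WiFiClient7write_PEPKcj",
    "sub_40207AD0",
]

owner = guess_group_owner(group)
unnamed = [entry for entry in group if owner_class(entry) is None]
print(owner, unnamed)
```

Every unnamed entry in the group, including sub_40207A58 and sub_40207AD0, then becomes a candidate member of the class that dominates its neighbours.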
Finding the port knock sequence
Armed with this knowledge, we are now in a position to look for references to the connect() function of the WiFiClient library, or indeed to its other functions, including the unnamed ones, in search of our port knocking sequence.
I searched for references to connect() but couldn’t find any. However, after checking for references to every function that was part of the WiFiClient library, I did find a reference to the following unnamed function:
.code_seg_0:4020B214 .int sub_40207A58
The following XREFS were identified:
The function referenced appeared to be quite involved. You can see it below:
.code_seg_0:402073D0 sub_402073D0: ; CODE XREF: .code_seg_0:loc_40209F9Dp
.code_seg_0:402073D0 addi a1, a1, 0xA0
.code_seg_0:402073D3 s32i a15, a1, 0x4C
.code_seg_0:402073D6 l32r a15, dword_4020738C
.code_seg_0:402073D9 s32i a0, a1, 0x5C
.code_seg_0:402073DC mov.n a2, a15
.code_seg_0:402073DE s32i a12, a1, 0x58
.code_seg_0:402073E1 s32i a13, a1, 0x54
.code_seg_0:402073E4 s32i a14, a1, 0x50
.code_seg_0:402073E7 call0 delay
.code_seg_0:402073EA l32r a2, dword_40207390
.code_seg_0:402073ED l32r a14, dword_402072B8
.code_seg_0:402073F0 l32i.n a3, a2, 0
.code_seg_0:402073F2 mov.n a13, a14
.code_seg_0:402073F4 addi.n a3, a3, 1
.code_seg_0:402073F6 s32i.n a3, a2, 0
.code_seg_0:402073F8 l32r a3, off_402072C0
.code_seg_0:402073FB mov.n a2, a14
.code_seg_0:402073FD call0 _ZN5Print5printEPKc ; Print::print(char const*)
.code_seg_0:40207400 l32r a12, off_40207394
.code_seg_0:40207403 mov.n a2, a14
.code_seg_0:40207405 l32i.n a3, a12, 0
.code_seg_0:40207407 call0 _ZN5Print7printlnEPKc ; Print::println(char const*)
.code_seg_0:4020740A mov.n a2, a1
.code_seg_0:4020740C call0 sub_40208454
.code_seg_0:4020740F l32i.n a3, a12, 0
.code_seg_0:40207411 movi a4, 0x539
.code_seg_0:40207414 mov.n a2, a1
.code_seg_0:40207416 call0 sub_40207A58
.code_seg_0:40207419 movi a2, 0x3E8
.code_seg_0:4020741C call0 delay
.code_seg_0:4020741F l32i.n a3, a12, 0
.code_seg_0:40207421 l32r a4, dword_40207398
.code_seg_0:40207424 mov.n a2, a1
.code_seg_0:40207426 call0 sub_40207A58
.code_seg_0:40207429 movi a2, 0x3E8
.code_seg_0:4020742C call0 delay
.code_seg_0:4020742F l32i.n a3, a12, 0
.code_seg_0:40207431 l32r a4, dword_4020739C
.code_seg_0:40207434 mov.n a2, a1
.code_seg_0:40207436 call0 sub_40207A58
.code_seg_0:40207439 movi a2, 0x3E8
.code_seg_0:4020743C call0 delay
.code_seg_0:4020743F l32i.n a3, a12, 0
.code_seg_0:40207441 l32r a4, dword_402073A0
.code_seg_0:40207444 mov.n a2, a1
.code_seg_0:40207446 call0 sub_40207A58
.code_seg_0:40207449 movi a2, 0x3E8
.code_seg_0:4020744C call0 delay
.code_seg_0:4020744F l32i.n a3, a12, 0
.code_seg_0:40207451 mov.n a2, a1
.code_seg_0:40207453 movi a4, 0x1BD
.code_seg_0:40207456 call0 sub_40207A58
.code_seg_0:40207459 bnez.n a2, loc_40207469
[...]
As an educated guess, we can assume this is probably the loop function of our image, which is responsible for doing most of the heavy legwork.
Understanding the Xtensa instruction set
In order to understand what the instructions above are doing, we want to have at least a passing familiarity with the registers the processor uses and their purpose, as well as what the common instructions do and how conditional jumps work.
It turns out someone has documented most of the common instructions of the Xtensa processor.
An excerpt of this guide, which covers loading and storing instructions, as well as register usage, is below:
This is a load/store machine with either 16 or 24 bit instructions. This leads to higher code density than with constant 32 bit encoding. Some instructions have optional “short” 16 bit encodings indicated by appending “.n” to the mnemonic. The Xtensa implements SPARC like register windows on subroutine calls, but I have never seen this feature used in either the bootrom or code generated by gcc, so this can be ignored.
There are 16 registers named a0 through a15. a0 is special – it holds the call return address.
a1 is used by gcc as a stack pointer.
a2 gets used to pass a single argument (and to return a function value).
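The 16 versus 24 bit claim in the excerpt can be checked against the disassembly shown earlier: the gap between consecutive addresses is the size of the instruction at the lower address. The sketch below is a small sanity check and not part of the original analysis. The rows are copied from the start of sub_402073D0, and the helper name instruction_sizes is hypothetical.

```python
from typing import List, Tuple

# (address, mnemonic) pairs copied from the first rows of sub_402073D0.
listing: List[Tuple[int, str]] = [
    (0x402073D0, "addi"),
    (0x402073D3, "s32i"),
    (0x402073D6, "l32r"),
    (0x402073D9, "s32i"),
    (0x402073DC, "mov.n"),
    (0x402073DE, "s32i"),
    (0x402073E1, "s32i"),
]


def instruction_sizes(rows: List[Tuple[int, str]]) -> List[Tuple[str, int]]:
    """Pair each mnemonic with its size in bytes, taken from the next row's address.

    The last row has no successor, so it is left out of the result.
    """
    return [(mnemonic, nxt - addr) for (addr, mnemonic), (nxt, _) in zip(rows, rows[1:])]


sizes = instruction_sizes(listing)
for mnemonic, size in sizes:
    # Narrow ".n" encodings should be 2 bytes (16 bit), the rest 3 bytes (24 bit).
    expected = 2 if mnemonic.endswith(".n") else 3
    print(f"{mnemonic:6s} {size} bytes (expected {expected})")
```

On these rows every mnemonic ending in ".n" occupies two bytes and every other instruction three, which agrees with the excerpt.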
Understanding the Xtensa calling convention
It would also be helpful to understand the calling convention, which describes how arguments are passed to function calls. I found this document, which describes the calling convention as follows:
Arguments are passed in both registers and memory. The first six incoming arguments are stored in registers a2 through a7, and additional arguments are stored on the stack starting at the current stack pointer a1. […].
Thus we can determine that registers a2 to a7 will, in most cases, be used to store arguments passed to functions. Register a2 carries the first (or only) argument and, per the excerpt above, the function’s return value.
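Applied to the disassembly of sub_402073D0, this convention lets us read off what each call receives. The sketch below is a simplified illustration: the helper name args_at_calls is hypothetical, it only handles straight-line code, and it assumes a callee may overwrite a2 to a7, so their tracked values are dropped after every call. The instructions are a subset of the listing (the first, second and last calls to sub_40207A58 plus one delay).

```python
from typing import Dict, List, Tuple

ARG_REGS = ("a2", "a3", "a4", "a5", "a6", "a7")
STORES = {"s32i", "s32i.n"}  # a store's first operand is a source, not a destination

Instruction = Tuple[str, List[str]]


def args_at_calls(instructions: List[Instruction]) -> List[Tuple[str, Dict[str, str]]]:
    """For each call0, report the last instruction that wrote each argument register."""
    last_write: Dict[str, str] = {}
    calls: List[Tuple[str, Dict[str, str]]] = []
    for mnemonic, operands in instructions:
        if mnemonic == "call0":
            args = {reg: last_write[reg] for reg in ARG_REGS if reg in last_write}
            calls.append((operands[0], args))
            for reg in ARG_REGS:  # assumption: argument registers do not survive a call
                last_write.pop(reg, None)
        elif mnemonic not in STORES:
            last_write[operands[0]] = f"{mnemonic} {', '.join(operands)}"
    return calls


snippet: List[Instruction] = [
    ("l32i.n", ["a3", "a12", "0"]),
    ("movi", ["a4", "0x539"]),
    ("mov.n", ["a2", "a1"]),
    ("call0", ["sub_40207A58"]),
    ("movi", ["a2", "0x3E8"]),
    ("call0", ["delay"]),
    ("l32i.n", ["a3", "a12", "0"]),
    ("l32r", ["a4", "dword_40207398"]),
    ("mov.n", ["a2", "a1"]),
    ("call0", ["sub_40207A58"]),
    ("l32i.n", ["a3", "a12", "0"]),
    ("mov.n", ["a2", "a1"]),
    ("movi", ["a4", "0x1BD"]),
    ("call0", ["sub_40207A58"]),
]

calls = args_at_calls(snippet)
for target, args in calls:
    print(target, args)

# The immediates in decimal: 0x539 = 1337, 0x3E8 = 1000, 0x1BD = 445.
print(0x539, 0x3E8, 0x1BD)
```

Each call to sub_40207A58 receives three arguments: a2 is a copy of the stack pointer a1 (plausibly a pointer to an object on the stack), a3 is a value loaded through a12, and a4 is either an immediate or a word loaded with l32r. If sub_40207A58 is a connect-style method, which is not confirmed at this point, a4 would be a natural candidate for the port number.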