29 Nov 2019 - tsp
Last update 11 Dec 2019
23 mins
Disclaimer: There is no guarantee that this information is complete (it’s highly likely that something is missing). It’s just enough to build the applications that I’m associated with or work on myself, and it has been reconstructed by reading information from various sources. This post in no way claims to be correct (even though I personally think it is).
The following post emerged out of curiosity about what the complicated Makefiles and scripts of the ESP8266 SDK really do during the build process of an ESP8266 (or ESP32) application. Surprisingly (or in reality not so surprisingly) there is not much magic in the whole process. Basically it’s just a bunch of steps that collect all binary data sections that are flashed into the flash memory.
If one wants to do fast and simple experiments with the ESP8266, a finished prototyping board like the NodeMCU Amica, which is based on the ESP-12E component board, is a nice and fast solution (note: the link is an Amazon affiliate link, this page’s author profits from purchases).
There are two methods supported by the SDK, depending on whether Espressif’s FOTA bootloader is used or not. The FOTA bootloader is the easiest way of providing over the air firmware upgrades. It periodically polls Espressif’s cloud, checks if there is a new firmware revision available, downloads it and flashes it into the flash memory. The bootloader then selects the specific image to load, shadows the required regions into RAM, performs the mapping of flash into the address space and executes the code.
On the other hand one can decide not to use Espressif’s FOTA bootloader, or to use one’s own bootloader. When not using the bootloader, two files are generated by the build process. These are eagle.flash.bin, which gets copied into instruction RAM by the (unmodifiable) ROM bootloader on bootup, and irom0text.bin, which is just mapped into the address space but not copied into RAM. Note that irom0text.bin has nothing to do with the real ROM of the ESP8266. This flash section might be way larger than the RAM region, but accesses are about 10 times slower and all content is read only. It’s the perfect region for string constants and similar data that is not required very often, or for functions that are not timing critical.
One might also use a different bootloader (for example one’s own bootloader, Arduino’s eboot or one of the other available bootloaders). In this case the basic idea of building flash images is the same; linker parameters and addresses will be different though.
It’s also possible to use the remaining flash regions for other things like SPIFFS partitions or other filesystems. Do not forget to register them inside the partition table that gets passed to system_partition_table_regist inside your user_pre_init callback. If you don’t supply a correct partition table layout the ROM library might wreak havoc or enter some kind of undefined behaviour.
In this case the build process is straightforward:
Compile all sources in the user directory and pack them into a library libuser.a
Compile all sources in the driver directory and pack them into a libdriver.a
Link both libraries together with the SDK libraries into a single ELF file
Dump and extract the required sections and symbols using objdump, objcopy and nm (this is already done inside the gen_appbin.py Python script)
The gen_appbin.py file locates all required symbols and generates flash image file headers. After that all headers and binary section data are concatenated into the binary files (one to be copied into the iram0 section, one to be mapped into the address space, the irom0 section).
To build user and driver code the gcc compiler from the xtensa toolchain is used. For example, to build user_main.c and store it in the .output/eagle/debug/obj/user_main.o subdirectory of the user folder the toolchain calls
xtensa-lx106-elf-gcc \
    -Os \
    -g \
    -Wpointer-arith \
    -Wundef \
    -Wl,-EL \
    -fno-inline-functions \
    -nostdlib \
    -mlongcalls \
    -mtext-section-literals \
    -ffunction-sections \
    -fdata-sections \
    -fno-builtin-printf \
    -fno-guess-branch-probability \
    -freorder-blocks-and-partition \
    -fno-cse-follow-jumps \
    -DICACHE_FLASH \
    -DSPI_FLASH_SIZE_MAP=6 \
    -I include -I ./ -I ../../include/ets -I ../include \
    -I ../../include -I ../../include/eagle -I ../../driver_lib/include \
    -o .output/eagle/debug/obj/user_main.o \
    -c \
    user_main.c
These options serve the following purposes:
-Os enables the optimizer and instructs it to optimize for size instead of speed.
-g embeds debug data into the generated binary.
-Wpointer-arith warns about anything that depends on the size of a function type or of void, which is undefined according to the C specification. In C++ mode it also warns about calculations involving NULL. Note that these operations violate the C/C++ specifications, so they shouldn’t be used anyway.
-Wundef warns whenever an undefined identifier is evaluated inside a conditional preprocessor statement like #if.
-Wl,-EL selects little endian output for the linker.
-fno-inline-functions disables automatic inlining by the optimizer, which is normally only done with -O2 or higher.
-nostdlib disables the usage of the standard C library.
-mlongcalls translates direct calls into indirect calls at the assembly stage in case it cannot be guaranteed that the call target is in range of the call.
-mtext-section-literals instructs the compiler to put literals into the text section (and not into other constant sections) to keep references as local as possible. Literals get moved into the vicinity of the functions that reference them whenever possible.
-ffunction-sections enables the generation of a separate section for each function inside the generated code. This allows dead code elimination (DCE) to remove all unnecessary functions that are never called or referenced.
-fdata-sections enables the generation of a separate data section for each global variable in the source file. This allows the linker to discard variables that never get referenced (like dead code elimination).
-fno-builtin-printf prevents the optimizer from translating printf statements into more direct output functions. When the compiler knows the C library in use, this translation can reduce the amount of parsing of the format string. Disabling it is always required when one replaces the standard printf function or doesn’t use the standard C library.
-fno-guess-branch-probability disables the static branch prediction heuristics of the optimizer.
-freorder-blocks-and-partition instructs the optimizer to try to rearrange code blocks to keep jumps more local. To do that, code is separated into hot and cold basic blocks that get rearranged inside their sections appropriately.
-fno-cse-follow-jumps prevents the common subexpression elimination (CSE) optimizer from following jumps into different code regions.
-DICACHE_FLASH passes the ICACHE_FLASH preprocessor definition.
-DSPI_FLASH_SIZE_MAP=6 passes a preprocessor definition for the SPI flash size map that’s used. 6 selects the 4096 KByte (1024 KByte + 1024 KByte) flash map supported by the Espressif SDK.
-I adds include directories to the preprocessor’s search path.
-o supplies the output object file to write into.
This call generates the ELF object files for each and every source file (separately).
After that all object files get packed into a single object file archive by using ar:
xtensa-lx106-elf-ar ru .output/eagle/debug/lib/libuser.a .output/eagle/debug/obj/user_main.o
In this case user_main.o gets inserted or replaced (r) in the archive libuser.a. To insert only files that are newer than existing members of the archive, the u flag gets supplied. This method is chosen to keep the same libuser.a over successive build steps and only update changed files. Of course that means that, without a clean operation, old object files are kept inside the archives. These do not end up inside the final binary due to dead code and unreferenced section removal by the linker later on.
Driver code is built in exactly the same way. There is no real difference between driver and user code; the split is just a decision made by the SDK authors.
In the next step both driver and user object file libraries as well as all required runtime libraries supplied by the SDK will be linked into a single ELF object file.
This normally ends up inside the .output subdirectory relative to the application’s root directory.
xtensa-lx106-elf-gcc \
    -L../lib -nostdlib \
    -T../ld/eagle.app.v6.ld \
    -Wl,--no-check-sections \
    -Wl,--gc-sections \
    -u call_user_start \
    -Wl,-static \
    -Wl,--start-group \
    -lc -lgcc -lhal -lphy -lpp -lnet80211 \
    -llwip -lwpa -lcrypto -lmain -ljson \
    -lupgrade -lssl -lpwm -lsmartconfig \
    user/.output/eagle/debug/lib/libuser.a \
    driver/.output/eagle/debug/lib/libdriver.a \
    -Wl,--end-group \
    -o .output/eagle/debug/image/eagle.app.v6.out
-L supplies additional search paths for libraries.
-nostdlib prevents linking against the standard C library.
-T is the most essential option here. It supplies the linker script that will be discussed below.
-Wl,--no-check-sections stops the linker from checking the assigned section addresses for overlaps. Since a custom linker script is used and some overlap is potentially desired, this option is used.
-Wl,--gc-sections enables garbage collection of sections. All sections that are neither directly nor indirectly referenced from the entry point get discarded. The linker follows all references to other sections to build a reference graph and discards all sections that cannot be traced back to the section containing the entry point. In this step all unnecessary libraries and functions as well as variables and constants get discarded.
-u call_user_start requires the symbol call_user_start to be entered as undefined into the produced ELF object file, which can trigger linking of the library module that defines it.
-Wl,-static disables linking against shared libraries, which are obviously not supported on microcontrollers.
-Wl,--start-group and -Wl,--end-group define that all libraries referenced inside the group get searched repeatedly (i.e. in multiple passes). Normally references are resolved only once and in order, which would prevent, for example, two library modules from referencing each other (i.e. references could only go in one direction).
-lc, -lgcc, -lhal, -lphy, -lpp, -lnet80211, -llwip, -lwpa, -lcrypto, -lmain, -ljson, -lupgrade, -lssl, -lpwm and -lsmartconfig add the respective binary libraries supplied with the SDK to the link. Because the linker runs garbage collection on sections, only the library objects that are really used are included.
This produces the ELF object file eagle.app.v6.out that only contains required sections.
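The section garbage collection described for --gc-sections is essentially a reachability search over a reference graph. The following is only a minimal sketch of that idea and not what the linker actually runs; the section names and the references dictionary are made up for illustration.

```python
from collections import deque


def reachable_sections(references, roots):
    """Return every section that can be reached from the root sections."""
    keep = set()
    queue = deque(roots)
    while queue:
        section = queue.popleft()
        if section in keep:
            continue
        keep.add(section)
        # Follow all references of this section to other sections.
        queue.extend(references.get(section, ()))
    return keep


# Hypothetical reference graph: section -> sections it refers to.
references = {
    ".text.call_user_start": [".text.user_init"],
    ".text.user_init": [".rodata.banner"],
    ".text.unused_helper": [".rodata.unused_table"],
}

all_sections = set(references)
for targets in references.values():
    all_sections.update(targets)

kept = reachable_sections(references, [".text.call_user_start"])
discarded = all_sections - kept

print("kept:", sorted(kept))
print("discarded:", sorted(discarded))
```

Everything that cannot be traced back to the section holding the entry point ends up in the discarded set, which is why stale objects inside libuser.a do no harm.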
The linker script used above depends on the flash configuration and the usage of a bootloader. In case eagle.app.v6.ld is used the boot_v1.2+ bootloader mode is used with non-FOTA and 4096KB(1024KB+1024KB) SPI size and mapping.
To get a fast overview of the memory map used by the ESP8266 one might take a look at the ESP8266 memory map.
When one looks into the linker script one can discover that there are 4 defined regions:
MEMORY
{
dport0_0_seg : org = 0x3FF00000, len = 0x10
dram0_0_seg : org = 0x3FFE8000, len = 0x14000
iram1_0_seg : org = 0x40100000, len = 0x8000
irom0_0_seg : org = 0x40210000, len = 0x5C000
}
dport0_0_seg is the memory mapped I/O region.
dram0_0_seg is the data RAM available to user applications.
iram1_0_seg is the instruction RAM section used by the bootloader to load flash memory < 40000h into RAM.
irom0_0_seg contains all code that won’t get copied into RAM on boot but will get mapped into the address space (read only).
Next, ELF program headers are generated for all segments as well as for the dram0 bss section (bss sections contain variables that are not initialized during load and so don’t have to be saved in flash or read into memory before they are accessed).
PHDRS
{
dport0_0_phdr PT_LOAD;
dram0_0_phdr PT_LOAD;
dram0_0_bss_phdr PT_LOAD;
iram1_0_phdr PT_LOAD;
irom0_0_phdr PT_LOAD;
}
The PT_LOAD type tells the ELF loader that these segments have to be loaded from the file. ELF is capable of carrying additional information, for example notes, dynamic linking information or the name of the used linker, which won’t be used in this linker script.
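To get a feeling for the regions of the MEMORY block, one can classify addresses with a few lines of Python. This is just a sketch: the region table is copied from the linker script shown above, while the flash mapping base of 0x40200000 is an assumption that is not stated in the script itself (it is consistent with irom0_0_seg starting at 0x40210000 and the irom0 image being flashed at 0x10000).

```python
# Regions copied from the MEMORY block of eagle.app.v6.ld: name -> (org, len)
MEMORY = {
    "dport0_0_seg": (0x3FF00000, 0x10),
    "dram0_0_seg": (0x3FFE8000, 0x14000),
    "iram1_0_seg": (0x40100000, 0x8000),
    "irom0_0_seg": (0x40210000, 0x5C000),
}

# Assumption: start of the window where SPI flash is mapped into the address space.
FLASH_MAP_BASE = 0x40200000


def region_of(address):
    """Return the name of the region that contains the address, or None."""
    for name, (org, length) in MEMORY.items():
        if org <= address < org + length:
            return name
    return None


def flash_offset(address):
    """Translate a mapped flash address into an offset inside the flash chip."""
    return address - FLASH_MAP_BASE


print(region_of(0x40100004))          # iram1_0_seg
print(hex(flash_offset(0x40210000)))  # 0x10000
```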
After that the entry point call_user_start as well as the five exception vectors are defined:
ENTRY(call_user_start)
EXTERN(_DebugExceptionVector)
EXTERN(_DoubleExceptionVector)
EXTERN(_KernelExceptionVector)
EXTERN(_NMIExceptionVector)
EXTERN(_UserExceptionVector)
PROVIDE(_memmap_vecbase_reset = 0x40000000);
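The definition as EXTERN tells the linker to enter these symbols into the resulting object file as undefined. This has the same effect as the -u option on the command line and can trigger linking of the library modules that define the vectors, even though nothing references them directly.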
The PROVIDE statements throughout the file add symbols to the symbol table that are not defined inside the input object files. This is done so they don’t have to be defined inside the source files, and because they change depending on the flash map configuration. As one can see, the reset vector points into the internal boot ROM of the ESP8266 (that’s not modifiable).
Next follows the cache configuration. This tells which regions get mapped with write-back (wb) or write-through (wt) strategies:
_memmap_cacheattr_wb_base = 0x00000110;
_memmap_cacheattr_wt_base = 0x00000110;
_memmap_cacheattr_bp_base = 0x00000220;
_memmap_cacheattr_unused_mask = 0xFFFFF00F;
_memmap_cacheattr_wb_trapnull = 0x2222211F;
_memmap_cacheattr_wba_trapnull = 0x2222211F;
_memmap_cacheattr_wbna_trapnull = 0x2222211F;
_memmap_cacheattr_wt_trapnull = 0x2222211F;
_memmap_cacheattr_bp_trapnull = 0x2222222F;
_memmap_cacheattr_wb_strict = 0xFFFFF11F;
_memmap_cacheattr_wt_strict = 0xFFFFF11F;
_memmap_cacheattr_bp_strict = 0xFFFFF22F;
_memmap_cacheattr_wb_allvalid = 0x22222112;
_memmap_cacheattr_wt_allvalid = 0x22222112;
_memmap_cacheattr_bp_allvalid = 0x22222222;
PROVIDE(_memmap_cacheattr_reset = _memmap_cacheattr_wb_trapnull);
Note that these values are again highly dependent on the used flash memory map.
After that, sections get mapped into the segments that were defined at the beginning. Each entry of SECTIONS defines an output section. As one can see inside the linker script, the output sections get assembled from the various input sections. For example
.irom0.text : ALIGN(4)
{
_irom0_text_start = ABSOLUTE(.);
*libat.a:(.literal.* .text.*)
*libcrypto.a:(.literal.* .text.*)
*libespnow.a:(.literal.* .text.*)
*libjson.a:(.literal.* .text.*)
*liblwip.a:(.literal.* .text.*)
*libnet80211.a:(.literal.* .text.*)
*libsmartconfig.a:(.literal.* .text.*)
*libssl.a:(.literal.* .text.*)
*libupgrade.a:(.literal.* .text.*)
*libwpa.a:(.literal.* .text.*)
*libwpa2.a:(.literal.* .text.*)
*libwps.a:(.literal.* .text.*)
*libmbedtls.a:(.literal.* .text.*)
*libm.a:(.literal .text .literal.* .text.*)
*(.irom0.literal .irom.literal .irom.text.literal .irom0.text .irom.text)
_irom0_text_end = ABSOLUTE(.);
} >irom0_0_seg :irom0_0_phdr
assigns the literal and text sections from the listed libraries to the irom0.text section in the specified order. At the end all sections that got assigned to irom0.literal, irom0.text, etc. are also merged into this section. The _irom0_text_end = ABSOLUTE(.) command assigns the absolute position to _irom0_text_end instead of using relative positioning. As one can also see, the section gets aligned to a 32 bit boundary.
At the end of the linker script another linker script, eagle.rom.addr.v6.ld, gets included. This provides the absolute addresses of all functions that should be callable inside the non-modifiable ROM area. For example
PROVIDE ( SHA1Final = 0x4000b648 );
PROVIDE ( SHA1Init = 0x4000b584 );
PROVIDE ( SHA1Transform = 0x4000a364 );
PROVIDE ( SHA1Update = 0x4000b5a8 );
just creates the four linker symbols SHA1Final, SHA1Init, SHA1Transform and SHA1Update and provides the absolute addresses of these functions, which point into the internal boot ROM.
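If one wants to inspect which ROM functions are exported this way, the PROVIDE lines are simple enough to parse. The following sketch uses a regular expression that only covers the plain `PROVIDE ( name = 0x... );` form shown above; the helper name parse_provides is made up for this example.

```python
import re

# Matches lines of the form: PROVIDE ( name = 0x1234abcd );
PROVIDE_RE = re.compile(r"PROVIDE\s*\(\s*(\w+)\s*=\s*(0x[0-9a-fA-F]+)\s*\)\s*;")


def parse_provides(text):
    """Return a dict mapping each provided symbol name to its address."""
    return {name: int(value, 16) for name, value in PROVIDE_RE.findall(text)}


sample = """
PROVIDE ( SHA1Final = 0x4000b648 );
PROVIDE ( SHA1Init = 0x4000b584 );
PROVIDE ( SHA1Transform = 0x4000a364 );
PROVIDE ( SHA1Update = 0x4000b5a8 );
"""

rom_symbols = parse_provides(sample)
for name, address in sorted(rom_symbols.items(), key=lambda item: item[1]):
    print(f"{address:#010x} {name}")
```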
In the following step the sections that make up the image file get extracted from the ELF object file using objcopy. With the help of objdump an assembly language dump as well as a dump file gets generated. These files are then used by the gen_appbin Python script to build the image file itself.
xtensa-lx106-elf-objdump -x -s .output/eagle/debug/image/eagle.app.v6.out > ../bin/eagle.dump
xtensa-lx106-elf-objdump -S .output/eagle/debug/image/eagle.app.v6.out > ../bin/eagle.S
xtensa-lx106-elf-objcopy --only-section .text -O binary .output/eagle/debug/image/eagle.app.v6.out eagle.app.v6.text.bin
xtensa-lx106-elf-objcopy --only-section .data -O binary .output/eagle/debug/image/eagle.app.v6.out eagle.app.v6.data.bin
xtensa-lx106-elf-objcopy --only-section .rodata -O binary .output/eagle/debug/image/eagle.app.v6.out eagle.app.v6.rodata.bin
xtensa-lx106-elf-objcopy --only-section .irom0.text -O binary .output/eagle/debug/image/eagle.app.v6.out eagle.app.v6.irom0text.bin
As one can see this simply dumps the sections previously defined in the linker scripts.
Generating eagle.app.flash.bin is the most complicated step. The Makefiles only call a small Python script:
python ../tools/gen_appbin.py .output/eagle/debug/image/eagle.app.v6.out 0 0 0 6 0
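The first argument is the linked ELF file. The names of the extracted section files (eagle.app.v6.text.bin, etc.) have been fixed inside the Python script. The remaining arguments are the boot mode, flash mode, flash clock divider, flash size map and user_bin.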
The python script itself first extracts all symbols from the ELF file:
xtensa-lx106-elf-nm -g .output/eagle/debug/image/eagle.app.v6.out > eagle.app.sym
This symbol file contains just a list of linker calculated addresses as well as the symbolic names. For example the dump used to write this example contains
40101300 T Cache_Read_Disable_2
40004678 A Cache_Read_Enable
40101340 T Cache_Read_Enable_2
401001b0 T Cache_Read_Enable_New
40100004 T call_user_start
40100254 T call_user_start_local
for the call_user_start symbol (that points into instruction RAM). This symbol file is used by the script to locate three symbols:
call_user_start
_data_start
_rodata_start
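Locating these symbols in the nm output comes down to splitting each line into address, type and name. The following is a small sketch of that lookup and not the code of gen_appbin.py; the function name parse_nm is an assumption, and the sample is the excerpt shown above.

```python
def parse_nm(text):
    """Parse `nm -g` style output into a dict of symbol name -> address."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            # Undefined symbols carry no address; blank lines are skipped too.
            continue
        address, _kind, name = parts
        symbols[name] = int(address, 16)
    return symbols


sample = """\
40101300 T Cache_Read_Disable_2
40004678 A Cache_Read_Enable
40101340 T Cache_Read_Enable_2
401001b0 T Cache_Read_Enable_New
40100004 T call_user_start
40100254 T call_user_start_local
"""

symbols = parse_nm(sample)
entry_point = symbols["call_user_start"]
# The excerpt contains neither _data_start nor _rodata_start, so these are None.
data_start = symbols.get("_data_start")
rodata_start = symbols.get("_rodata_start")
print(hex(entry_point))
```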
Dependent on the boot mode it now prepares an image header:
Boot mode 0 (boot_v1.1)
Offset | Length | Content |
---|---|---|
0 | 1 | BIN_MAGIC_FLASH (0xE9) |
1 | 1 | Constant 3 |
2 | 1 | FlashMode (0:QIO, 1:QOUT, 2:DIO, 3:DOUT) |
3 | 1 | (FlashSizemap << 4) bitwise OR FlashClockDivider |
4 | 4 | Address of entry point call_user_start |
The FlashSizemap is again one of the known constants:
FlashSizemap | Flash Size | Mapping |
---|---|---|
0 | 512 KB | 256 KB + 256 KB |
1 | 256 KB | |
2 | 1024 KB | 512 KB + 512 KB |
3 | 2048 KB | 512 KB + 512 KB |
4 | 4096 KB | 512 KB + 512 KB |
5 | 2048 KB | 1024 KB + 1024 KB |
6 | 4096 KB | 1024 KB + 1024 KB |
as well as the clock divider:
FlashClockDivider | Factor |
---|---|
0 | 80 MHz / 2 |
1 | 80 MHz / 3 |
2 | 80 MHz / 4 |
15 | 80 MHz / 1 |
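The boot mode 0 header from the table above fits into a single struct.pack call. This is a sketch under the assumption that the fields are stored in little endian byte order; the function name flash_header is made up, and the entry point is the call_user_start address from the symbol dump shown earlier.

```python
import struct

BIN_MAGIC_FLASH = 0xE9


def flash_header(flash_mode, size_map, clock_divider, entry_point):
    """Build the 8 byte image header used in boot mode 0."""
    # Upper nibble: flash size map, lower nibble: flash clock divider.
    byte3 = ((size_map << 4) | clock_divider) & 0xFF
    # Four single bytes followed by a 32 bit entry point (little endian assumed).
    return struct.pack("<BBBBI", BIN_MAGIC_FLASH, 3, flash_mode & 0xFF,
                       byte3, entry_point)


# QIO mode, 4096 KB (1024 KB + 1024 KB) map, clock divider 0
header = flash_header(flash_mode=0, size_map=6, clock_divider=0,
                      entry_point=0x40100004)
print(header.hex())  # e903006004001040
```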
Boot mode 1 (boot_v1.2+)
Offset | Length | Content |
---|---|---|
0 | 1 | BIN_MAGIC_FLASH (0xE9) |
1 | 1 | Constant 3 |
2 | 1 | 0 |
3 | 1 | user_bin |
4 | 4 | Address of entry point call_user_start |
Boot mode 2 (none)
In this boot mode an additional header is added in front of the other image data:
Offset | Length | Content |
---|---|---|
0 | 1 | BIN_MAGIC_IROM (0xEA) |
1 | 1 | Constant 4 |
2 | 1 | 0 |
3 | 1 | user_bin |
4 | 4 | Address of entry point call_user_start |
Immediately following that header irom0.text.bin is written directly into the output file. This file is preceded by another header consisting of two 32 bit integers that form a start_offset = 0 and a length = (file_length + 15) & (~15) field
Offset | Length | Content |
---|---|---|
0 | 4 | Start offset of IROM section (0) |
4 | 4 | Length rounded to multiples of 16 |
After that the same header as for boot mode 0 (boot_v1.1) described above gets appended. All blocks whose length is not a multiple of 16 get filled up with zero bytes.
Following the header (or in boot mode none (2) after the irom header and irom0.text.bin) the .text, .data (only if _data_start as previously extracted is set) and .rodata sections are appended.
All section files get prepended with a start offset and length header:
Offset | Length | Content |
---|---|---|
0 | 4 | Start offset (see below) |
4 | 4 | Length rounded to multiples of 4 |
The start offset is either set to 0x40100000 in case of the .text binary or to the previously fetched _data_start or _rodata_start from the symbol table. This header instructs the bootloader to load the sections at the specified addresses.
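The section header and the rounding to multiples of 4 can be sketched as follows. Little endian byte order and zero bytes as padding are assumptions here, and section_block is a made up helper name rather than a function of the SDK script.

```python
import struct


def section_block(load_address, payload):
    """Prepend the start offset / length header and pad the payload."""
    # Round the length up to the next multiple of 4.
    padded_len = (len(payload) + 3) & ~3
    # Assumption: the padding consists of zero bytes.
    padded = payload + b"\x00" * (padded_len - len(payload))
    return struct.pack("<II", load_address, padded_len) + padded


# A 5 byte dummy payload that should be loaded at the start of instruction RAM.
block = section_block(0x40100000, b"\x01\x02\x03\x04\x05")
print(block.hex())
```

The 5 byte payload is padded to 8 bytes, so the whole block is 16 bytes long including the header.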
Note that all payload bytes (including padding but excluding headers) are combined into a simple checksum. This checksum is formed by simply XORing all data and padding bytes, starting from the magic initial value 0xEF.
The checksum byte is placed by first padding the file, excluding its last byte, with 0x00 bytes up to the flash data line size (i.e. 16 bytes). These padding bytes are not included in the checksum any more.
The last byte written is then the collected checksum created during the concatenation of the binary sections.
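The checksum and the final padding can be sketched in a few lines. This is only an illustration of the description above, with made up helper names, and not the code of the SDK script.

```python
CHECKSUM_INIT = 0xEF


def xor_checksum(payloads, initial=CHECKSUM_INIT):
    """XOR all payload bytes (data and padding, no headers) together."""
    checksum = initial
    for payload in payloads:
        for byte in payload:
            checksum ^= byte
    return checksum


def append_checksum(image, checksum, line_size=16):
    """Pad with 0x00 so that the checksum becomes the last byte of a line."""
    pad = (-(len(image) + 1)) % line_size
    return image + b"\x00" * pad + bytes([checksum])


checksum = xor_checksum([b"\x01\x02\x04"])
print(hex(checksum))  # 0xe8

image = append_checksum(b"\x00" * 24, checksum)
print(len(image))  # 32
```

The resulting length is always a multiple of the line size and the checksum is the very last byte.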
Only in boot mode 1 (boot_v1.2+) the flash file gets padded with 0xFF bytes up to 0x10000 bytes (i.e. 64 KByte). After that the content of eagle.app.v6.irom0text.bin simply gets appended byte by byte.
At the end, in boot modes boot_v1.2+ or none, a 32 bit cyclic redundancy check (CRC) value gets appended to the file. It is calculated as the CRC32 of the whole previously generated flash image file and is again appended in little endian order.
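Appending such a CRC can be sketched with zlib.crc32 from the Python standard library. Note that this shows a plain CRC32; whether the SDK script adjusts the value before writing it is not covered here, so treat the exact value as an assumption. The helper name append_crc32 is made up.

```python
import struct
import zlib


def append_crc32(image):
    """Append the CRC32 of the image as a little endian 32 bit value."""
    crc = zlib.crc32(image) & 0xFFFFFFFF
    return image + struct.pack("<I", crc)


# "123456789" is the usual CRC test input, its CRC32 is 0xCBF43926.
image = append_crc32(b"123456789")
print(image[-4:].hex())  # 2639f4cb
```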
After the script has generated eagle.app.flash.bin as described above, the only thing left is copying the resulting files into the ../bin/ directory
mv eagle.app.flash.bin ../bin/eagle.flash.bin
mv eagle.app.v6.irom0text.bin ../bin/eagle.irom0text.bin
and cleaning up
rm eagle.app.v6.*
The script then tells us where to put the files into flash memory:
eagle.flash.bin-------->0x00000
eagle.irom0text.bin---->0x10000
The build process when building FOTA binaries is very similar to the one above. The main difference is that the generated files are called user1 and user2, depending on which partition they should be written into, and the linker scripts supply the corresponding addresses (hence the build has to be run twice, once for user1 and once for user2). The other main difference is that the Python script generating the flash image is called with user_app=1 and user_app=2. This leads to a minor change inside the flash image header (in boot_mode 1 or 2).
The linker script used is eagle.app.v6.new.2048.ld. One can see that the script is adapted to only use half of the available memory.
At the end of the script run the makefile tells us where to flash the images:
boot.bin------------>0x00000
user1.4096.new.6.bin--->0x01000
or
boot.bin------------>0x00000
user2.4096.new.6.bin--->0x101000
Note that one normally does not flash user2 directly. That’s normally done via the cloud service during an upgrade.
Note that boot.bin is the FOTA bootloader binary provided by Espressif that can be found at ${SDKROOT}/bin. The files are currently called boot_v1.2, boot_v1.6 and boot_v1.7.bin. The version numbers may increase in the future.
Note that there are some additionally provided binaries that might be required to put the ESP8266 into a useable state:
blank.bin
esp_init_data_default_v05.bin
esp_init_data_default_v08.bin
Dipl.-Ing. Thomas Spielauer, Wien (webcomplains389t48957@tspi.at)