When developing bare metal applications it is required to supply some functions that we normally take for granted when developing code for mainstream OS’s. Setting the startup code is not inherently difficult but beware: some of the nastiest bugs you will ever see on bare metal can come from the startup code.
What is actually needed to start the execution of the main function? Well, there are a few things that the C and C++ language specifications assume when starting a new program. Some of them are:
- All uninitialized variables are zero. These are stored in the
.bsssection of the final elf file.
- All initialized variables are actually initialized. These are stored in the
.datasection of the final elf file.
- All static objects are initialized (they may need to get their constructors called if they are not trivial). Function pointers to these static initialization routines are stored in the
- The stack pointer is correctly set during startup. It actually necessary to set it up before even reaching the C or C++ code, since function definitions may store local variables, parameters and the return address on the stack. Depending on the ABI some of these may end up on processor registers for optimization purposes.
- Some other machine dependent features like enabling access to the floating point coprocessor (VFP coprocessor on most ARM microcontroller architectures).
For this article, we will examine the startup code needed when your code section (
.text) is placed into Flash memory, while the data (
.bss sections) is placed in SRAM, which is the most common scenario. Other memory layouts will need specific changes to make sure that the application can be run properly.
In the case of the arm-none-eabi-gcc toolchain, there are some sample startup codes for each processor located in the following directory (relative to the directory of the toolchain):
Default startup code is given in the assembly language for the target processor. It makes sense, however, reimplementing this code in C or C++, for the purposes of creating a more generic code that can be used for multiple devices. The sample code provided by ARM does the following tasks:
- Define the vector table for the NVIC (
__isr_vector). The NVIC is the interrupt controller. Upon an exception or interrupt it looks up the address of the corresponding ISR. This table contains the stack initialization value, the reset vector, all exception vectors, and external interrupt vectors. The code provides weak definitions that link the
default_handler(an endless loop) with all exceptions and interrupts. This serves as a trap for all undefined ISR’s by the application. When a system reset occurs, execution starts from the reset vector and the processor loads the value of the MSP (main stack pointer) with the highest ram address (defined by the linker as
- Initialize the
.bsssection to zero. It uses the symbols
__bss_end__to obtain the address range that should be set to zero. These symbols are defined in the linker script.
- Initialize the
.datasection. This section is defined in the linker script with different VMA (Virtual address) and LMA (Load address) since it has to be loaded to Flash, but used from RAM when the execution starts. To make sure that all C code can use the initialized data within the
.datasection, it has to be copied over from Flash to RAM by the startup code. To accomplish this it uses the following symbols defined by the linker:
__etext(this last symbol describes the end of the
.textsection, which is assumed to be the start address of the LMA of the
At the time of this writing (gcc-arm-none-eabi-7-2018-q2-update) the sample startup code doesn’t directly support C++. We will write the missing code so that C++ can be used with these processors without fear of running into trouble.
Adding support for C++ in the startup code (written in C++!)
In this section we will rewrite the startup code for an ARM Cortex-m core with added support for C++. First, let’s start with the NVIC Vector table. It can be defined as follows:
The code adds a
g_pfnVectors global vector table within the section
.isr_vector. This guarantees that this table will be placed at the start of the Flash memory since it is declared as an input section in the linker script before any other
.text section. This is where the processor will look for it right after booting.
__StackTop variable is actually defined in the linker script. We have simply declared its existence in our C++ code. Symbols defined in the linker script actually define the address of the variable matched by that symbol in C++, meaning that the variable
__StackTop will be located at the address determined by the symbol in the linker script. Since we need the location of the top of the stack, we store the address of the
The ResetHandler is the function that will get executed right after a system reset. When a reset occurs it will start executing the code at the address indicated by the
ResetHandler function pointer.
DEFINE_DEFAULT_ISR defines ISR functions that contain an endless loop. Since they are defined with the
weak attribute they can be easily overridden from the application code. Also, the interrupt attribute is used to indicate to the compiler that it needs to save the execution context before starting the execution of code within the ISR. It also restores the context before returning.
The next bit of code is the actual
ResetHandler. It looks like this:
This is all the startup code required. Keep in mind that this code doesn’t set the stack pointer. The stack pointer needs to hold a valid value before we can even begin to execute any C code. However, this is not necessary for these processors, since it gets set automatically by the processor using the value indicated in the NVIC vector table. We initialize the .data section using the
std::copy function from the C++ STL (This copies all values from the LMA to the VMA). The next bit of code fills all
.bss bytes with 0x00 by using the
Next is the initialization of static objects by calling their constructors. This step is only required for C++, since in C global variables can be initialized without calling any constructor and, as such, this step can be skipped (As the sample startup code provided within the arm-none-eabi-gcc toolchain does). However, keep in mind that if you forget to do this, static global objects will remain uninitialized and your program will not work as expected. For this, we are using the
std::for_each function, which acts as a wrapper for a for loop, making use of iterators.
Then we jump to the board initialization. In this case we are simply giving access to the floating point coprocessor. This may not be required if you are not using the coprocessor or you don’t have one.
The last step is to jump to the main function. Since the Gcc C++ compiler doesn’t allow explicit function calls for the main function I wrote this last part using inline ASM.
IMPORTANT NOTE: Never initialize the stack pointer again at the start of the
ResetHandler function if you write it in C or C++ (It must always be done before reaching any C code). Since
ResetHandler is actually a function it reserves memory for the local variables by moving the stack pointer upon function entry. Setting the stack pointer again after this effectively frees the memory allocated by the function for its local variables and may result in memory corruption that will haunt you all through the code. For example, it is not difficult to imagine that if the initialization value of 0x00 for the
.bss section was actually given through a local variable this variable would be freed after setting the stack pointer and subsequent calls to
std::copy may overwrite the value of the variable so that when your code reaches the
std::fill function the
.bss will be filled with some garbage values instead of 0.
Author Javier Alvarez