Electronic – ARM startup file attributes vs GCC command line arguments

assembly, cortex-m, gcc

For most GCC toolchains, the HAL packages for STM32 Cortex-M MCUs (in my case, STM32CubeF4) often bundle the Atollic TrueStudio startup assembly files.

I'm looking at startup_stm32f407xx.s, and it starts with the following section:

.syntax unified
.cpu cortex-m4
.fpu softvfp
.thumb

I want to rewrite the assembly startup file in C, as part of learning the Cortex-M startup process.

When compiling with the GCC ARM toolchain, or perhaps any other GCC-based toolchain, does that mean I have to pass the settings found in the assembly startup file as command-line arguments to arm-none-eabi-gcc, like this?

arm-none-eabi-gcc -mcpu=cortex-m4 -mfpu=softvfp -mthumb ...

Do these assembler lines correlate to the respective GCC arguments, or are they used for something entirely different?

Best Answer

This page gives a nice overview of the "special" assembler directives of the GNU ARM assembler.

As you suspected, these directives do essentially the same job as the compiler switches, and the same settings should be passed on the command line when you compile your C sources.

The ones used:

  • .syntax [unified | divided]: This directive sets the Instruction Set Syntax as described in the ARM-Instruction-Set section. (unified: ARM and THUMB use the same syntax)
  • .cpu cortex-m4: Select the target processor. Valid values for name are the same as for the -mcpu command-line option. Specifying .cpu clears any previously selected architecture extensions.
  • .fpu softvfp: Select the floating-point unit to assemble for. Valid values for name are the same as for the -mfpu command-line option.
  • .thumb: This performs the same action as .code 16.
  • .code 16: This directive selects the instruction set being generated. The value 16 selects Thumb, with the value 32 selecting ARM.

Some, if not all, of them can also be set via the command-line interface. My guess is that they were put into the assembly file to make sure it is assembled exactly the way it was intended. The inline directives take priority over the command-line switches.
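When the startup code is moved to C, one way to check that the command-line switches took effect is to look at the macros the compiler predefines. The following is only a sketch, not part of the ST startup file: describe_target_isa() and describe_float_support() are hypothetical helper names, and the macros tested (__thumb__, __arm__, __SOFTFP__, __ARM_FP) are ones GCC predefines for 32-bit ARM targets depending on the options in use.

```c
/* Hypothetical helper: reports which instruction set the compiler targets. */
const char *describe_target_isa(void)
{
#if defined(__thumb__)
    return "thumb";              /* compiler is generating Thumb code */
#elif defined(__arm__)
    return "arm";                /* 32-bit ARM target, ARM instruction set */
#else
    return "not an ARM target";  /* e.g. compiled with a host compiler */
#endif
}

/* Hypothetical helper: reports the floating-point configuration. */
const char *describe_float_support(void)
{
#if defined(__SOFTFP__)
    return "soft-float";         /* floating point done in software */
#elif defined(__ARM_FP)
    return "hardware FP";        /* FPU instructions may be emitted */
#else
    return "unknown";
#endif
}
```

Running arm-none-eabi-gcc with -dM -E on an empty input, together with your -mcpu and -mthumb switches, lists every predefined macro, which is a quick way to see what a given set of switches changes.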


As for a description of the startup process: I don't know whether you actually asked for this, but I felt like writing it up.

From a hardware point of view, reset is handled by the core itself and is described in the ARMv7-M Architecture Reference Manual (available upon registration). Section B1.5.5 explains the reset behaviour.

Asserting reset causes the processor to abandon the current execution state without saving it. On the deassertion of reset, all registers that have a defined reset value contain that value, and the processor performs the actions described by the TakeReset() pseudocode.

// TakeReset()
// ============
TakeReset()
    CurrentMode = Mode_Thread;
    PRIMASK<0> = '0'; /* priority mask cleared at reset */
    FAULTMASK<0> = '0'; /* fault mask cleared at reset */
    BASEPRI<7:0> = Zeros(8); /* base priority disabled at reset */
    if HaveFPExt() then /* initialize the Floating Point Extn */
        CONTROL<2:0> = '000'; /* FP inactive, stack is Main, thread is privileged */
        CPACR.cp10 = '00';
        CPACR.cp11 = '00';
        FPDSCR.AHP = '0';
        FPDSCR.DN = '0';
        FPDSCR.FZ = '0';
        FPDSCR.RMode = '00';
        FPCCR.ASPEN = '1';
        FPCCR.LSPEN = '1';
        FPCCR.LSPACT = '0';
        FPCAR = bits(32) UNKNOWN;
        FPFSR = bits(32) UNKNOWN;
        for i = 0 to 31
            S[i] = bits(32) UNKNOWN;
    else
        CONTROL<1:0> = '00'; /* current stack is Main, thread is privileged */
    for i = 0 to 511 /* all exceptions Inactive */
        ExceptionActive[i] = '0';
    ResetSCSRegs(); /* catch-all function for System Control Space reset */
    ClearExclusiveLocal(ProcessorID()); /* Synchronization (LDREX* / STREX*) monitor support */
    ClearEventRegister(); /* see WFE instruction for more details */
    for i = 0 to 12
        R[i] = bits(32) UNKNOWN;
    bits(32) vectortable = VTOR<31:7>:'0000000';
    SP_main = MemA_with_priv[vectortable, 4, AccType_VECTABLE] AND 0xFFFFFFFC<31:0>;
    SP_process = ((bits(30) UNKNOWN):'00');
    LR = 0xFFFFFFFF<31:0>; /* preset to an illegal exception return value */
    tmp = MemA_with_priv[vectortable+4, 4, AccType_VECTABLE];
    tbit = tmp<0>;
    APSR = bits(32) UNKNOWN; /* flags UNPREDICTABLE from reset */
    IPSR<8:0> = Zeros(9); /* Exception Number cleared */
    EPSR.T = tbit; /* T bit set from vector */
    EPSR.IT<7:0> = Zeros(8); /* IT/ICI bits cleared */
    BranchTo(tmp AND 0xFFFFFFFE<31:0>); /* address of reset service routine */

ExceptionActive[*] is a conceptual array of active flag bits for all exceptions, meaning it has active flags for the fixed-priority system exceptions, the configurable-priority system exceptions, and the external interrupts. The active flags for the fixed-priority exceptions are conceptual only, and are not required to exist in a system register.
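For the software side, the part of TakeReset() that matters most is how the first two words of the vector table are used. The sketch below models just those lines in plain C so they can be tried on a PC; reset_state_t and model_take_reset() are hypothetical names made up for this illustration, not anything from the ARM manual or from ST.

```c
#include <stdint.h>

/* Hypothetical model of what the core derives from the first two
 * vector table words at reset (see the TakeReset() pseudocode). */
typedef struct {
    uint32_t sp_main;    /* initial Main stack pointer */
    uint32_t pc;         /* address where execution starts */
    uint32_t thumb_bit;  /* value loaded into EPSR.T */
} reset_state_t;

reset_state_t model_take_reset(const uint32_t vector_table[2])
{
    reset_state_t s;
    s.sp_main   = vector_table[0] & 0xFFFFFFFCu;  /* SP_main, forced word-aligned */
    s.thumb_bit = vector_table[1] & 1u;           /* tbit = tmp<0> */
    s.pc        = vector_table[1] & 0xFFFFFFFEu;  /* BranchTo(tmp AND 0xFFFFFFFE) */
    return s;
}
```

With the illustrative words 0x20020000 and 0x08000189, this gives SP_main = 0x20020000, a start address of 0x08000188, and EPSR.T = 1. Since Cortex-M cores execute only Thumb code, the reset vector needs bit 0 set; GCC takes care of that when you store the address of a Thumb function in the table.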

The steps you have to take to implement the startup behaviour in software depend on the compiler and linker you use, so a general solution probably does not exist.

It usually consists of creating a reset and vector table structure and filling it with the right values. The stack pointer is the first value; it has to be initialized with the RAM address where your stack will start (the initial top of the stack, since the stack grows downward). This value is usually defined in the linker control file as an exported symbol. The next value is the address of your startup code, so you just put the function pointer to c_startup(), or whatever you call it, there.

What follows is a long list of function pointers to the individual interrupt service handlers. Take care not to skip positions in the table just because an ISR is not implemented. Initialize those entries either with 0 (things can go wrong if the exception ever fires) or with a "not-implemented" handler consisting of a while(true) loop (the safe way).
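A minimal sketch of such a table in C could look as follows. It relies on assumptions that are not in the original answer: the linker script is assumed to export a symbol named _estack and to collect a section named .isr_vector (the names ST's GCC linker scripts typically use), and vector_entry_t, Default_Handler and the host fallback are made up for illustration. Only the 16 architectural entries are shown; the device-specific interrupt vectors would follow.

```c
#include <stdint.h>

typedef void (*handler_t)(void);

/* Each entry is either the initial stack pointer or a handler address. */
typedef union {
    void     *stack_top;
    handler_t handler;
} vector_entry_t;

#if defined(__arm__)
extern uint32_t _estack;  /* assumption: exported by the linker script */
#define STACK_TOP      ((void *)&_estack)
#define VECTOR_SECTION __attribute__((section(".isr_vector"), used))
#else
/* Host fallback so the sketch can be compiled and inspected on a PC. */
static uint32_t host_stack[64];
#define STACK_TOP      ((void *)&host_stack[64])
#define VECTOR_SECTION
#endif

/* Safe "not-implemented" handler: park the core in an endless loop. */
void Default_Handler(void)
{
    for (;;) { }
}

/* Placeholder only; the real startup code initializes RAM and calls main(). */
void c_startup(void)
{
    for (;;) { }
}

const vector_entry_t vector_table[] VECTOR_SECTION = {
    { .stack_top = STACK_TOP },      /*  0: initial SP_main */
    { .handler = c_startup },        /*  1: Reset */
    { .handler = Default_Handler },  /*  2: NMI */
    { .handler = Default_Handler },  /*  3: HardFault */
    { .handler = Default_Handler },  /*  4: MemManage */
    { .handler = Default_Handler },  /*  5: BusFault */
    { .handler = Default_Handler },  /*  6: UsageFault */
    { .handler = 0 },                /*  7: reserved */
    { .handler = 0 },                /*  8: reserved */
    { .handler = 0 },                /*  9: reserved */
    { .handler = 0 },                /* 10: reserved */
    { .handler = Default_Handler },  /* 11: SVCall */
    { .handler = Default_Handler },  /* 12: Debug Monitor */
    { .handler = 0 },                /* 13: reserved */
    { .handler = Default_Handler },  /* 14: PendSV */
    { .handler = Default_Handler },  /* 15: SysTick */
    /* device interrupt handlers (IRQ0 onwards) follow here */
};
```

The union keeps the initializer a plain constant expression without casting between data and function pointers, and the reserved slots are written out explicitly so that no position in the table is skipped.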

After that, you have to fill c_startup() with life, which depends on the compiler. Typical things to do are: initializing RAM with the initial values of global variables, initializing the FPU, calling the constructors of static objects, and finally jumping to main().
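As a rough sketch of those steps, the helpers below are written as plain functions so they can also be exercised on a PC; the target-only part is guarded by __arm__. Several things here are assumptions rather than facts from the original answer: the linker symbols _sidata, _sdata, _edata, _sbss and _ebss (the names ST's GCC linker scripts typically export), the use of newlib's __libc_init_array() for static constructors, and the helper names copy_data(), zero_bss() and cpacr_with_fpu_enabled().

```c
#include <stdint.h>

/* Copy initialized globals from their load address (flash) into RAM. */
void copy_data(uint32_t *dst, const uint32_t *dst_end, const uint32_t *src)
{
    while (dst < dst_end) {
        *dst++ = *src++;
    }
}

/* Clear the zero-initialized globals. */
void zero_bss(uint32_t *dst, const uint32_t *dst_end)
{
    while (dst < dst_end) {
        *dst++ = 0u;
    }
}

/* CPACR value granting full access to CP10 and CP11 (bits 23:20 = 0b1111);
 * TakeReset() leaves both fields at '00', i.e. access denied. */
uint32_t cpacr_with_fpu_enabled(uint32_t cpacr)
{
    return cpacr | (0xFu << 20);
}

#if defined(__arm__)
/* Assumed linker-script symbols; adjust to your own linker file. */
extern uint32_t _sidata, _sdata, _edata, _sbss, _ebss;
extern void __libc_init_array(void);  /* assumption: newlib is linked in */
extern int main(void);

void c_startup(void)
{
    copy_data(&_sdata, &_edata, &_sidata);
    zero_bss(&_sbss, &_ebss);
#if defined(__ARM_FP)
    /* Only needed when the code is built to use the hardware FPU. */
    volatile uint32_t *cpacr = (volatile uint32_t *)0xE000ED88u;
    *cpacr = cpacr_with_fpu_enabled(*cpacr);
    __asm volatile ("dsb\n\tisb");
#endif
    __libc_init_array();  /* run static constructors */
    (void)main();
    for (;;) { }          /* main() should never return */
}
#endif
```

With .fpu softvfp, as in the file from the question, the FPU step is not needed, because the code never issues floating-point instructions.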

As a last step, you have to tell the linker to place your newly created super-structure at the address where the core expects the vector table (usually the first address of the flash, but this might vary depending on how the vector is fetched in the device).
