Writing ARM Assembly

= Overview =
This page covers the basics of writing ARM assembly on the OMAP platform for the GCC family of compilers and assemblers. If you have assembly in NASM format, you can port it using the guide at Porting NASM Assembly to GCC. For OMAP4 specifics, see Assembly Optimizations for OMAP4.

= Reference =
For assembly instruction references, refer to ARM's site http://infocenter.arm.com/help/index.jsp for the specific processor type in the OMAP you are using (generally OMAP1: ARM926EJ-S, OMAP2: ARM1136, OMAP3: ARM Cortex-A8, OMAP4: ARM Cortex-A9, though there are variants).

= Makefiles =
You'll need to make sure that your Makefile supports cross compiling with the ARM toolchain. See OMAP Platform Support Tools. When compiling or assembling the assembly files, be sure to set $(CC) and $(AS) to the cross tools.

CC=$(CROSS_COMPILE)gcc
AS=$(CROSS_COMPILE)as

= Assembly Files =
Assembly files are conventionally named with a .S or .s extension. Use .S (capital S) so that gcc passes the file through the C preprocessor before handing it to the assembler.

The parameters in the examples on this page are named r0-r3 to show which register each C parameter arrives in. Parameters beyond the fourth are passed on the stack. If you can't avoid having more than four, you can load the additional parameters from the stack into the r4-r11 registers in the prolog.
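The register and stack rule above can be sketched in C. This is only an illustration: it assumes every parameter is a 32-bit integer or pointer (64-bit and floating point parameters follow extra rules), and the names arg_slot and locate_arg are hypothetical, not part of any toolchain.

```c
#include <assert.h>

/* Where a 32-bit integer parameter lives when the function is entered. */
typedef struct {
    int in_register; /* 1 if passed in r0-r3, 0 if passed on the stack */
    int reg;         /* register number 0-3 when in_register is 1, else -1 */
    int sp_offset;   /* byte offset from sp on entry when on the stack, else -1 */
} arg_slot;

/* arg_index is 1-based: 1 is the first parameter. */
arg_slot locate_arg(int arg_index)
{
    assert(arg_index >= 1);
    arg_slot slot = {0, -1, -1};
    if (arg_index <= 4) {
        /* Parameters 1-4 arrive in r0-r3. */
        slot.in_register = 1;
        slot.reg = arg_index - 1;
    } else {
        /* Parameter 5 is at [sp], parameter 6 at [sp, #4], and so on. */
        slot.sp_offset = 4 * (arg_index - 5);
    }
    return slot;
}
```

For example, locate_arg(3) reports register r2, while locate_arg(6) reports a stack offset of 4 bytes from sp on entry.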

Comments
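Comments should be written either as /* comment */ or with the line comment character @, which comments out the rest of the line. A # only starts a comment when it is the first character on a line, and in .S files the C preprocessor may try to interpret such a line as a directive, so @ is the safer choice.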

Calling C functions from Assembly
Calling C functions from assembly is largely an issue of setting up the parameters correctly and then branching to the function.

In your C files, define your function (a separate declaration is only needed if other C functions call it).

int somefunc(int r0, int r1, int r2, int r3)
{
    // does something
    return 0;
}

In the assembly file:

.extern somefunc

And in the subroutine itself:

@ move the parameters manually into r0, r1, r2, r3
@ (bl overwrites lr, so save lr first if this routine has to return)
bl somefunc
@ the return value is in r0; r4-r11 are preserved by the callee

If you push additional parameters onto the stack, you must also remove them after the function call so that sp stays correct.

Calling Assembly Functions from C
First, declare your function in a C header file so that the C/C++ code can find its prototype (C++ callers need the declaration wrapped in extern "C").

/** This is a simple function which just returns 0 */
int function(int r0, int r1, int r2, int r3);

Second, you'll have to define the function symbol in the assembly file. Declaring it with .global exports the symbol so that the linker can resolve the reference from the C file.

.global function
function:

EABI Calling conventions
The EABI spec (http://en.wikipedia.org/wiki/Application_binary_interface#EABI) defines how functions are called, how stacks are used, which registers do what, etc. This allows assembly and C to link together successfully (even across different compilers which support EABI). The calling conventions can be found at http://en.wikipedia.org/wiki/Calling_convention#ARM. The EABI standard dictates that the ARM stack be "Full Descending", which means that stores decrement sp before writing and loads increment sp after reading. You can use the explicit addressing suffixes "DB" (on stores) and "IA" (on loads), or just "FD" on both.
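To make the "Full Descending" behavior concrete, here is a small C model of such a stack. It is a sketch only: the names fd_stack, fd_init, fd_push and fd_pop are hypothetical, and sp is modeled as an array index rather than a real address.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

typedef struct {
    uint32_t mem[STACK_WORDS];
    int sp; /* index of the last pushed word; STACK_WORDS means empty */
} fd_stack;

void fd_init(fd_stack *s)
{
    s->sp = STACK_WORDS; /* the stack starts at the top and grows downward */
}

/* Like a one-register stmdb/stmfd with writeback: decrement first, then store. */
void fd_push(fd_stack *s, uint32_t value)
{
    assert(s->sp > 0); /* no room left */
    s->sp -= 1;
    s->mem[s->sp] = value;
}

/* Like a one-register ldmia/ldmfd with writeback: load first, then increment. */
uint32_t fd_pop(fd_stack *s)
{
    assert(s->sp < STACK_WORDS); /* nothing to pop */
    uint32_t value = s->mem[s->sp];
    s->sp += 1;
    return value;
}
```

Because sp always points at the last word written ("full") and moves toward lower addresses on a push ("descending"), the most recently pushed value is always at offset 0 from sp.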

Prolog
The prolog typically saves the registers r4 through r11 (you can save as many as you need, but those are the typical ones). The ! suffix on sp writes the decremented address back to the stack pointer (sp).

stmdb sp!, {r4-r11} /* Pushes 8 "longs" onto the stack, decrementing sp before storing */

If there are additional parameters on the stack, you can load them after the stmdb instruction, but you'll need to offset the sp by the number of words you just pushed. On entry, parameter 5 is at [sp]; after pushing {r4-r11} it is 8 "longs" further up. This *assumes* that you use {r4-r11}.

ldr r4, [sp, #(4*8)] /* This loads parameter 5 which is 8 "longs" "up" on the stack now */
ldr r5, [sp, #(4*9)] /* This loads parameter 6 which is 9 "longs" "up" on the stack now */
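The offset arithmetic can be checked with a short C helper. This is a sketch under the assumption that all parameters and saved registers are 32-bit words, so the fifth parameter sits at [sp] on entry; the name stacked_arg_offset is hypothetical.

```c
#include <assert.h>

/*
 * Byte offset from sp, after the prolog has pushed saved_regs words,
 * of the 1-based stack parameter arg_index (5 or higher).
 */
int stacked_arg_offset(int arg_index, int saved_regs)
{
    assert(arg_index >= 5);
    assert(saved_regs >= 0);
    /* Skip the saved registers first, then the stack parameters before this one. */
    return 4 * (saved_regs + (arg_index - 5));
}
```

With {r4-r11} saved (8 words), stacked_arg_offset(5, 8) gives 32 bytes. If lr is saved as well (9 words), parameter 5 moves to 36 bytes.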

Epilog
The epilog restores the previous register set from the stack back to the registers and updates the sp value.

ldmia sp!, {r4-r11}

Return
The return places the return value into r0 and moves the lr (the return address) into the pc. This will cause the next instruction fetched to be the instruction after the call to the function.

mov r0, #0
mov pc, lr

Optimized Return
You can reduce your code size by also popping the LR from the stack back into the PC, which also acts as the "return" statement. Here I use the "FD" stack mode.

stmfd sp!, {r4-r11,lr} @ stack save + return address
...
ldr r4, [sp, #(4*9)] @ use 9 as the offset to parameter 5, since we're saving 9 ints now
...
ldmfd sp!, {r4-r11,pc} @ stack restore + return

Register Renaming
With GAS-style assemblers, you can rename registers to aid readability.

name .req register

Example:

pixels .req r0
width .req r1
height .req r2

mul pixels, width, height

Complete Listing
.global function
function:
    @ prolog
    stmdb sp!, {r4-r11}
    ...
    @ epilog
    ldmia sp!, {r4-r11}
    @ return value goes into r0, here it's zero
    mov r0, #0
    mov pc, lr

Defining Strings
The assembler allows you to define strings in the format (with special characters):

.global final_message
final_message:
    .string "Sorry for the Inconvenience\n"

Use a label before the string in order to reference it.

Defining Constants
The GNU assembler takes constants in the form of .equ symbol, value

Such that you could do this (capitalization is optional):

.equ ANSWER_TO_LIFE_UNIVERSE_EVERYTHING, 42

Defining Data Arrays
When you need to define large static arrays of data (tables, precomputed values, multiple constants, etc.), you can emit them with data directives under a label. The data is placed in whichever section is current at that point, which is not necessarily the .data section.

.global my_array
my_array:
    .long 127
    .long 28
    .long 94
    .long 23

This symbol can then be used to load the values into registers for use in calculations, etc.

ldr r4, =my_array
ldr r5, [r4, #0x0]
ldr r6, [r4, #0x4]
ldr r7, [r4, #0x8]
ldr r8, [r4, #0xC]
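The loads above use byte offsets from the base address, so offset 0x4 selects the second 32-bit entry. A rough C equivalent is sketched below; the C array is only a stand-in for the assembly table, and the helper name load_word is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the my_array table defined in the assembly file. */
static const uint32_t my_array[4] = {127, 28, 94, 23};

/* Mirrors "ldr rX, [base, #byte_offset]" for word-aligned offsets. */
uint32_t load_word(const uint32_t *base, size_t byte_offset)
{
    assert(base != NULL);
    assert(byte_offset % 4 == 0); /* each .long entry is 4 bytes wide */
    return base[byte_offset / 4];
}
```

Here load_word(my_array, 0x8) corresponds to the value that lands in r7 in the listing above.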

Types
Each directive takes zero or more expressions, separated by commas.

.byte 247 /* is 8 bit */
.hword 2098 /* is 16 bit (on ARM, .word is 32 bit) */
.long 10238476 /* is 32 bit */
.quad 23487928374 /* is 64 bit */
.octa 928374928734982734 /* is 128 bit */
.float 3.141528 /* is 32 bit IEEE floating point. */

.byte 0xEF, 0xBE, 0xAD, 0xDE /* Byte sequence 0xDEADBEEF in LITTLE ENDIAN */
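To see why that byte order yields 0xDEADBEEF, here is a small C sketch that assembles four bytes into a 32-bit value the way a little-endian load does. The function name load_le32 is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine 4 bytes stored lowest-address-first into one 32-bit value. */
uint32_t load_le32(const uint8_t *bytes)
{
    assert(bytes != NULL);
    return (uint32_t)bytes[0]          /* lowest address holds the least significant byte */
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24); /* highest address holds the most significant byte */
}
```

Passing the bytes 0xEF, 0xBE, 0xAD, 0xDE in that order produces 0xDEADBEEF.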

Defining Macros
The GNU assembler also allows macros which can be used to simplify some assembly routines.

.macro name operand [,operand,...]
[instructions]
.endm

Here's an example that computes a 4 value average:

/* avg = (a+b+c+d)/4; */
.macro average avg,sum,a,b,c,d
    add \sum, \a, \b
    add \sum, \c, \sum
    add \sum, \d, \sum
    mov \avg, \sum, lsr #2
.endm

Odds n' Ends
You should define your assembly file with .text at the beginning and .end at the end.

.text
...
.end

Enabling NEON
If you are assembling ARMv7 or NEON instructions, you must say so in the Makefile in the AFLAGS, using -march=armv7-a for ARMv7 and -mfpu=neon for NEON. You can also state so in the assembly file as:

.arch armv7-a
.fpu neon

Register Usage
There's a good table reference for which registers are used for what in GCC (during inline assembly at least) at, under "Register Usage".