Skip to content
Published On:
Jun 28, 2013
Last Updated:
Apr 21, 2025

If you are writing assembly for ARM processors, make sure to check out the GNU ARM Assembler Quick Reference.

An example of assembly code from the glibc library (an optimised memcpy() function).

Intel vs. AT&T Syntax

Intel Syntax

Intel syntax is the standard in the Windows ecosystem. Intel syntax always puts the destination address before the source address.1

instr dest, source
mov eax, [ecx]

If you see square brackets [] and/or the words WORD or PTR in the assembly code, it’s likely to be Intel syntax.

AT&T Syntax

AT&T syntax is the standard in the Unix ecosystem. AT&T syntax always puts the source address before the destination address. It also requires a % in-front of every register access, and a $ in-front of immediate operands.1

instr source, dest
movl (%ecx), %eax

If you see round brackets () and/or a % in-front of the operand, it’s likely to be AT&T syntax. AT&T is the default syntax generated by GCC.1

This page will use Intel syntax unless otherwise specified.

NASM

The NASM assembly language is a very popular language that works on the x86 and x86-64 architectures. It has similar syntax to Intel’s x86/x86-64 assembly, but it designed to be more “user friendly”.

Compiling NASM Assembly

You can download the NASM compiler for Ubuntu and other Debian-like systems with:

Terminal window
$ sudo apt install nasm

To compile a NASM .asm file, first convert it to an object file with the nasm program:

Terminal window
$ nasm -f elf test.asm

OR for a 64-bit object file:

Terminal window
$ nasm -f elf64 test.asm

This will produce an object file called myfile.o.

You then have two options for linking:

  1. Use the linker program ld.
  2. Use gcc. This automatically links against the C standard library for you.

Using gcc:

32-bit object files:

Terminal window
$ gcc -m32 test.o -o test

For 64-bit object files:

Terminal window
$ gcc -m64 test.o -o test

Hello World (using NASM)

This Hello, World example will be as bare-bones as possible, and will not even use the printf() function call (instead it will make a Linux system call directly).

First, we need a .section text to tell the compiler that this is our executable code.

.section text

We then need to define an entry point for our program:

global _start
_start:
... more code here ...

Now, printing Hello, world! requires the use of a Linux system call to print the text to stdout (specifically, sys_write). We will use the Linux fastcall convention which allows us to put the input arguments into registers, rather than on the stack. The system call number is placed in eax, and the arguments in the successive registers ebx, ecx, e.t.c. The function number for sys_write is 4 (see here). We need to place the file we wish to write to in ebx, a pointer to the message (char *) in ecx, and the number of characters in edx.

section .text
global _start
_start:
mov edx, <msg size>
mov ecx, <pointer to msg>
mov ebx, 1
mov eax, 4

Note that I have assigned the registers in reverse order (starting at edx and going to eax). This is not necessary (any order would work), however this practise stems from the sequence required making a system call by pushing to the stack instead, in where this order is important. I just kept the same practise for good readability.

So, now we need to place the message in memory somewhere, and also find out it’s size, so we can fill in <pointer to msg> and <msg size>. We can create a data section just for this purpose:

section .text
global _start
_start:
mov edx, <msg size>
mov ecx, <pointer to msg>
mov ebx, 1 ; File descriptor 1 is stdout
mov eax, 4 ; System call 4 is sys_write
section .data
msg db 'Hello, world!' ; Creates a string of characters. Note: No null terminator.
len equ $ - msg ; Calculate the length of the string (note, this does not store anything in memory)

db is a NASM instruction which stands for define bytes. It defines a sequence of bytes in memory. equ just tells the compiler to calculate the length of the message and save it to len (called a symbol), which can then be used in the code (it does not save the value to memory, it is similar to #define in C).

Lets use the new msg variable in our code:

section .text
global _start
_start:
mov edx, len
mov ecx, msg
mov ebx, 1 ; File descriptor 1 is stdout
mov eax, 4 ; System call 4 is sys_write
section .data
msg db 'Hello, world!' ; Creates a string of characters. Note: No null terminator.
len equ $ - msg ; Calculate the length of the string (note, this does not store anything in memory)

Great, everything is setup for a system call! But how do we perform a system call? In Linux, a system call is made by triggering the 0x80 interrupt, so we will do just that:

section .text
global _start
_start:
mov edx, len
mov ecx, msg
mov ebx, 1 ; File descriptor 1 is stdout
mov eax, 4 ; System call 4 is sys_write
int 0x80
section .data
msg db 'Hello, world!' ; Creates a string of characters. Note: No null terminator.
len equ $ - msg ; Calculate the length of the string (note, this does not store anything in memory)

Running the above code should print Hello, world! to stdout. But your program might then just seg fault. This is because we have not exited cleanly. To do this, we can make another sys call, this time to sys_exit (0x01).

section .text
global _start
_start:
mov edx, len ; len was calculated in .data section
mov ecx, msg ; Pointer to array of chars in .data section
mov ebx, 1 ; File descriptor 1 is stdout
mov eax, 4 ; System call 4 is sys_write
int 0x80 ; sys call
mov ebx, 0 ; Return code of o.k.
mov eax, 1 ; sys_exit function number
int 0x80 ; sys call
section .data
msg db 'Hello, world!' ; Creates a string of characters. Note: No null terminator.
len equ $ - msg ; Calculate the length of the string (note, this does not store anything in memory)

All done! Run this code online at https://www.tutorialspoint.com/tpcg.php?p=qjMuBp.

Hello World With printf()

Because we want to know use a standard library function, we are going to take a slightly different approach.

Create a test.asm file with the following contents:

extern printf ; Make sure to link with gcc so that standard library is automatically linked against
SECTION .data
msg: db "Hello, world!",0xA,0x0 ; 0xA = new line, 0x0 = end of string
SECTION .text ; Code section
global main
main: ; The program label for the entry point
push ebp ; Save stack base pointer
mov ebp, esp ; Move stack pointer onto stack base pointer
push msg ; Last thing on stack is printf formatting string
call printf
mov esp, ebp ; Unwind stack
pop ebp ; Replace base pointer
mov eax,0 ; Return error code of 0 (no error)
ret ; Return from main()

Now compile and run with the following commands:

Terminal window
$ nasm -f elf test.asm
$ gcc test.o -m32 -o test
$ ./test
Hello, world!

x86 Assembly

x86 Register Names

Knowing a little history about the x86 architecture helps to understand the naming of the registers. Initially it was 16-bits, and registers had names like ax, bx, cx, dx, etc. Then it was upgraded to 32-bits. The existing register names were kept as is, but new 32-bit register names were added which are prefixed with e. For example, eax is the 32-bit accumulator register. ax is the 16-bit accumulator register, and uses the lower 2 bytes of eax. The same process happened again when it was later extended to 64-bits, with the addition of r prefixes (e.g. rax).

This is a placeholder for the reference: tbl-x86-registers lists the x86 registers.2

64-bit32-bit16-bit8-bitDescription
raxeaxaxah/alAccumulator. Instructions like imul automatically save the result to this register.
rbxebxbxbh/blBase
rcxecxcxch/clCounter
rdxedxdxdh/dlData
rspespspsplStack Pointer
rbpebpbpbplBase Pointer
rsiesisisilSource Index
rdiedididilDestination Index
x86 register names.

In x86 language, a byte is 8-bits, a word is 16-bits, a double word is 32-bits, and a quad word is 64-bits. This terminology remains the same across 16-bit, 32-bit and 64-bit x86 architectures. All of the 8-bit registers listed in This is a placeholder for the reference: tbl-x86-registers access either the lower or higher 8-bits of the corresponding 16-bit register. For example, ah accesses the higher 8-bits of the ax register and al accesses the lower 8-bits.

rsp and rdb

rsp is the stack pointer register. The stack grows downwards, which means as it gets larger, the stack pointer register address gets smaller (i.e. closer to 0). push decrements rsp (and copies the operand onto the top of the stack) and pop copies the top of the stack into the operand and increments rsp (makes the stack smaller). It is also valid to manipulate rsp directly rather than using push/pop. For example, you can use sub rsp, 8 to make the stack smaller by 8 bytes.

rbp is the base pointer register. It points to the base of the current stack frame. The callee is responsible for preserving the value of rbp. Thus at the start and end of many functions, you will see the following assembly:

myFunction():
push rbp
mov rbp, rsp
; Rest of function code...
pop rbp
ret

x86 ABI

In x86-64, the first 6 function parameters are passed in registers. If there are more than 6, then the remaining parameters are passed on the stack. In x86-32, all function parameters are passed on the stack.

x86-64 Calling Conventions

There are two main calling conventions used for the x86-64 architecture:3

  1. System V AMD64 ABI
  2. Windows x64 ABI

Basic x86-64 Assembly Function Example

Let’s take a look at what a simple C function compiles to in x86-64 assembly. This assumes the compiler is following the System V AMD64 ABI (e.g. Linux, macOS).

Let’s start with the C function:

int square(int num) {
return num * num;
}

Compiling with x86-64 gcc 14.2 on godbolt.org gives us the following assembly code:

square(int):
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-4], edi
mov eax, DWORD PTR [rbp-4]
imul eax, eax
pop rbp
ret

Note that:

  • edi destination index.
  • esi source index.
  • rsp is the stack pointer. push decrements rsp and copies the operand onto the top of the stack. pop copies the top of the stack into the operand and increments rsp.
  • rdp points to the base of the current stack frame.
  1. push rdp saves the address of the previous stack frame base pointer (which is currently in rbp) onto the stack. This is preserving the callers base pointer so it can be restored just before the callee returns (i.e. pop rbp).
  2. mov rbp, rsp saves rsp into rbp to create a new stack frame. Now, rbp serves as a stable reference point for accessing local variables and arguments within this function’s stack frame.
  3. mov DWORD PTR [rbp-4], edi copies the value of edi (the first argument) into the memory location [rbp-4]. According to the System V AMD64 calling convention, the first integer/pointer argument passed to a function is placed in the rdi register. edi refers to the lower 32 bits of the rdi register. Since the function takes an int (32 bits), the value is in edi. At higher optimization levels this operation may not be needed.
  4. mov eax, DWORD PTR [rbp-4] copies the value of [rbp-4] into eax.
  5. imul eax, eax multiplies the value of eax by itself and stores the result back in eax. eax happens to be the register the caller expects the result to be stored in, so this does not need to be moved anywhere.
  6. pop rbp restores the previous stack frame base pointer.
  7. ret pops the return address off the stack and jumps to it. The return address would have been placed onto the stack by the caller when call was executed.

Footnotes

  1. StackOverflow (2021, Jan 22). Limitations of Intel Assembly Syntax Compared to AT&T [forum post]. Retrieved 2025-04-23, from https://stackoverflow.com/questions/972602/limitations-of-intel-assembly-syntax-compared-to-att. 2 3

  2. StackOverflow. Assembly registers in 64-bit architecture [forum post]. Retrieved 2025-04-22, from https://stackoverflow.com/questions/20637569/assembly-registers-in-64-bit-architecture.

  3. Wikipedia (2025, Mar 19). x86 calling conventions. Retrieved 2025-04-22, from https://en.wikipedia.org/wiki/X86_calling_conventions.