[C++] Simple functions in x64 assembly
To learn x64 assembly (asm) I’ll document the disassembly of some simple C++ functions.
The examples were compiled on Godbolt with MSVC’s latest version (v19.4) using O0 and O2.
Identity Function
auto identity(int x) {
return x;
}
A simple identity function.
x$ = 8
identity(int) PROC ; identity
mov DWORD PTR [rsp+8], ecx
mov eax, DWORD PTR x$[rsp]
ret 0
identity(int) ENDP ; identity
Let’s go through it line by line.
The code uses MASM syntax, which takes the form: instruction destination, source.
x$ = 8
x$ is a simple constant: the offset of x from the stack pointer, used further down as x$[rsp].
identity(int) PROC ; identity
This line marks the start of the function; the matching ENDP line marks its end.
mov DWORD PTR [rsp+8], ecx
The mov instruction copies a value from one place to another.
The source and destination can each be a register or memory, but not both memory at once.
Square brackets denote accessing memory.
Here we copy the value in register ecx into memory at the address 8 bytes above the stack pointer.
In C++ this would look roughly like:
*(rsp + 8) = ecx;
By convention, the first four integer or pointer parameters of a Windows x64 function are passed in registers rcx, rdx, r8, and r9.
In unoptimised builds these are then copied to stack memory when the function begins [1].
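As a rough illustration of the convention, here is a hypothetical five-parameter function (sum5 is my own example, not one of the compiled functions). Under the Windows x64 convention the first four int arguments would arrive in ecx, edx, r8d and r9d, and the fifth would be passed on the stack.

```cpp
// Hypothetical example: which argument goes where under the Windows x64 convention.
// a -> ecx, b -> edx, c -> r8d, d -> r9d, e -> passed on the stack by the caller.
constexpr int sum5(int a, int b, int c, int d, int e) {
    return a + b + c + d + e;
}
```

Compiling this on Godbolt should show the fifth parameter being read from stack memory rather than from a register.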
mov eax, DWORD PTR x$[rsp]
x is copied from memory to register eax for the return value (x$[rsp] is the same address as [rsp+8]).
Integer return values are stored in rax (eax is simply the lower 32 bits of the full 64-bit register).
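The relationship between eax and rax can be sketched in C++ with fixed-width integers. This is only a model, and the helper names low32 and write_eax are made up for illustration. It also shows a detail of x64: writing a 32-bit register clears the upper 32 bits of the full register.

```cpp
#include <cstdint>

// Models reading eax: keep only the low 32 bits of the 64-bit rax value.
constexpr std::uint32_t low32(std::uint64_t rax) {
    return static_cast<std::uint32_t>(rax);
}

// Models writing eax: the old upper 32 bits of rax are discarded (zeroed),
// so the old value does not influence the result.
constexpr std::uint64_t write_eax(std::uint64_t /*old_rax*/, std::uint32_t value) {
    return static_cast<std::uint64_t>(value);
}
```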
ret 0
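The ret instruction pops the return address off the stack and jumps to it, returning to the caller.
The 0 operand is the number of extra bytes to pop from the stack, which is none here.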
Now let’s look at the optimised version of the function.
x$ = 8
identity(int) PROC ; identity, COMDAT
mov eax, ecx
ret 0
identity(int) ENDP ; identity
x is just moved from ecx into eax. That’s it.
+1 function
auto add1(int x) {
return x + 1;
}
A simple increment function.
x$ = 8
add1(int) PROC ; add1
mov DWORD PTR [rsp+8], ecx
mov eax, DWORD PTR x$[rsp]
inc eax
ret 0
add1(int) ENDP ; add1
The inc instruction adds 1 to its only operand.
x$ = 8
add1(int) PROC ; add1, COMDAT
lea eax, DWORD PTR [rcx+1]
ret 0
add1(int) ENDP
With optimisations we encounter the lea instruction.
It stands for “load effective address” and it stores the result of the address expression on the right-hand side in the destination (it doesn’t actually access memory).
It’s intended for calculating memory offsets, but it’s often used for efficient arithmetic [2].
In this case we’re storing rcx+1 in the eax register so we can return immediately.
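An address expression can combine a base register, an index register scaled by 1, 2, 4 or 8, and a constant, so small multiplications are also candidates for lea. The functions below are hypothetical examples of my own; whether a given compiler actually emits lea for them depends on its version and flags.

```cpp
// Hypothetical candidates for lea-based arithmetic.
// x * 5 fits the pattern base + index*4, e.g. lea eax, [rcx+rcx*4].
constexpr int times5(int x) {
    return x * 5;
}

// x * 3 + 2 fits base + index*2 + constant, e.g. lea eax, [rcx+rcx*2+2].
constexpr int times3_plus2(int x) {
    return x * 3 + 2;
}
```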
Integer multiplication
auto multiply(int x, int y) {
int z{x * y};
return z;
}
Integer multiplication with a stack variable.
z$ = 0
x$ = 32
y$ = 40
multiply(int,int) PROC ; multiply
$LN3:
mov DWORD PTR [rsp+16], edx
mov DWORD PTR [rsp+8], ecx
sub rsp, 24
mov eax, DWORD PTR x$[rsp]
imul eax, DWORD PTR y$[rsp]
mov DWORD PTR z$[rsp], eax
mov eax, DWORD PTR z$[rsp]
add rsp, 24
ret 0
multiply(int,int) ENDP ; multiply
This example has more to go through but it’s still simple.
mov DWORD PTR [rsp+16], edx
mov DWORD PTR [rsp+8], ecx
sub rsp, 24
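This is the function prolog.
Parameters x and y arrive in ecx and edx, two of the four parameter registers, and are copied to stack memory above the return address.
The stack pointer is then reduced by 24 bytes. Only z lives in this new space, at offset 0; the rest is unused padding, and together with the 8-byte return address it leaves rsp a multiple of 16. This is also why x$ and y$ are now 32 and 40: the original offsets 8 and 16, plus 24.
All stack memory for the function is reserved up front.
Variables can then be accessed with fixed offsets from rsp instead of having to move it about.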
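The constants at the top of the listing (x$ = 32, y$ = 40) can be reproduced with a little arithmetic. The sketch below is only a model of that bookkeeping, and offset_after_prolog is a made-up helper name: a slot that sat at some offset from rsp on entry sits that much further away once rsp has been lowered by the frame size.

```cpp
#include <cstddef>

// Offset of a stack slot from rsp after the prolog has run "sub rsp, frame_size".
// offset_on_entry is the slot's offset from rsp when the function was entered.
constexpr std::size_t offset_after_prolog(std::size_t offset_on_entry,
                                          std::size_t frame_size) {
    return offset_on_entry + frame_size;
}

constexpr std::size_t kFrameSize = 24;                               // sub rsp, 24
constexpr std::size_t kXOffset = offset_after_prolog(8, kFrameSize);   // stored at [rsp+8] on entry
constexpr std::size_t kYOffset = offset_after_prolog(16, kFrameSize);  // stored at [rsp+16] on entry
```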
mov eax, DWORD PTR x$[rsp]
imul eax, DWORD PTR y$[rsp]
mov DWORD PTR z$[rsp], eax
mov eax, DWORD PTR z$[rsp]
x is moved from memory into the return register eax and then multiplied by y.
This result is moved into z’s address in memory before being moved back to eax as the return value.
Somewhat wasteful.
add rsp, 24
ret 0
We reset the stack pointer to its original address and return.
x$ = 8
y$ = 16
multiply(int,int) PROC ; multiply, COMDAT
imul ecx, edx
mov eax, ecx
ret 0
multiply(int,int) ENDP ; multiply
For the optimised version the two parameters are multiplied directly in the registers (imul ecx, edx leaves the product in ecx) and the result is moved to eax.