Solving Sum To N in x86‑64 Assembly
My solution to the problem: write a program in x86-64 Assembly which given an integer n returns the sum of numbers from 1 to n inclusive. This problem is part of CSPrimer. Test cases for this problem are not included in this post. This solution computes the sum of numbers from N down to 1. This post shares my thought process when solving this problem but does not explain assembly basics, so if you don’t know a bit of assembly, this post might be tricky to follow.
My solution:
section .text
global sum_to_n
sum_to_n:
xor rax, rax
loop:
cmp rdi, 0
jle done
add rax, rdi
sub rdi, 1
jmp loop
done:
ret
The function sum_to_n calculates the sum of all integers from n down to 1. The function expects an integer n to be passed in the rdi register and returns the result in the rax register.
Steps:
Initialize sum (rax) to 0.
Loop:
Add the current value of rdi to rax.
Subtract 1 from rdi.
Continue looping until rdi is less than or equal to 0.
Once the loop is finished, the sum is returned in rax.
Solution line by line
section .text
This tells the assembler that the code is part of the text section. In assembly programs, code is divided up into different sections such as .text, for instructions, or .data for data.
global sum_to_n
This tells the assembler that sum_to_n should be visible to the linker, so we can access it from other files.
sum_to_n:
Here I’ve created a label for the sum_to_n function. When this function is called, we’ll jump to this location.
xor rax, rax
We need an accumulator for this function. Meet RAX, our accumulator register, where we can store the return value of our function. The CPU uses registers to store and manipulate data as registers are faster than RAM. There’s much more to it than that, but for the sake of brevity, we’ll leave it there. By using XOR, we zero out our register. Our XOR is performing a bitwise XOR. For this, at first I wrote mov rax, 0, which is moving 0 into rax or setting it to 0, but I read that using XOR is faster as it doesn’t access memory, so I used this instead.
Next, enter our loop and our compare:
loop:
cmp rdi, 0
What is RDI? RDI is a register and it is used to pass the first function argument.
Our loop is another label, it marks a location in our code that can be jumped to. Later when we say JMP LOOP, we’re saying, jump to the location of loop and start executing there. This allows us to create our loop, a construct that allows our code to keep repeating until our specified condition is met. In this case, if RDI is less than or equal to zero, we’ll jump to done and exit our loop. But if it’s not, we’ll keep looping.
For example, if N = 5, that means RDI will contain 5. In other words if 5 is passed into our function, RDI will contain 5.
We also have CMP, compare. Here we are checking if RDI is zero. We do this because we need to stop our loop when RDI reaches zero. For each iteration of our loop, we add RDI to RAX, the sum, and we subtract 1 from RDI. Eventually, RDI will reach 0.
jle done
This is our IF condition. If RDI is less than or equal to zero, we jump to done and exit our loop. This is where we’ll exit our loop and return the sum. If RDI is less than or equal to zero, we exit our loop and head to our DONE label, and then we return the sum, with our RET line of code. If RDI is greater than 0, we’ll continue onto the next line and add RAX and RDI again.
add rax, rdi
Here we add RDI to RAX, with RAX being our sum and RDI being our current number. The result will update RAX.
sub rdi, 1
Now we subtract 1 from RDI. This our count down, where we decrement RDI.
jmp loop
Here we are just saying, jump back to loop. This is where we start our loop again.
done:
ret
RET tells the CPU to return from the function. We’ll return the sum, which is stored in RAX. Function return values are automatically stored in RAX, so we can just say RET, rather than RET RAX.
As I’m not an expert in assembly (well, who is these days?), writing this code took a painstakingly long time. But I’m much more confident in my understanding of assembly now than I was before I tried to solve this problem.
If you have any additional insights or a different way of thinking about this problem, feel free to comment.