Looping
We're going to write a program to calculate exponents. Before we can do that, we need to discuss looping.
A loop is a series of instructions which can be executed repeatedly. When execution reaches the end, it jumps back to the beginning and runs the loop again. This continues until some condition is met that causes the loop to stop. Take a look at the following minimal example:
mov rax, 0
loop_start:
add rax, 2
jmp loop_start
This snippet starts by setting rax to 0. Then, it continually adds 2 to rax. Let's look at it in more detail:
mov rax, 0
Before the loop begins, the register rax is set to 0.
loop_start:
This is a label, marking the start of the loop. We can jump to this label any time we want the loop to run again.
add rax, 2
This is the body of the loop. This instruction is executed each time the loop runs. It adds 2 to the value of rax, so every time the loop runs, rax will be increased by 2. Since rax starts at 0, if the loop runs 10 times, rax will be set to 20 by the end.
jmp loop_start
This instruction causes the loop to start over. Every time this instruction is reached, execution jumps back to the beginning of the loop. This causes add rax, 2 to run over and over again.
There's just one problem with this loop: it never ends. This is what is known as an infinite loop, because it continues forever. If you were to run a program with this loop inside, that program would appear to freeze. The loop would run continually until the program or computer was forcibly halted.
In order to be useful, a loop must have some termination criteria: some condition under which the loop ceases to repeat. Take a look at the following updated snippet:
mov rax, 0
loop_start:
add rax, 2
cmp rax, 10
jl loop_start
This is a bit different. rax still starts at 0, and 2 is still added to rax each time the loop runs. But instead of the unconditional jump at the end of the loop, there is now a conditional jump, which only repeats the loop if a certain condition is met.
cmp rax, 10
This instruction compares the value of rax to 10. The possible results of this comparison are:
- rax could be greater than 10
- rax could be equal to 10
- rax could be less than 10
When this instruction runs, the result of the comparison is stored in the rflags register. This instruction does not act on the result of the comparison by itself; it sets things up so the next instruction can.
jl loop_start
This is a conditional jump. It only jumps to the given label loop_start: if rax is less than 10. If rax is greater than or equal to 10, the jump doesn't happen and the loop ends.
There are several variants of the conditional jump instruction. The jl instruction above stands for "jump if less than".
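As a rough illustration of those variants, the sketch below lists the most common signed conditional jumps in comments and then uses one of them, jne ("jump if not equal"), in place of jl. This is not part of the original snippet. It assumes a 64-bit Linux system and wraps the loop in a complete program that exits with rax as its status code, so the exit status should be 10.

```nasm
; Common conditional jumps that can follow "cmp a, b" (signed comparison):
;   je  label   ; jump if a is equal to b
;   jne label   ; jump if a is not equal to b
;   jl  label   ; jump if a is less than b
;   jle label   ; jump if a is less than or equal to b
;   jg  label   ; jump if a is greater than b
;   jge label   ; jump if a is greater than or equal to b

%define sys_exit 60

section .text
global _start

_start:
    mov rax, 0

loop_start:
    add rax, 2
    cmp rax, 10
    ; Repeat the loop until rax is exactly 10
    jne loop_start

    ; Exit, using the final value of rax as the status code
    mov rdi, rax
    mov rax, sys_exit
    syscall
```

Note that jne only works here because rax lands exactly on 10. If the loop added 3 each time, rax would skip past 10 and the loop would not stop there, which is one reason jl is the safer choice for this kind of loop.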
Altogether, this snippet's behavior can be summarized as follows:
- rax is set to its starting value of 0.
- rax is increased to 2.
- rax is less than 10, so the loop restarts.
- rax is increased to 4.
- rax is less than 10, so the loop restarts.
- rax is increased to 6.
- rax is less than 10, so the loop restarts.
- rax is increased to 8.
- rax is less than 10, so the loop restarts.
- rax is increased to 10.
- rax is not less than 10, so the loop does not restart.
The loop runs 5 times, adding 2 to rax each time, until rax reaches 10.
A basic loop with output
This small program is going to use a loop to print a line of text to the console over and over. Take a look at the following code:
%define sys_write 1
%define stdout 1
%define sys_exit 60
%define newline 10
section .data
output: db "Greetings!", newline
output_len: equ $-output
section .text
global _start
_start:
; The number of times to print the text out
mov rbx, 7
loop_start:
; Print the text to the console
mov rax, sys_write
mov rdi, stdout
mov rsi, output
mov rdx, output_len
syscall
; Decrement the loop counter
dec rbx
; Continue the loop while rbx > 0
cmp rbx, 0
jg loop_start
; Exit the program
mov rax, sys_exit
mov rdi, 0
syscall
We're using rbx to keep track of how many times to print the text. Each time the loop runs, it:
- Prints the text out
- Subtracts 1 from the loop counter rbx
- Starts over if rbx is still greater than 0
In closer detail:
section .data
output: db "Greetings!", newline
output_len: equ $-output
We've added a data section that defines two names:
- output - the string to print to the console
- output_len - the length of the output string in bytes, calculated by the assembler
; The number of times to print the text out
mov rbx, 7
This sets the number of times the loop will run and keeps track of when to stop repeating the loop.
loop_start:
This is the beginning of the loop.
; Print the text to the console
mov rax, sys_write
mov rdi, stdout
mov rsi, output
mov rdx, output_len
syscall
This is the body of the loop. Here we print the string output to the console. This will be executed repeatedly each time the loop runs.
; Decrement the loop counter
dec rbx
Now the loop counter in rbx has to be decremented to keep track of how many times the loop has run. Each time the loop runs, we subtract 1 from rbx. So at any point during the program, rbx contains the number of iterations left to run before the loop will be finished.
; Continue the loop while rbx > 0
cmp rbx, 0
jg loop_start
Here we check the loop counter against 0. If rbx is greater than 0, the loop continues. If rbx has reached 0, the loop ends.
; Exit the program
mov rax, sys_exit
mov rdi, 0
syscall
At this point, the loop will have run 7 times and then stopped. The program exits.
Type the program above into a file called "printspam.asm" and run it. You should see the text "Greetings!" written out 7 times. Try changing the printed text to something else. Also try changing the initial value of rbx from 7 to something else. Whatever value rbx starts with is the number of times the string will be printed.
However, there's actually a bug in this program: if you set rbx to 0, the string will still be printed once. This is because we're using the wrong loop style for the job. The loop in this program is called a do..while loop, which works like this:
- Print the output string
- Check if the loop should end yet and start over if not
Notice that the test to decide whether the loop should end doesn't happen until the end of the loop, after the string has already been printed. This means that the loop will always run at least one time, since we don't check if it should keep going until after it has already run. No matter what value you give to rbx, it will always print the string at least one time.
This style of loop is called a "do..while" loop. The conditional check happens at the end of the loop body, so the loop always runs at least once no matter what the result of the conditional check is.
We can solve this problem by using a while loop. A while loop is another style of loop where the conditional check happens at the beginning of the loop, before the print operation. The loop will be reorganized to look more like this:
- Check if the loop should end yet and jump out of the loop if so
- Print the output string
- Go back to step 1
By checking to see if the loop should end at the very beginning of the loop, we prevent the output string from printing at all if rbx is 0 or a negative number.
Take a look at the updated program:
%define sys_write 1
%define stdout 1
%define sys_exit 60
%define newline 10
section .data
output: db "Greetings!", newline
output_len: equ $-output
section .text
global _start
_start:
; The number of times to print the text out
mov rbx, 0
loop_start:
; Check if the loop should end yet
cmp rbx, 0
jle loop_stop
; Print the text to the console
mov rax, sys_write
mov rdi, stdout
mov rsi, output
mov rdx, output_len
syscall
; Decrement the loop counter
dec rbx
; Run the loop again
jmp loop_start
loop_stop:
; Exit the program
mov rax, sys_exit
mov rdi, 0
syscall
Now the conditional check happens at the beginning of the loop. Let's go over the changes in more detail:
; The number of times to print the text out
mov rbx, 0
rbx now starts at 0. The string should never be printed if the counter starts at 0.
loop_start:
This is the start of the loop.
; Check if the loop should end yet
cmp rbx, 0
jle loop_stop
At the very beginning of each loop iteration, we check to see if the loop should end yet.
The cmp instruction is the same as before, but we're using a different form of conditional jump. jle stands for "jump if less than or equal to". So once rbx hits 0, we jump out of the loop to the loop_stop: label.
As long as rbx is greater than 0, we run the loop body:
; Print the text to the console
mov rax, sys_write
mov rdi, stdout
mov rsi, output
mov rdx, output_len
syscall
The output string is printed to the console.
; Decrement the loop counter
dec rbx
The loop counter is decremented, to keep track of the number of times the loop has run.
; Run the loop again
jmp loop_start
Here we jump back to the start of the loop. This is an unconditional jump, meaning it jumps no matter what. Since we now check if the loop should continue at the beginning of the loop, we don't need to check it here at the end.
loop_stop:
This is where we jump when the loop ends. Once rbx hits 0, the jle instruction will jump here, breaking out of the loop.
Type the new program into a file (or edit the old one) and run it again. You should see that the bug has been fixed. If rbx is set to 0, the string never prints. If rbx is set to a positive integer (5, 7, etc.), the string is printed that number of times.
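The same while structure also works when counting upward instead of downward. The following is a minimal sketch, not part of the original program: rbx holds the number of lines printed so far and is compared against a limit at the top of the loop, using jge ("jump if greater than or equal to") to leave the loop. The limit of 3 is an arbitrary value chosen for illustration.

```nasm
%define sys_write 1
%define stdout 1
%define sys_exit 60
%define newline 10

section .data
    output: db "Greetings!", newline
    output_len: equ $-output

section .text
global _start

_start:
    ; rbx counts how many times the text has been printed so far
    mov rbx, 0

loop_start:
    ; Leave the loop once rbx has reached the limit (3 here)
    cmp rbx, 3
    jge loop_stop

    ; Print the text to the console
    mov rax, sys_write
    mov rdi, stdout
    mov rsi, output
    mov rdx, output_len
    syscall

    ; Count this iteration and run the loop again
    add rbx, 1
    jmp loop_start

loop_stop:
    ; Exit the program
    mov rax, sys_exit
    mov rdi, 0
    syscall
```

If the limit is changed to 0, the check at the top of the loop jumps to loop_stop: immediately and nothing is printed, just as in the countdown version.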
Exponents
Now we're going to write a program to calculate exponents. We want to be able to take an input like 2 and a power like 3 and calculate the result. With those example values, we should get 2 ^ 3 = 8. 2 ^ 3 is the same as 2 * 2 * 2, so we can calculate this by repeatedly multiplying a value by itself. To do this, we'll need to use a loop similar to the ones introduced above.
Take a look at the following program:
%define sys_exit 60
section .text
global _start
_start:
; Starting values: calculating 2 ^ 3
mov rbx, 2
mov rcx, 3
; This stores the result, which starts as 1
mov rax, 1
loop_start:
; Compare rcx to 0
cmp rcx, 0
; Break loop once rcx reaches 0
jle loop_stop
; Multiply rax by rbx, storing the result each time in rax
imul rax, rbx
; Decrement rcx
dec rcx
; Start the loop again
jmp loop_start
loop_stop:
; End the program
mov rdi, rax
mov rax, sys_exit
syscall
Let's break it down line-by-line:
mov rbx, 2
mov rcx, 3
These are the "inputs" of the program. Since they're hard-coded into the source, they're not technically inputs, but they are the values we'll be operating on. Since we're trying to calculate 2 ^ 3, both values need to be in registers so we can work with them.
The basic idea here is we're going to have a loop which repeatedly multiplies a running total by the value in rbx. rcx will keep track of how many times this multiplication still needs to occur.
mov rax, 1
rax will store the running total. We're going to multiply the value in rax by the value in rbx (2) the number of times specified by rcx (3). So we start with rax set to 1, since 1 * 2 * 2 * 2 = 2 ^ 3.
loop_start:
This marks the beginning of the loop. The loop body will be run 3 times. Execution will jump back to this point repeatedly, until rcx reaches 0 and the final answer is stored in rax.
; Compare rcx to 0
cmp rcx, 0
We start by checking if the exponent rcx has reached 0. Since rcx might start at 0 (for example, 3 ^ 0 = 1), we have to do this check before running the loop body.
; Break loop once rcx reaches 0
jle loop_stop
If rcx has reached 0, we end the loop by jumping out of it to the label loop_stop:. This means that the loop will continue running until rcx reaches 0.
; Multiply rax by rbx, storing the result each time in rax
imul rax, rbx
This is where the actual multiplication happens. Each time the loop runs, we multiply the running total stored in rax by the base number in rbx.
Each time a loop runs is called an iteration. Take a look at the following table, which lists each iteration and shows how rax increases each time as it's multiplied by 2:
Iteration | rax starting value | rax ending value |
---|---|---|
1 | 1 | 2 |
2 | 2 | 4 |
3 | 4 | 8 |
The loop runs three times, each time multiplying the value in rax by 2.
dec rcx
We don't want the loop to run forever, so we need a way of keeping track of how many times it's been run. We start the program by setting rcx to the value of 3 as that's how many times we want to run the loop. So each time the loop runs, we need to reduce the value of rcx by 1. That's what the dec instruction does: it subtracts 1 from whatever register you give it. This is also known as decrementing.
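Note: We could have also used the sub instruction like this:
sub rcx, 1
and it would have worked the same. For the purposes of this loop, dec rcx and sub rcx, 1 are equivalent. The one difference is that dec leaves the carry flag untouched, which doesn't matter here.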
See the following table which lists each iteration, including the value of rcx at the end of the loop (after the dec rcx line) for each one:
Iteration | rax ending value | rcx ending value |
---|---|---|
1 | 2 | 2 |
2 | 4 | 1 |
3 | 8 | 0 |
For the first two iterations, rcx is still greater than 0 at the end of the loop, so the loop continues and rax is multiplied by 2 again. At the end of the third iteration, rcx reaches 0, so the check at the top of the loop jumps out and the loop stops. At this point, rax is left with its final value of 8, which is the result of 2 ^ 3.
; Start the loop again
jmp loop_start
This is the end of the loop. We jump back to the beginning to keep the loop running.
loop_stop:
This is the label we jump to in order to end/break the loop. Once rcx hits 0, the conditional jump instruction jle will jump here, breaking out of the loop and preventing it from running again.
mov rdi, rax
mov rax, sys_exit
syscall
When the loop ends, the result will be stored in rax (and it should be 8). In order to return the value as an exit status code, it needs to be in rdi, so we move it there and then end the program. 8 gets returned as the status code.
Type the program into a file called "exponent.asm" and run it with the "run" script:
./run exponent
You should see 8 returned on the console. Try changing the values of rbx and rcx to calculate different exponents.
For example, try 7 ^ 3. Modify the lines at the beginning that set the inputs like this:
mov rbx, 7
mov rcx, 3
This should give us 7 * 7 * 7 = 343, right? Wrong! An exit status code is only one byte, which means it can only hold a number from 0 to 255. When we try to return 343, only the lowest 8 bits of the value are kept, so it wraps around past 255. Instead of returning 343, it returns 87, which is 343 - 256. This limitation makes status codes a bad way to get this kind of output from a program. Eventually, we'll work through converting numbers of (virtually) any size into ASCII strings and printing those out, but we're not quite there yet.
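The wraparound can be reproduced by hand. The following minimal sketch (not part of the original program) uses the and instruction to keep only the lowest 8 bits of rdi before exiting. The hard-coded 343 stands in for the result of 7 ^ 3, and a 64-bit Linux system is assumed.

```nasm
%define sys_exit 60

section .text
global _start

_start:
    ; The value we would like to report (7 ^ 3)
    mov rdi, 343

    ; Keep only the lowest 8 bits: 343 = 256 + 87, so rdi becomes 87
    and rdi, 0xFF

    ; Exit with rdi as the status code
    mov rax, sys_exit
    syscall
```

The reported status is 87 with or without the and instruction, since the same truncation is applied to the exit status anyway. Writing it out just makes the arithmetic visible.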