64-Bit Assembler Lab
- skhan4059
- Apr 23, 2022
Previously we used 6502 assembler to practice how assembly language works:
This time we are stepping it up and using aarch64 assembly to write a simple program. 64-bit assembly is much more complex, and its instructions are quite different from 6502's, but with the help of the internet I think we should be able to make everything work. A link to the lab is here:
We start off with a sample package called hello on the dedicated server for this course. We copy it and unpack it using tar, and it should look like this:
spo600
└── examples
    └── hello                     # "hello world" example programs
        ├── assembler
        │   ├── aarch64           # aarch64 gas assembly language version
        │   │   ├── hello.s
        │   │   └── Makefile
        │   ├── Makefile
        │   └── x86_64            # x86_64 assembly language versions
        │       ├── hello-gas.s   # ... gas syntax
        │       ├── hello-nasm.s  # ... nasm syntax
        │       └── Makefile
        └── c                     # Portable C versions
            ├── hello2.c          # ... using write()
            ├── hello3.c          # ... using syscall()
            ├── hello.c           # ... using printf()
            └── Makefile
The objective is to make a program that can loop and print out a single statement 10 times. We are given a template for the loop as follows:
.text
.global _start
min = 0     /* starting value for the loop index; note that this is a symbol (constant), not a variable */
max = 30    /* loop exits when the index hits this number (loop condition is i<max) */
_start:
    mov     x19, min
loop:
    /* ... body of the loop ... do something useful here ... */
    add     x19, x19, 1
    cmp     x19, max
    b.ne    loop
    mov     x0, 0           /* status -> 0 */
    mov     x8, 93          /* exit is syscall #93 */
    svc     0               /* invoke syscall */
The important part goes in the body of the loop, which in this case is the code to write the word "Loop" 10 times. I looked through hello.s in the aarch64 directory and combined its write code with this template, which resulted in:
.text
.global _start
min = 0     /* starting value for the loop index; note that this is a symbol (constant), not a variable */
max = 10    /* loop exits when the index hits this number (loop condition is i<max) */
_start:
    sub     sp, sp, 16
    mov     x19, min
loop:
    mov     x0, 1           /* file descriptor: 1 is stdout */
    adr     x1, msg         /* message location (memory address) */
    mov     x2, len         /* message length (bytes) */
    mov     x8, 64          /* write is syscall #64 */
    svc     0               /* invoke syscall */
    add     x19, x19, 1
    cmp     x19, max
    b.ne    loop
    add     sp, sp, 16
    mov     x0, 0           /* status -> 0 */
    mov     x8, 93          /* exit is syscall #93 */
    svc     0               /* invoke syscall */
.data
msg:    .ascii "Loop\n"
len = . - msg
which gives us the result of:
Loop
Loop
Loop
Loop
Loop
Loop
Loop
Loop
Loop
Loop
Just as expected.
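For comparison, here is a minimal C sketch of the same loop, in the spirit of the write() version in the sample package's c directory. The function name print_loop is hypothetical, and the sketch assumes a POSIX system that provides unistd.h. It only illustrates what the assembly is doing and is not part of the lab code.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes "Loop\n" to fd `count` times,
   mirroring the x19 loop counter in the assembly.
   Returns the total number of bytes written, or -1 on error. */
long print_loop(int fd, int count)
{
    const char msg[] = "Loop\n";
    const size_t len = strlen(msg);   /* same role as `len = . - msg` */
    long total = 0;

    for (int i = 0; i < count; i++) {
        ssize_t n = write(fd, msg, len);   /* the syscall the assembly invokes with svc 0 */
        if (n < 0)
            return -1;
        total += n;
    }
    return total;
}
```

Calling print_loop(1, 10) writes to standard output, just like the mov x0, 1 in the assembly version.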
The next part is quite a bit more difficult: it requires us to print the loop index on each iteration, from 0 all the way to 9. We need to modify our code to convert the value in the x19 register to its ASCII character equivalent. To do that we add 48 to it, because 48 is the ASCII code of the character 0, so the digits 0 to 9 map to the codes 48 to 57. After adding 48 we still need a place to store the result, since the write syscall takes the address of a buffer in memory rather than a register value, so we save it on the conveniently available stack.
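The add-48 step can be sketched in C like this. The function name digit_to_ascii is hypothetical, and the sketch assumes the input is a single digit from 0 to 9, which is all the loop needs at this stage.

```c
/* Hypothetical helper: converts a single digit (0-9) to its ASCII character
   by adding 48, the ASCII code of '0'. Values outside 0-9 are not handled. */
char digit_to_ascii(unsigned int digit)
{
    return (char)(digit + 48);   /* 48 == '0' in ASCII */
}
```

For example, digit_to_ascii(7) gives the character '7', whose ASCII code is 55.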
One thing to note here is that, compared to 6502 assembly, this is more complex but also a lot more intuitive. We no longer need to put each character at a specific address to show it on a screen; we can just write it out to standard output using a system call.
Here is the code that we get:
.text
.global _start
min = 0     /* starting value for the loop index; note that this is a symbol (constant), not a variable */
max = 10    /* loop exits when the index hits this number (loop condition is i<max) */
_start:
    sub     sp, sp, 16
    mov     x19, min
loop:
    add     x13, x19, 48    /* loop variable + 48 goes into register 13 (or whichever one you choose) */
    strb    w13, [sp]       /* register 13 stored into memory */
    mov     x0, 1           /* file descriptor: 1 is stdout */
    adr     x1, msg         /* message location (memory address) */
    mov     x2, len         /* message length (bytes) */
    mov     x8, 64          /* write is syscall #64 */
    svc     0               /* invoke syscall */
    mov     x0, 1           /* printing code */
    mov     x1, sp
    mov     x2, 1
    mov     x8, 64
    svc     0
    mov     x13, 10         /* stores ascii value of newline to x13 */
    strb    w13, [sp]       /* put value on stack */
    mov     x0, 1
    mov     x1, sp
    mov     x2, 1
    mov     x8, 64
    svc     0
    add     x19, x19, 1
    cmp     x19, max
    b.ne    loop
    add     sp, sp, 16
    mov     x0, 0           /* status -> 0 */
    mov     x8, 93          /* exit is syscall #93 */
    svc     0               /* invoke syscall */
.data
msg:    .ascii "Loop: "
len = . - msg
Basically, we use the x13 register to hold the loop index plus 48, save that value on the stack, and print it from there. This leads to another problem: we now need to print a newline after the number. I did this by following the same steps as for printing the number, except that this time we put the constant 10 (the ASCII newline) on the stack. The result is ten lines, from Loop: 0 to Loop: 9.
The next part has us do the same thing, but the program must now print up to 30 (with max = 30, that means the indexes 0 to 29). This proves to be quite the hurdle, as our current code can only handle single-digit numbers. At this point I got really stuck, so I decided to consult the internet, where I found a very useful thread:
The idea is to use integer division to get the tens digit by itself, then use the msub instruction, which multiplies the tens digit by ten and subtracts the product from the index, effectively giving us the ones digit. We then print each digit separately. The resulting code is:
.text
.global _start
min = 0     /* starting value for the loop index; note that this is a symbol (constant), not a variable */
max = 30    /* loop exits when the index hits this number (loop condition is i<max) */
_start:
    sub     sp, sp, 16
    mov     x19, min
    mov     x20, 10         /* divisor used to split the index into digits */
loop:
    mov     x0, 1           /* file descriptor: 1 is stdout */
    adr     x1, msg         /* message location (memory address) */
    mov     x2, len         /* message length (bytes) */
    mov     x8, 64          /* write is syscall #64 */
    svc     0               /* invoke syscall */
    cmp     x19, 9
    b.gt    nine_hundred    /* two-digit values take the division path */
zero_nine:
    add     x13, x19, 48    /* loop variable + 48 goes into register 13 (or whichever one you choose) */
    strb    w13, [sp]       /* register 13 stored into memory */
    mov     x0, 1           /* printing code */
    mov     x1, sp
    mov     x2, 1
    mov     x8, 64
    svc     0
    b       newline
nine_hundred:
    udiv    x21, x19, x20       /* x21 = index / 10 (tens digit) */
    msub    x22, x20, x21, x19  /* x22 = index - 10 * x21 (ones digit) */
    add     x13, x21, 48
    strb    w13, [sp]
    mov     x0, 1
    mov     x1, sp
    mov     x2, 1
    mov     x8, 64
    svc     0
    add     x13, x22, 48
    strb    w13, [sp]
    mov     x0, 1           /* reset the file descriptor: x0 holds write's return value after svc */
    mov     x1, sp
    mov     x2, 1
    mov     x8, 64
    svc     0
newline:
    mov     x13, 10
    strb    w13, [sp]
    mov     x0, 1
    mov     x1, sp
    mov     x2, 1
    mov     x8, 64
    svc     0
    add     x19, x19, 1
    cmp     x19, max
    b.ne    loop
    add     sp, sp, 16
    mov     x0, 0           /* status -> 0 */
    mov     x8, 93          /* exit is syscall #93 */
    svc     0               /* invoke syscall */
.data
msg:    .ascii "Loop: "
len = . - msg
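The division step in the listing above can be sketched in C as follows. The function name split_digits is hypothetical, and the sketch assumes a value from 10 to 99, the range that the division path handles.

```c
/* Hypothetical helper mirroring the udiv/msub pair for values 10..99. */
void split_digits(unsigned long n, unsigned long *tens, unsigned long *ones)
{
    *tens = n / 10;             /* udiv x21, x19, x20: integer division drops the remainder */
    *ones = n - 10 * (*tens);   /* msub x22, x20, x21, x19: index minus ten times the tens digit */
}
```

For n = 27, this gives tens = 2 and ones = 7, and each digit then gets 48 added before it is printed.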
x13 holds each character before it is stored on the stack for printing, x21 holds the tens digit, and x22 holds the ones digit; as in the single-digit code, we print each digit separately. One limitation down the road is that this code will not print correctly once the number reaches 100. In theory we could repeat the divide-and-remainder step in a loop so that numbers of any length are printed, but unfortunately that is over my head for now, so I won't be attempting it here, though it sounds like a good challenge for when I have a better understanding of assembly. The result is thirty lines, from Loop: 0 to Loop: 29.

And everything works as intended. This lab was actually fairly difficult to understand at first: aarch64 is so much more complex than 6502 that it took me a while to adjust, even though it is also more intuitive. I must say I much prefer this over 6502, and it seems like there is a lot more you can do with access to more diverse instructions and more registers to play with. Well, that's it for this lab; I hope you enjoyed reading this.
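As a footnote to the over-99 challenge mentioned earlier, one possible approach is to repeat the divide-and-remainder step until nothing is left. The sketch below is in C rather than assembly and was not part of the lab; the function name to_decimal and the 20-byte buffer size are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: writes the decimal digits of n into buf (no NUL terminator)
   and returns how many characters were produced. buf must hold at least 20 bytes,
   enough for any 64-bit unsigned value. */
size_t to_decimal(unsigned long long n, char *buf)
{
    char tmp[20];
    size_t count = 0;

    /* Peel off the ones digit each time; digits come out in reverse order. */
    do {
        tmp[count++] = (char)('0' + n % 10);
        n /= 10;
    } while (n != 0);

    /* Copy them back in printing order. */
    for (size_t i = 0; i < count; i++)
        buf[i] = tmp[count - 1 - i];

    return count;
}
```

The same idea would carry over to assembly by looping over udiv and msub and storing each digit on the stack, though that version is left as the challenge it was meant to be.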