PowerPC Tutorial

Chapter 23: Functions

Section 1: Intro

This will be by far the longest Chapter yet of the PPC Tutorial. Take your time, refer back to previous Chapters if necessary. Grab some snacks too.

What is a function? A function is a subroutine of code that accepts input(s), processes the input(s) to perform a task(s), and gives back output(s) depending on how the task(s) went. Of course, this is very general. Some functions don't take an input (the task performed is always the same), and some functions don't provide an output.

To execute a function, you "call" it. This means you have to set up the proper inputs beforehand. Once the inputs have been set, you can call the function. In general, functions are very large and contain complex tasks that should not be handwritten. The C library has many built-in functions such as printf (print text to console/terminal). Recall back in Chapter 9 that you can assemble a Source to include C/C++ functions that you can use in your Program.

You can of course write your own functions for custom tasks, but when I speak of functions for this Chapter, it's referring to the very complex ones that should *not* be handwritten.

Below is a small complete Program that simply prints "Hello World!" to the terminal and then exits. We will use a lot of what we've learned in the past Chapters. If you've done C/C++ tutorials, some of this may be familiar territory.

.section .text
    .globl main

main:
#Prologue
    stwu sp, -0x0010 (sp)
    mflr r0
    stw r0, 0x0014 (sp)
    
#Print the String
    lis r3, string_ptr@ha
    la r3, string_ptr@l (r3)
    bl printf

#Return back 0 (success, exit program)
    li r3, 0
    
#Epilogue
    lwz r0, 0x0014 (sp)
    mtlr r0
    addi sp, sp, 0x0010
    blr

#Directive for our string
string_ptr:
    .asciz "Hello World!\n"

In Chapter 9, we briefly discussed the "main" function in a demo file called "example2.S". That file (note the uppercase S extension) was assembled via being compiled with the PPC C Compiler, so that the C library functions were included. When this is done, your Program starts in a function called "main". Let's go thru each portion of the above Program in a bit more detail.

.section .text
    .globl main

main:
In simple terms these are Assembler Directives that tell the Assembler/Compiler that the contents below are instructions, and that the function below is the "main" function (start of our Program).

#Prologue
    stwu sp, -0x0010 (sp)
    mflr r0
    stw r0, 0x0014 (sp)
Most functions begin with a prologue, we'll go over more of this in Section 2.

#Print the String
    lis r3, string_ptr@ha
    la r3, string_ptr@l (r3)
    bl printf
This portion of the Program sets up what is called an Argument and then executes a C function known as printf. Section 4 of this Chapter will go into further detail.

#Return back 0 (success, exit program)
    li r3, 0
This has to do with Function Return Values which will be covered in Section 5. By convention, the "main" function ends with r3 being 0 for success, or a non-zero value (such as 1) for failure.

#Epilogue
    lwz r0, 0x0014 (sp)
    mtlr r0
    addi sp, sp, 0x0010
    blr
If a function has a Prologue, it needs an Epilogue.

#Directive for our string
string_ptr:
    .asciz "Hello World!\n"

You have already learned about Assembler Directives back in Chapter 18. This simply creates a String that our Program can use at any time.

Question: Vega, why does the String end in "\n"?

Answer: The "\n" means new-line. If we print a string, and then print another string afterwards, the two strings will be touching on our console/terminal. The new-line does one "keyboard enter". Also, very important, when using QEMU+GDB with the printf function, if the new-line character is not used, the console's contents won't immediately update when debugging.

Let's go ahead and assemble our program. We will use the PowerPC C Compiler since we will be using C functions. Copy paste the Program above and call it Hello_World.S. Remember that the file extension is a capital "S", just like in Chapter 9. Building with the C Compiler is what links in the C functions. Without it, the linker will have no idea what the hell "printf" even is. Open a terminal in the directory where Hello_World.S is located, and run this command..

powerpc-linux-gnu-gcc -mregnames -ggdb3 -o Hello_World -static Hello_World.S

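Let's break down each element of the command..

powerpc-linux-gnu-gcc = the PowerPC C Compiler
-mregnames = allows register names (r3, sp, etc) to be used in the Source
-ggdb3 = includes debugging information for GDB
-o Hello_World = names the output file Hello_World
-static = links the C library into the Program itself, so no PowerPC shared libraries are needed when it runs
Hello_World.S = the Source file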

We don't need to bring up GDB as I just want you to simply run the program. Run the next two commands...

chmod +x ./Hello_World
./Hello_World

The PowerPC program will run on your non-PPC Linux OS because you've installed QEMU-Static back in Chapter 2. This allows you to run programs that are not of your Computer's Architecture.

You should see the terminal print the string and immediately exit...

As an fyi: The chmod +x terminal command marks the file as an Executable File (in regards to Linux OS permissions). If the file lacks that permission and you try to run the Program, your Linux OS will complain that it's not executable.

We will revisit this little Program later, let's start learning about Functions..


Section 2: Prologues & Epilogues

A majority of functions start with code that is known as a Prologue. A Prologue is simply a set of instructions that will modify the SP (r1) register for the purpose of backing up some registers, and/or allocating some Memory for usage.

Template of a Prologue:

stwu sp, -0xXXXX (sp) #The first instruction of almost every prologue is a store-word-update instruction where the SP register is both the source register and the register that gets updated.
mflr r0 #A prologue needs to save the LR whenever the function calls other functions (a bl would overwrite it), it uses r0 as a scratch register for that purpose
stw r0, 0xXXXX (sp) #Now the LR is saved.

A majority of functions end in an Epilogue. An Epilogue is essentially the opposite of the Prologue. It will retrieve back any old data that was previously saved during the Prologue. Any allocated memory will be de-allocated. Also, fp, lr, and sp will be set to their pre-Prologue values.

Template of an Epilogue:

lwz r0, 0xXXXX (sp) #The load of the saved LR will usually be the first load instruction of the epilogue.
mtlr r0 #Place the saved LR back into the Link Register
addi sp, sp, 0xXXXX #Set SP back to the value it had before the function was called
blr #always last instruction of an epilogue

You will learn how to create your own prologues+epilogues in the next Chapter. For now, this information is provided so you know how to spot a function if you are looking thru some code in a Program.


Section 3: Important Branch Instructions

We won't go over all the instructions in detail in this Chapter. I'm simply providing the templates so you know how to spot a Function in a Program. We do need to cover 2 simple instructions. Functions, for the most part, are called via a bl instruction.

Branch & Link:
bl label

This is just like a regular unconditional branch (b label) instruction, except that the address of the instruction directly *underneath* the bl instruction is placed into the Link Register. Let's look back at a small portion of our Hello_World Program.

Example:

    bl printf
#Return back 0 (success, exit program)
    li r3, 0

In the above example, once the bl instruction executes, the Link Register will be written with the address of the li instruction, and the CPU will jump to the location of the "printf" function.

Let's say the "bl printf" instruction resides in memory at address 0x4000A000. When it gets executed, the address value of 0x4000A004 (where the "li r3, 0" instruction resides at) will be placed into LR.

This is important to understand because this mechanism is how functions can return back to wherever they were called from. As you saw earlier in our Hello_World, the Epilogue always ends in the blr (branch to link register) instruction.

Branch to Link Register:

blr #Branch to Memory Address in the LR

The blr instruction is just an unconditional branch, but it branches to the Memory Address in the Link Register.

So very broadly, a program calls a function like this...

  1. Setup Input Values
  2. bl function_location
  3. Prologue of Function is executed
  4. Function executes task(s)
  5. Epilogue of Function is executed
  6. blr (last instruction of Epilogue)
  7. Execution of CPU returns back to where it left off
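
To make the steps above a bit more concrete, here is a minimal sketch of a Program calling its own tiny function. The function name add_two and the input values are made up for this illustration. Because add_two calls nothing else and needs no Memory of its own, it gets away with having no Prologue or Epilogue at all, so steps 3 and 5 are simply absent here.

```asm
.section .text
    .globl main

main:
#Prologue
    stwu sp, -0x0010 (sp)
    mflr r0
    stw r0, 0x0014 (sp)

#Step 1, set up the input values
    li r3, 2
    li r4, 3

#Step 2, call the function (address of the li instruction below goes into LR)
    bl add_two

#Step 7, execution resumes here. r3 holds 5 at this point, then main overwrites it with 0
    li r3, 0

#Epilogue
    lwz r0, 0x0014 (sp)
    mtlr r0
    addi sp, sp, 0x0010
    blr

#Hypothetical function, adds its two inputs together
add_two:
    add r3, r3, r4    #Step 4, the task itself
    blr               #Step 6, branch to the Address that bl placed in LR
```

It can be built the same way as Hello_World.S. Nothing is printed, so stepping thru it in GDB is the way to watch LR and r3 change.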

Section 4: Function Arguments

We've talked about how most functions need input value(s) beforehand. We call those Arguments.

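When a function requires Arguments, certain GPRs (and/or FPRs) must be filled with certain values (exact values depend on the function and what task is being done with said function). r3 is used for the 1st argument, r4 is used for the 2nd arg, etc etc..... View list below.

GPR args: r3 (1st), r4 (2nd), r5 (3rd), r6 (4th), r7 (5th), r8 (6th), r9 (7th), r10 (8th)
Float args: f1 (1st), f2 (2nd), f3 (3rd), and so on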

Of course, the above list is entirely function dependent. If a function requires zero args, then the above is completely irrelevant. If a function requires 3 GPR args, then args go in r3, r4, and r5. If a function requires 1 GPR arg and 2 Float Args, then arguments go in r3, f1, and f2.
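
As a rough sketch of the 3 GPR arg case, the Program below calls the C library function memset, which takes a pointer, a byte value, and a byte count (in that order). The label name buffer and the values used are assumptions made up for this example. The buffer sits in the .data section because memset writes to it.

```asm
.section .text
    .globl main

main:
#Prologue
    stwu sp, -0x0010 (sp)
    mflr r0
    stw r0, 0x0014 (sp)

#Call memset with three GPR Args
    lis r3, buffer@ha
    la r3, buffer@l (r3)    #Arg 1 (r3) = pointer to the buffer
    li r4, 0x2A             #Arg 2 (r4) = byte value to write (ASCII asterisk)
    li r5, 5                #Arg 3 (r5) = amount of bytes to write
    bl memset

#Print the buffer, r3 is loaded again rather than trusting what memset left in it
    lis r3, buffer@ha
    la r3, buffer@l (r3)
    bl printf

#Return back 0
    li r3, 0

#Epilogue
    lwz r0, 0x0014 (sp)
    mtlr r0
    addi sp, sp, 0x0010
    blr

#Writable buffer, five dashes followed by a new-line
.section .data
buffer:
    .asciz "-----\n"
```

If everything goes as expected, the five dashes should be replaced by five asterisks before the string is printed.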

Let's once again refer back to our Hello_World program. The printf function requires 1 (or more) Arguments. For a basic string (i.e. "Hello World!"), only 1 Argument is required. That is the Pointer to the string.

#Print the String
    lis r3, string_ptr@ha
    la r3, string_ptr@l (r3)
    bl printf
    ...
    ...
    ...
#Directive for our string
string_ptr:
    .asciz "Hello World!\n"

As you can see, since there was only 1 argument, it goes into r3. printf *can* have more than 1 argument. These additional arguments are notated within the string via what is called a Format Specifier.

Format specifier follows this structure~
%[flags][width][.precision][length]specifier

This page HERE does a great breakdown of printf and format specifiers. View the specifier chart on the site. You can see how these format specifiers can specify something such as an integer value. We can make the printf function display desired integer values by...

  1. Having the format specifier(s) in the string
  2. Supplying the value(s) for the specifier(s) via Register Arguments

Let's make our Program a bit more advanced. We'll add some format specifiers to our String.

.section .text
    .globl main

main:
#Prologue
    stwu sp, -0x0010 (sp)
    mflr r0
    stw r0, 0x0014 (sp)
    
#Set up all Args, then call Printf
    lis r3, string_ptr@ha
    la r3, string_ptr@l (r3)
    li r4, 100 #Format specifier 1's value
    li r5, 5 #Format specifier 2's value
    bl printf

#Return back 0 (success, exit program)
    li r3, 0
    
#Epilogue
    lwz r0, 0x0014 (sp)
    mtlr r0
    addi sp, sp, 0x0010
    blr

#Directive for our string
string_ptr:
    .asciz "Hello, I am %d years old. I own %d cars.\n"

Save the above as Hello2.S. Assemble the program and run it.

powerpc-linux-gnu-gcc -mregnames -ggdb3 -o Hello2 -static Hello2.S
chmod +x ./Hello2
./Hello2

You should see the following...

In the Program's Source, we can see that the string in the example contains 2 format specifiers. Therefore it has 3 total arguments. r3 contains the memory address (pointer) of the string. r4 contains the value we will use for the 1st specifier (1st %d). r5 contains the value we will use for the 2nd specifier (2nd %d).

The use of specifiers allows us to change what values we apply to the string without having a unique string for every unique combination of values. Anyway, when the string does print to the console, it will be as such (followed by a new line)...

Hello, I am 100 years old. I own 5 cars.
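
Format specifiers are not limited to %d. Here is a small sketch that mixes a string specifier (%s) with a decimal one (%d) and an uppercase hex one (%X). The label name_ptr and the values are made up for this example. Note that the arg for %s is a Pointer to another string, not the text itself.

```asm
.section .text
    .globl main

main:
#Prologue
    stwu sp, -0x0010 (sp)
    mflr r0
    stw r0, 0x0014 (sp)

#Set up all 4 Args, then call printf
    lis r3, string_ptr@ha
    la r3, string_ptr@l (r3)    #Arg 1 (r3) = pointer to the string with the specifiers
    lis r4, name_ptr@ha
    la r4, name_ptr@l (r4)      #Arg 2 (r4) = pointer to the string used by %s
    li r5, 255                  #Arg 3 (r5) = integer used by %d
    li r6, 255                  #Arg 4 (r6) = integer used by %X
    bl printf

#Return back 0
    li r3, 0

#Epilogue
    lwz r0, 0x0014 (sp)
    mtlr r0
    addi sp, sp, 0x0010
    blr

#Directives for our strings
string_ptr:
    .asciz "%s owns %d cars (0x%X in hex).\n"
name_ptr:
    .asciz "Vega"
```

The same value is passed in r5 and r6 on purpose, so the printed line shows it once in decimal and once in hex.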

Let's look at some pics for you to get a better idea. Here's a pic of the above right before the bl instruction is going to be executed.

We can see in the above pic that r3 (designated by green arrow) contains the Memory Address that points to our String. r4 contains the integer value used for the first format specifier (designated by red arrow). r5 contains the integer value used for the second format specifier (designated by blue arrow). We see the string is outlined in magenta, I drew the outline to include the null byte after 0xA (0x0A is the ASCII byte to enter into a new line below). The 0xA (new line) byte is included because if you don't, and you print another string to the console/terminal, the strings will "touch" together.

We see the hex contents of....

48 65 6C 6C 6F 2C 20 49 20 61 6D 20 25 64 20 79 65 61 72 73 20 6F 6C 64 2E 20 49 20 6F 77 6E 20 25 64 20 63 61 72 73 2E 0A

If you take the above hex values and plug them into a Hex to ASCII converter, it will display our String.

We can issue a step (withOUT the i) command to step the entire printf function in one go. View pic below (step has NOT been entered yet)...

Now press enter to execute the step...

Once you do that, you should see the string print on the qemu terminal...

Going back to GDB let's see what has happened...

We see r3 (outlined in green) contains the byte length of the produced string (excluding the required ending null byte). This is printf's return value. Let's now dive into Return Values....


Section 5: Return Values

Most functions will return value(s) to let you know if the task was successful, denied, failed, etc. Return values are almost always returned in r3 and/or f1. If necessary, though very rare, more return values are placed into r4 and/or f2. Usually a function will write the return values to the appropriate registers within the Epilogue or right before the Epilogue.

The printf function is not an ideal function to showcase return values, as normal programs rarely check printf's return value since printf rarely fails. A better function to demo this would be the fopen function. This is also a C/C++ library function. It needs to be called before you read/write to any files on an operating system (i.e. a txt file).

Here's a basic program that opens a file called "myfile.txt", and then closes it. If the file exists, the string "File was opened and closed." will be printed. If the file doesn't exist or an error occurred, the string "Error!" will be printed instead.

.section .text
    .globl main

main:
#Prologue
    stwu sp, -0x0010 (sp)
    mflr r0
    stw r0, 0x0014 (sp)
    
#Open the file
    lis r3, filename_ptr@ha
    la r3, filename_ptr@l (r3)
    lis r4, perms_ptr@ha
    la r4, perms_ptr@l (r4)
    bl fopen
    cmpwi r3, 0
    beq- print_error

#Close the file
    bl fclose
    
#Print the Good String
    lis r3, success_string@ha
    la r3, success_string@l (r3)
call_printf:
    bl printf

#Return back 0 (program ran as it was expected to, return r3 as 0)
    li r3, 0
    
#Epilogue
    lwz r0, 0x0014 (sp)
    mtlr r0
    addi sp, sp, 0x0010
    blr
    
#Portion of Program to print Bad String
print_error:
    lis r3, error_string@ha
    la r3, error_string@l (r3)
    b call_printf

#Directives
filename_ptr:
    .asciz "myfile.txt"
perms_ptr:
    .asciz "rb"
success_string:
    .asciz "File was opened and closed.\n"
error_string:
    .asciz "Error!\n"

Save the above as filetest.S. We will run the Program soon, but let's go over a few things regarding File Management.

fopen requires 2 args:

  r3 = Pointer to the Filepath string
  r4 = Pointer to the Permissions string

We see below the portion of our Program that sets the two Args and calls fopen

#Open the file
    lis r3, filename_ptr@ha
    la r3, filename_ptr@l (r3)
    lis r4, perms_ptr@ha
    la r4, perms_ptr@l (r4)
    bl fopen
    ...
    ...
    ...
    
#Directives
filename_ptr:
    .asciz "myfile.txt"
perms_ptr:
    .asciz "rb"
    ...
    ...

Filepath is a string. For example if you are on a Linux OS and you have a file called "example.txt", and it's located at a directory called "/home/Bob/Downloads", then its filepath is "/home/Bob/Downloads/example.txt". You would need this string somewhere in your Program. However, a filepath string can be shortened when referencing the filepath from the Program's Location. If example.txt resides in the same directory as the Program, then the filepath can be just "example.txt" or "./example.txt". Those who are familiar with Linux directories will know that the "." means "here".

Permissions is also a string. To keep it simple, the string of "wb" is for write permissions, and the string of "rb" is for read permissions. You would need these strings somewhere in your Program. Our Program uses "rb" because fopen with "rb" fails if the file doesn't exist, which is exactly what our open-then-close test relies on. Be aware that "wb" creates the file if it's missing, and empties the file if it already exists.

Now that you understand how to call fopen, let's go over its return values. If fopen succeeds, it gives back a Pointer. This Pointer is known as a File Stream Pointer. To keep it short, this Pointer is needed for other File-Based functions (i.e. fread, fwrite, fclose). If fopen fails, a null pointer (0x00000000) is given back. Let's review the portion of the Program that checks the r3 return value.

    bl fopen
    cmpwi r3, 0
    beq- print_error

We see that the cmpwi instruction is used to check for Null Pointer. If r3 = 0, then execution of the CPU branches to the part of the Program that will print the "Error!" string.

Unlike fopen, other functions may return 0 for success, and return a negative value for failure. One way to check those types of return values is like this...

cmpwi r3, 0
blt- some_error
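
For a complete example of that kind of check, here is a sketch built around the C library function remove, which deletes a file. It takes 1 arg (Pointer to the Filepath string) and gives back 0 for success or a negative value for failure. The filename "scratch.txt" is made up for this example. Be careful, the Program really does delete that file if it exists in the Program's directory.

```asm
.section .text
    .globl main

main:
#Prologue
    stwu sp, -0x0010 (sp)
    mflr r0
    stw r0, 0x0014 (sp)

#Delete the file, then check for a negative return value
    lis r3, filename_ptr@ha
    la r3, filename_ptr@l (r3)
    bl remove
    cmpwi r3, 0
    blt- print_error

#Print the Good String
    lis r3, success_string@ha
    la r3, success_string@l (r3)
call_printf:
    bl printf

#Return back 0
    li r3, 0

#Epilogue
    lwz r0, 0x0014 (sp)
    mtlr r0
    addi sp, sp, 0x0010
    blr

#Portion of Program to print Bad String
print_error:
    lis r3, error_string@ha
    la r3, error_string@l (r3)
    b call_printf

#Directives
filename_ptr:
    .asciz "scratch.txt"
success_string:
    .asciz "File was deleted.\n"
error_string:
    .asciz "Error!\n"
```

The layout deliberately mirrors filetest.S, only the function being called and the branch condition (blt instead of beq) are different.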

Let's review one more part of the Program before you run it..

#Close the file
    bl fclose

Whenever you open a file, it must be closed before the Program exits/ends. fclose requires 1 arg and that is the File Stream Pointer that was discussed earlier. You may have noticed that there are zero instructions that set up the r3 arg for the call of fclose in our Program. This is because right after fopen returns, the File Stream Pointer (if given back) is already sitting in r3. Since fclose is the very next function to call, our r3 arg is already set, and we can just call fclose.

fclose returns back 0 for success, or a negative value for failure. Most programs don't check fclose's return value. This is because fclose rarely fails. If the file was opened successfully, it will almost always close successfully. The main exception to this is for handling files with "syncing". This means you have a Program where the edits made to files must be updated immediately. Syncing vastly degrades performance.

Enough blabber! Run the program!

powerpc-linux-gnu-gcc -mregnames -ggdb3 -o filetest -static filetest.S
chmod +x ./filetest
./filetest

You will receive the Error string because myfile.txt doesn't exist. Go ahead and create a new blank myfile.txt. Be sure that it resides in the same directory as the Program. Also, it needs to have User Permissions (Linux perms). Update the perms (if necessary) and re-run the Program.

chmod a+rwx ./myfile.txt
./filetest

You should see the "File was opened and closed" String.


Section 6: Conditional Function Calling

Most functions are called by the bl instruction. However, you are able to call functions conditionally. The "link" feature can be implemented with conditional branches...

Examples~

beql somefunc #Branch to somefunc if condition of "is equal" was met. Address of instruction directly below beql is placed into LR
bltl somefunc #Branch to somefunc if condition of "is less than" was met. Address of instruction directly below bltl is placed into LR
blel somefunc #Branch to somefunc if condition of "is less than or equal" was met. Address of instruction directly below blel is placed into LR

Any conditional branch instruction can be supplemented with the "link" feature. Since these are conditional branches, branch hints (+/-) can be applied.
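
Here is a small sketch of a conditional call in a complete Program. The function name report_failure and the pretend return value of -1 are made up for this example. Since report_failure calls printf itself, it needs its own Prologue and Epilogue, which simply reuse the template from Section 2.

```asm
.section .text
    .globl main

main:
#Prologue
    stwu sp, -0x0010 (sp)
    mflr r0
    stw r0, 0x0014 (sp)

#Pretend an earlier function gave back a negative (failure) value
    li r3, -1

#Call report_failure only if r3 is negative
    cmpwi r3, 0
    bltl- report_failure

#Execution continues here whether or not the call happened
    li r3, 0

#Epilogue
    lwz r0, 0x0014 (sp)
    mtlr r0
    addi sp, sp, 0x0010
    blr

#Hypothetical function, prints a string and returns
report_failure:
    stwu sp, -0x0010 (sp)
    mflr r0
    stw r0, 0x0014 (sp)
    lis r3, failure_string@ha
    la r3, failure_string@l (r3)
    bl printf
    lwz r0, 0x0014 (sp)
    mtlr r0
    addi sp, sp, 0x0010
    blr

failure_string:
    .asciz "Something failed!\n"
```

Changing the li r3, -1 to li r3, 0 should make the Program print nothing, since the bltl is then not taken. As far as I know the LR is still written by a conditional "link" branch even when the branch is not taken, which is one more reason for main to have saved the LR in its Prologue.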

Not only that, some functions may be called via the bctrl and blrl instructions.

Branch to CTR and Link
bctrl #This will do a branch to the value (Address) in the CTR. The address of the instruction directly below the bctrl is then placed into the Link Register.

Branch to LR and Link
blrl #This will do a branch to the Address in the LR. The address of the instruction directly below the blrl is then placed into the Link Register.

The two above instructions may be used because some functions are located too far away in the Program for a bl instruction to be viable (bl's SIMM limit exceeded), or it is faster for a program to load a function address (from some look-up table) based on a result/value instead of sorting thru a series of compare+branches. For example....

cmpwi r3, 1
bne+ check_2
bl func1
b the_end
check_2:
cmpwi r3, 2
bne+ check_3
bl func2
b the_end
check_3:
cmpwi r3, 3
bne+ check_4
bl func3
b the_end
check_4:
cmpwi r3, 4
bne+ check_5
bl func4
b the_end
check_5:
cmpwi r3, 5
bne+ call_6
bl func5
b the_end
call_6:
bl func6
the_end:

That code above could instead be written like this..
func_table:
.long func1
.long func2
.long func3
.long func4
.long func5
.long func6
...
...
...
lis r12, func_table@ha
addi r12, r12, func_table@l #Set base pointer to func_table
subi r3, r3, 1 #Subtract result/value (what determines which func to call) by 1 because 1st func is at offset 0 of the table
slwi r3, r3, 2 #Multiply net result by 4 since each func address is 4 bytes in length
lwzx r12, r12, r3 #Load func address
mtctr r12 #Place func address in CTR
bctrl #Branch to CTR and Link
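
One thing the table version above does not do is check that the value is actually within the table. Below is a sketch of a complete Program that adds such a check. The selector value, the three tiny functions, and their strings are all made up for this example. Each function simply gives back a string Pointer in r3, which main then hands to printf.

```asm
.section .text
    .globl main

main:
#Prologue
    stwu sp, -0x0010 (sp)
    mflr r0
    stw r0, 0x0014 (sp)

    li r3, 2                       #Hypothetical selector value, valid values are 1 thru 3

#Bounds check the selector before it is used as a table index
    subi r3, r3, 1                 #Selector 1 becomes index 0
    cmplwi r3, 3                   #Unsigned compare, so a negative index is rejected too
    bge- the_end

#Load the func address from the table and call it
    lis r12, func_table@ha
    addi r12, r12, func_table@l
    slwi r3, r3, 2                 #Each func address is 4 bytes in length
    lwzx r12, r12, r3
    mtctr r12
    bctrl

#The func gave back a string pointer in r3, print it
    bl printf

the_end:
    li r3, 0

#Epilogue
    lwz r0, 0x0014 (sp)
    mtlr r0
    addi sp, sp, 0x0010
    blr

#Three hypothetical functions, each gives back a pointer to its own string
func1:
    lis r3, string1@ha
    la r3, string1@l (r3)
    blr
func2:
    lis r3, string2@ha
    la r3, string2@l (r3)
    blr
func3:
    lis r3, string3@ha
    la r3, string3@l (r3)
    blr

func_table:
    .long func1
    .long func2
    .long func3

string1:
    .asciz "func1 was called\n"
string2:
    .asciz "func2 was called\n"
string3:
    .asciz "func3 was called\n"
```

With the selector set to 2, the second table entry is loaded, so func2's string is the one that should get printed. Any selector outside of 1 thru 3 skips the call entirely.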

Finally, this is very rare, but functions can be called via the LR/CTR conditionally...

Examples~

beqctrl+
bltlr-
blectr

Section 7: Calling Convention & Terminology

Note: Caller vs Callee will be explained shortly

We need to discuss how all the GPRs and FPRs are categorized within the PowerPC architecture.

r0 is an overall scratch register. You are already familiar with its literal-zero rule in some instructions.

r1 (aka SP) is the stack pointer, you will learn more about this in the next chapter.

r2 (aka rtoc) is the Register Table of Contents. It is used as a pointer to some important/globalized read-only data. Its value is written during the early boot-process of a Program. Once set, it's left unaltered.

r3 thru r10 are known as the Parameter & Result Registers. Obviously, these are the GPRs used for Arguments & Return Values in Functions.

r3 thru r10 are also called the caller saved temporary registers. If there is data in these registers that needs to stay intact while a function is being called, the caller must place the data somewhere safe *BEFORE* calling the function. The values in these registers are not preserved across function calls.

r11 & r12 are intra-procedural scratch registers. Sometimes r12 is used for function linking where a bl instruction for a function won't work because the jump will be too far. Otherwise they are used for scratch during a prologue or epilogue.

r13 (aka got) is the Global Offset Table register. It is used as a pointer to a table of important read/write data. Its value is written during the early boot-process of a Program. Once set, it's left unaltered.

r14 thru r31 are the callee saved registers. These registers get preserved throughout function calls. These registers must be backed-up/saved by the Callee before being used (during the Prologue). The Caller can place data in these registers BEFORE a function is executed. When said function has returned, the data is intact. These registers are also known as the Non-Volatile Registers. Sometimes, though rare, they may be referred to as the Global Variable Registers.

LR is the link register. You already know what this is. It holds the address for the function to know where to return back to once it's completed.

f0 is a scratch register.

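f1 thru f13 are the Parameter & Result Registers for Float specific Args & Return Values. Note that under the 32-bit ABI that powerpc-linux-gnu follows, only f1 thru f8 carry Float Args, and a Float Return Value comes back in f1.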

f14 thru f31 are the Float specific callee saved registers (similar to r14 thru r31 for GPRs).


Caller vs Callee

Caller vs Callee? What do these terms mean?

Let's say you have a program and it calls a function. The "code" or that portion of the program that sets the inputs, calls the function, and checks the outputs, is known as the CALLER.

The CALLEE consists of the instructions within the function itself, which of course includes the prologue and epilogue.

Keep in mind the Callee can become a new Caller. If a function contains a function within itself, this becomes the case. The function that calls the new function is the caller. The new function is the callee.

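The Caller is responsible for...

  1. Setting up the Arguments before the call
  2. Backing up any data in the caller saved registers (i.e. r3 thru r10) that it still needs after the call

The Callee is responsible for...

  1. Backing up the LR and any callee saved registers (r14 thru r31, f14 thru f31) before using them
  2. Placing the Return Value(s) in r3 and/or f1
  3. Restoring everything it backed up, then returning via blr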

An alternative name for Caller is Parent. And for Callee is Child.


We will go over some examples of how the Callee can save registers before it needs to use them for its own function calls (when it becomes a new Caller), but in order to understand that, we need to learn about the Stack first.

