Chapter 6: Instruction Format & Other Basics
Let's go over some bare-bones basics regarding how values and comments are written in Source Files.
Writing Numerical Values in a Source File:
When writing decimal values, there are no special requirements. You can simply write out the number. For negative numbers, simply make sure the decimal value is prefixed with a minus symbol (e.g. -100).
When writing hexadecimal values, you *must* prefix the number with "0x". Example: 0xAC. Negative hex numbers will be covered in the next Chapter because there are technically two different ways to write them.
You can also write values in binary form. To do this, you *must* prefix the number with "0b". Example: 0b0101
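If you want to double-check how the three notations line up, here is a minimal sketch in Python. Python is only being used as a calculator here (it is not Assembler syntax), but it happens to accept the same "0x" and "0b" prefixes. The variable names are made up for this illustration.

```python
# Python is used purely as a calculator here; this is NOT Assembler syntax.
# The variable names are hypothetical and exist only for this illustration.
decimal_value = 172
hex_value = 0xAC           # same number written in hex
binary_value = 0b10101100  # same number written in binary

# All three notations describe the exact same number.
print(decimal_value == hex_value == binary_value)

# Going the other way: see how a decimal value looks in hex or binary.
print(hex(172))
print(bin(5))
```

Whichever notation you type, the value itself is the same. Only the way it is written changes.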
Hex vs Decimal for writing Values:
When writing Instructions in a Source File for the Assembler, some instructions require a numerical value to be written out. If the value is small enough to fit in a single hex byte, then in my opinion it's easier to write it in Decimal form. If you need to write out negative numbers, Decimal form works well too. Outside of those situations, it's easier (visually) to use Hex over Decimal. However, at the end of the day, this is User Preference. Choose what's easier for you.
Writing notes/comments in the Assembler:
There are two different methods to write notes/comments.
Examples:
# This is a comment.
/* This is a comment. */
The hashtag method requires the hashtag symbol (#) on every line where a comment is placed. The 2nd method gives you some flexibility, because a single comment can span more than one line, like this...
/* This is
a comment. */
You can use notes/comments adjacent to instructions or in their own line within a Source File to help explain why certain instructions were utilized or what purpose they serve.
Now let's move onto basic Instruction Format. When writing Instructions in a Source File, certain Format(s) must be followed or else the Assembler will output an error. Let's go over the GPRs.
rX
X = Register's number. For example, GPR 5 is r5. GPR 22 is r22.
In every instruction, there is a Destination Register. In most instructions, the Destination Register is the Register that holds the result of an executed instruction, while the Source Register is the Register that is used to compute the result for the Destination Register. Some instructions will have one source register, while others will have two. Every instruction has only one Destination Register.
For most instructions, there are essentially 4 main formats...
Format 1:
ins rD, rA, rB
ins = Instruction Mnemonic
rD = Destination Register
rA = 1st Source Register
rB = 2nd Source Register
Keep in mind this is not an actual instruction. It is just a very general view of any instruction that uses two source registers to compute a value for the destination register. Now let's look at an example of an instruction with just one source register.
Format 2:
ins rD, rA
ins = Instruction Mnemonic
rD = Destination Register
rA = Source Register
Moving on to another Format that also uses one source register, but comes with something a bit different...
Format 3:
ins rD, rA, VALUE
ins = Instruction Mnemonic
rD = Destination Register
rA = Source Register
VALUE = Immediate Value
Now let's look at an example with zero source registers...
Format 4:
ins rD, VALUE
ins = Instruction Mnemonic
rD = Destination Register
VALUE = Immediate Value
ins, aka the Instruction Mnemonic, is a symbolic code for the instruction's name. In simpler terms, it's an abbreviation or alias. For example, PowerPC has an instruction named Multiply Low Immediate. However, its instruction mnemonic is mulli. The Assembler only understands the mnemonic.
Immediate Value (VALUE) is a 16-bit numerical value that is **not** represented by what's in a Register. Think of it as writing a Value from 'scratch'. The implementation of Immediate Values allows the PowerPC language to have instructions that can provide more flexibility with less register usage.
Before continuing further into Immediate Values, it's vital that you understand Signed vs Unsigned values.
What is Signed & Unsigned?
Signed means negative numbers are possible, while Unsigned means negative numbers are impossible.
The entire number range in a GPR is 0x00000000 thru 0xFFFFFFFF.
Signed Range of Numbers in a GPR:
0x80000000 thru 0xFFFFFFFF = Negative Numbers
0x00000001 thru 0x7FFFFFFF = Positive Numbers (if you don't include zero)
0xFFFFFFFF is -1 in decimal representation.
0xFFFFFFFE = -2
0xFFFFFFFD = -3
etc etc til you reach 0x80000000, which is the 'largest' negative number possible (-2,147,483,648 in decimal).
So a left to right visual would look like this...
0x80000000 --> 0x00000000 --> 0x7FFFFFFF
Unsigned Range of Numbers in a GPR:
0x00000001 thru 0xFFFFFFFF = All Positive Numbers (if you don't include zero)
The above ranges present a problem. How do we know if a value is being used as Signed or as Unsigned? For example, is 0xFFFFFFFF being used as -1 or as 4,294,967,295? For a majority of instructions, there is no Signed vs Unsigned distinction because it makes no difference to the result/output of those instructions. However, there are certain instructions (like Multiply and Divide) for which it does matter, and we will address those Signed vs Unsigned issues in the next Chapter. For now, just understand how a number in a GPR can represent two different values.
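To make the "one number, two meanings" idea concrete, here is a small Python sketch that reads the same 32-bit pattern both ways. Python is only a stand-in calculator here, and the function names as_unsigned32 and as_signed32 are made up for this illustration. They are not part of the Assembler or of any PowerPC tool.

```python
# Hypothetical helper names, for illustration only (not part of any PowerPC tool).

def as_unsigned32(bits):
    """Read a 32-bit pattern as an Unsigned value (0 thru 4,294,967,295)."""
    return bits & 0xFFFFFFFF

def as_signed32(bits):
    """Read the same 32-bit pattern as a Signed value."""
    bits &= 0xFFFFFFFF
    # If the top bit (0x80000000) is set, the Signed reading is negative.
    if bits & 0x80000000:
        return bits - 0x100000000
    return bits

# Print each pattern next to its Unsigned and Signed readings.
for pattern in (0xFFFFFFFF, 0xFFFFFFFE, 0x80000000, 0x7FFFFFFF):
    print(hex(pattern), as_unsigned32(pattern), as_signed32(pattern))
```

Notice that the bits in the register never change. Only the way they are interpreted does.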
Now we need to move onto Signed Vs Unsigned Numbers for Immediate Values. Since Immediate Values are 16-bits in size instead of 32-bits, their range of numbers will differ.
Immediate Value 16-bit Signed Range (known as SIMM):
0xFFFF8000 thru 0xFFFFFFFF = Negative Immediate Values (-32768 thru -1)
0x0001 thru 0x7FFF = Positive Immediate Values; not including zero (1 thru 32767)
Left to Right visual:
0xFFFF8000 --> 0x0000 --> 0x7FFF
Immediate Value 16-Bit Unsigned Range (known as UIMM):
0x0001 thru 0xFFFF = All Positive Immediate Values; not including zero (1 thru 65535)
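You will notice right away that the negative Immediate Values are written with more than 16 bits. The Immediate Value itself is still only 16 bits in size. The upper 16 bits are all 1s because the value is sign-extended (the top bit of the 16-bit value is copied into every bit above it). This is the 'trick' that allows a PPC CPU to have negative 16-bit values displayed correctly inside a 32-bit register. When writing these Immediate Values for the Assembler, you must follow the ranges shown above, or else an assembling error will occur.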
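Here is a minimal Python sketch of how a 16-bit Signed Immediate Value ends up looking inside a 32-bit register. Again, Python is only being used as a calculator, and sign_extend_16_to_32 is a made-up name for this illustration, not something the Assembler provides.

```python
# Hypothetical helper, for illustration only.

def sign_extend_16_to_32(imm16):
    """Show how a 16-bit Signed Immediate Value appears in a 32-bit register."""
    imm16 &= 0xFFFF  # keep only the low 16 bits
    # If the top bit of the 16-bit value (0x8000) is set, the value is
    # negative, so the upper 16 bits of the 32-bit result become all 1s.
    if imm16 & 0x8000:
        return imm16 | 0xFFFF0000
    return imm16

# Print each 16-bit value next to its 32-bit form.
for imm in (0x0001, 0x7FFF, 0x8000, 0xFFFF):
    print(hex(imm), "-->", hex(sign_extend_16_to_32(imm)))
```

The positive values pass through unchanged, while 0x8000 and 0xFFFF pick up the extra FFFF on top.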
Certain instructions use the Signed range while other instructions use the Unsigned range. It all depends on the instruction in question. An instruction cannot allow both ranges. It will be one or the other.
Signed Immediate Values are known as SIMM. Unsigned Immediate Values are known as UIMM. The terms SIMM and UIMM are important, so remember what they mean!
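As a rough sketch of the two ranges in decimal terms, the Python below checks whether a number fits as a SIMM or as a UIMM. The names fits_simm and fits_uimm are made up for this illustration. The Assembler does its own range checking, and this is only a way to picture it.

```python
# Hypothetical helpers, for illustration only. The Assembler performs its own
# range checks; this just mirrors the decimal ranges listed in this chapter.

SIMM_MIN, SIMM_MAX = -32768, 32767  # 16-bit Signed range
UIMM_MIN, UIMM_MAX = 0, 65535       # 16-bit Unsigned range

def fits_simm(value):
    """True if value is within the Signed Immediate (SIMM) range."""
    return SIMM_MIN <= value <= SIMM_MAX

def fits_uimm(value):
    """True if value is within the Unsigned Immediate (UIMM) range."""
    return UIMM_MIN <= value <= UIMM_MAX

# Print each value with its SIMM and UIMM verdicts.
for value in (-32768, -1, 32767, 32768, 65535, 65536):
    print(value, fits_simm(value), fits_uimm(value))
```

A value like 40000 fits as a UIMM but not as a SIMM, and -1 fits as a SIMM but not as a UIMM, which is why you need to know which range an instruction uses.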
Some considerations and final notes:
Alternatively, you can write Immediate Values in Decimal form in your Source File(s). You can also write out negative hex numbers with the minus (-) symbol. For example, you can write out -1 in Hex as -0x1.
FYI: An alternate term for Unsigned is Logical.