Chapter 17: Exception Basics
We call a CPU crash an Exception. However, not all Exceptions are CPU crashes. There are only two ways to change the regular execution flow of the CPU: Branches and Exceptions. With Branches, you can see how the direction of the execution flow will change. The change is predictable. Even if a branch is conditional, we know it will take one route or the other.
With Exceptions, the execution flow/path of the Program is forcefully changed. Some exceptions may occur "out of nowhere", but that does NOT mean the occurrence was random.
There are two types of Exceptions:
Synchronous Exceptions are exceptions that are directly caused by an instruction. For example, let's say we have a store instruction where the Effective Address is an invalid memory address. Once that store instruction gets executed, a Synchronous exception will occur. Regular execution flow of the CPU is redirected to a specific exception handler depending on the exact cause of the exception.
Asynchronous Exceptions are exceptions that are not directly associated with the current instruction. An example of this is a hardware timer. Such a timer can be set to a value and will decrement while a program is running. Once the timer hits 0, an exception will be taken. Regular execution flow of the CPU is redirected to a specific exception handler depending on the exact cause of the exception.
Exception-specific code is usually referred to as an "exception handler". Here's a basic diagram of the program flow when an exception occurs.
However! This diagram is misleading. It leads you to believe that once the code at the exception handler completes, the execution flow of the CPU immediately goes back to where it left off in the main program. This is usually not the case. What occurs is that the exception handler completes, then the CPU's execution flow routes back to the main program, but to a dedicated large section of code that is designed to execute after an exception handler. Once all of this related code has completed, the execution flow finally goes back to where it stopped when the exception initially occurred. Confusing, but just wanted to clear the air on the diagram.
Whenever an exception occurs on any PowerPC chip, its handler is executed in Real Mode. Real Mode simply means that addresses are treated as physical memory addresses. Almost all programs execute their code in Virtual Memory. What is Virtual Memory? Essentially, it is a large list of memory addresses that are used to represent physical addresses.
For example, let's look at a memory address of the Nintendo Wii.
0x00001500 = This is a physical address for the Nintendo Wii
0x80001500 = This is a virtual address, it represents 0x00001500.
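For readers who like to see the arithmetic, here is a minimal Python sketch of the relationship between the two addresses above. It only models the simple mapping from this one example (the two addresses differ in the top bit), not real address translation, and the helper names `virtual_to_physical` and `physical_to_virtual` are made up for illustration.

```python
# Toy model of the Wii example above, NOT real address translation.
# Assumption: the virtual address differs from the physical one only in
# bit 0 (PowerPC numbering, i.e. the most significant bit, 0x80000000).

TOP_BIT = 0x80000000  # bit 0 in PowerPC's big-endian bit numbering


def virtual_to_physical(virtual_address: int) -> int:
    """Clear bit 0 to get the physical address (hypothetical helper)."""
    return virtual_address & ~TOP_BIT & 0xFFFFFFFF


def physical_to_virtual(physical_address: int) -> int:
    """Set bit 0 to get the virtual address (hypothetical helper)."""
    return (physical_address | TOP_BIT) & 0xFFFFFFFF


print(hex(virtual_to_physical(0x80001500)))  # 0x1500
print(hex(physical_to_virtual(0x00001500)))  # 0x80001500
```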
As you can see, a simple bit flip of bit 0 determines whether the address is virtual or physical. We will dive more into address translation in Chapter 31. Just understand that....
The start address for every individual exception handler is universal for every PowerPC chip, though not every PowerPC chip will use all the possible exception handlers (due to hardware design). What's odd is that each exception handler can start from one of two possible physical addresses. First we need to discuss the Machine State Register to understand how this is even possible.
The Machine State Register (MSR) is a Special Purpose Register that holds important configuration of the PowerPC system, such as address translation, floating point enabling, external interrupt enabling, etc. Each configuration is toggled via a bit. For example, bit 13 is what sets the Power Management mode. Here's a diagram of the MSR with bit abbreviations.
IMAGE
Now we could go over every bit and what it affects, but for this chapter, let's just focus on a few.
Bit 16: EE
This is the External Interrupt Enable bit. Get familiar with this! It allows you to enable or disable the External Interrupt Exception Handler. This is useful for any code that is time dependent.
Here's an example code to disable External interrupts--
mfmsr r0 #Copy the MSR to r0
rlwinm r0, r0, 0, 17, 15 #Set bit 16 (EE) low. Leave all other bits alone.
mtmsr r0 #Write the new MSR!
The mfmsr instruction is simple. It copies the MSR to a General Purpose Register. The mtmsr is simply the reverse. It copies the contents of a General Purpose Register to the MSR.
Here's an example code to enable External Interrupts--
mfmsr r0 #Copy the MSR to r0
ori r0, r0, 0x8000 #Set bit 16 (EE) high. Leave all other bits alone.
mtmsr r0 #Write the new MSR!
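If the operands of rlwinm and ori in the two snippets above feel like magic numbers, the following Python sketch models what they do to a 32-bit value. It is only a model for checking the masks by hand, not PowerPC code. The helper names (`bit`, `ppc_mask`, `rlwinm`) and the example MSR value are assumptions made up for this illustration.

```python
# Python model of the two snippets above (32-bit MSR, PowerPC bit numbering:
# bit 0 is the most significant bit, bit 31 the least significant).


def bit(n: int) -> int:
    """Value of a single bit n in PowerPC numbering."""
    return 1 << (31 - n)


def ppc_mask(mb: int, me: int) -> int:
    """Mask with ones from bit mb through bit me, wrapping around if mb > me."""
    if mb <= me:
        selected = list(range(mb, me + 1))
    else:
        selected = list(range(0, me + 1)) + list(range(mb, 32))
    mask = 0
    for n in selected:
        mask |= bit(n)
    return mask


def rlwinm(value: int, sh: int, mb: int, me: int) -> int:
    """Rotate left by sh, then AND with the mask, like the rlwinm instruction."""
    value &= 0xFFFFFFFF
    rotated = ((value << sh) | (value >> (32 - sh))) & 0xFFFFFFFF
    return rotated & ppc_mask(mb, me)


EE = bit(16)  # 0x00008000

msr = 0x0000B032                   # made-up example MSR value with EE set
disabled = rlwinm(msr, 0, 17, 15)  # same operands as the disable snippet
enabled = disabled | 0x8000        # same immediate as the ori in the enable snippet

print(hex(ppc_mask(17, 15)))  # 0xffff7fff
print(hex(disabled))          # 0x3032
print(hex(enabled))           # 0xb032
```

The mask 17 thru 15 wraps around, so it covers every bit except bit 16. That is why the rlwinm clears EE and leaves everything else alone.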
Bit 19: ME
This is the Machine Check Enable bit. It simply enables or disables the Machine Check Exception Handler.
Bit 18: FP
This is the Floating Point enable bit. If this bit is low, and a Floating Point instruction gets executed, then the Floating Point Unavailable Exception handler will be executed.
Bit 25: IP
Alright this bit is important. It determines the address of every Exception Handler. When this bit is high every Exception Handler will be located at 0xFFFnnnnn. When this bit is low every Exception Handler will be located at 0x000nnnnn. nnnnn is the Exception Handler offset. Most programs will have this bit LOW.
Assuming we have a LOW IP bit in the MSR, here's a list of all the PowerPC Exception handlers.
0x00000100 System Reset; Asynchronous
For hard or soft resets of the CPU(s).

0x00000200 Machine Check; Asynchronous
Can be specifically disabled. To keep it simple for now, it's basically an "oh shit" exception.

0x00000300 Data Storage Interrupt (DSI); Synchronous
Mostly occurs due to storing or loading via an invalid Memory Address.

0x00000400 Instruction Storage Interrupt (ISI); Synchronous
Basically, an instruction was executed and said instruction was residing in protected/privileged/non-existent Memory.

0x00000500 External Interrupt; Asynchronous
A non-PPC device requested the PPC to take an exception (external interrupt).

0x00000600 Alignment; Synchronous
A store/load instruction wasn't storing/loading via an aligned Address when required to do so.

0x00000700 Program; Synchronous
Usually this is due to an invalid/non-existent instruction being executed.

0x00000800 Floating Point Unavailable; Synchronous
Floating Point instructions were disabled and one got executed.

0x00000900 Decrementer; Asynchronous
Most PPC chips have a built-in timer called the Decrementer. It is a Special Purpose Register that is always decrementing. Once it goes below 0, this exception handler gets executed.

0x00000C00 System Call; Synchronous
Occurs whenever the system call (sc) instruction gets executed.

0x00000D00 Trace; Synchronous
Occurs whenever Tracing is enabled and a trace-enabled instruction gets executed.
There are other PPC Exception Handlers but they are implementation dependent.
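To tie the IP bit and the handler offsets together, here is a small Python sketch that computes where a handler lives for a given MSR value. The offsets come straight from the list above. The names `VECTOR_OFFSETS` and `handler_address` are hypothetical and exist only for this illustration.

```python
# Model of how the MSR's IP bit (bit 25) selects the exception handler base.
# Offsets are the ones listed in this chapter; helper names are made up.

VECTOR_OFFSETS = {
    "System Reset": 0x0100,
    "Machine Check": 0x0200,
    "DSI": 0x0300,
    "ISI": 0x0400,
    "External Interrupt": 0x0500,
    "Alignment": 0x0600,
    "Program": 0x0700,
    "Floating Point Unavailable": 0x0800,
    "Decrementer": 0x0900,
    "System Call": 0x0C00,
    "Trace": 0x0D00,
}

IP = 1 << (31 - 25)  # MSR bit 25 in PowerPC numbering, i.e. 0x00000040


def handler_address(name: str, msr: int) -> int:
    """Return the physical start address of the named exception handler."""
    base = 0xFFF00000 if msr & IP else 0x00000000
    return base + VECTOR_OFFSETS[name]


print(hex(handler_address("System Call", 0)))   # 0xc00
print(hex(handler_address("System Call", IP)))  # 0xfff00c00
print(hex(handler_address("DSI", 0)))           # 0x300
```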
You will need to get familiar with some more SPRs.
srr0 (Save and Restore Register 0)
Generally speaking, when most synchronous exceptions occur, this register will contain the virtual memory address of the instruction that caused the exception.

srr1 (Save and Restore Register 1)
Immediately after an exception occurs, certain bits are copied over from the MSR to srr1, and the other bits are set low. Which specific bits are copied vs set low depends on the Exception. For some exceptions, certain bits in srr1 can tell you a more precise reason why said Exception occurred.

sprg0, sprg1, sprg2, sprg3 (Spare Registers)
These registers are used when the program wants to preserve the state of some registers.

dsisr (Data Storage Interrupt Status Register)
This register tells you a more precise reason why the DSI exception occurred.

dar (Data Address Register)
For DSI and alignment exceptions, the memory address that the offending load/store instruction tried to access is placed into this register.

MSR (Machine State Register)
You already learned about this register.
As a beginner PowerPC Coder, it's very common to write out code that will lead to numerous DSI Exceptions. Let's go over how easy it is for a DSI exception to occur.
Let's revisit the Nintendo Wii again. It has a valid memory range (virtual) of---
0x80000000 thru 0x817FFFFF, and 0x90000000 thru 0x93FFFFFF.
So let's say we want to load a word from Address 0x80001850, and our code to do that is...
lis r0, 0x8000
lwz r12, 0x1850 (r0)
And BOOM! A DSI exception will occur. Why? Well, recall back in Chapter 7 that certain instructions which use r0 as a source register will treat r0 as literal 0. This quirk applies to the lwz instruction.
Thus, the lwz instruction will load the word from 0x00001850, NOT from 0x80001850. Well 0x00001850 is NOT a valid virtual memory address. Therefore, a DSI exception occurs.
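The r0 quirk is easy to check with a toy model. The Python sketch below mimics how lwz forms its Effective Address and tests the result against the Wii ranges quoted earlier. It is a simplified illustration only. The helper names are made up, and the second load (using r12 as the base register) is a hypothetical corrected version, not code from this chapter.

```python
# Toy model of how lwz forms its Effective Address (EA).
# Assumption: only the rule "rA = 0 means literal 0" is modeled here.

VALID_RANGES = [(0x80000000, 0x817FFFFF), (0x90000000, 0x93FFFFFF)]  # from the text


def sign_extend_16(value: int) -> int:
    """Treat the 16-bit displacement as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(gprs: list, ra: int, displacement: int) -> int:
    """EA = (0 if rA is r0, else the contents of rA) + displacement."""
    base = 0 if ra == 0 else gprs[ra]
    return (base + sign_extend_16(displacement)) & 0xFFFFFFFF


def is_valid_virtual(address: int) -> bool:
    """True if the address falls inside one of the ranges listed above."""
    return any(low <= address <= high for low, high in VALID_RANGES)


gprs = [0] * 32
gprs[0] = 0x80000000   # lis r0, 0x8000
gprs[12] = 0x80000000  # what a hypothetical "lis r12, 0x8000" would produce

bad_ea = effective_address(gprs, 0, 0x1850)    # lwz r12, 0x1850 (r0)
good_ea = effective_address(gprs, 12, 0x1850)  # lwz r12, 0x1850 (r12)

print(hex(bad_ea), is_valid_virtual(bad_ea))    # 0x1850 False
print(hex(good_ea), is_valid_virtual(good_ea))  # 0x80001850 True
```

In other words, swapping r0 for any other base register (r12 in this sketch) is enough to make the load land on the intended address.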
What happens when the DSI Exception occurs?
Programs will have code already written to the DSI Handler. Exception code, in general, is written at the "boot stages" of a program and is left unaltered afterwards. If you are writing an ENTIRE program from scratch (which is EXTREMELY rare mind you), you would be responsible for setting up such code.
Now that you have an overview of the Exception Specific Registers, let's go over what occurs when the sc (system call) instruction gets executed. Here's a snippet of code...
mr r8, r11
add r3, r4, r5
sc
lis r12, 0x8000
When the sc instruction gets executed, this occurs....
Now let's say we have a snippet of code at the System Call exception handler. We want it to....
The code would look like this....
#Disable FP instructions in the MSR for once the exception ends
mtsprg0 r0 #Copy r0's original data to Spare Register 0
mfsrr1 r0 #Copy srr1 contents to r0
rlwinm r0, r0, 0, 19, 17 #Set FP bit low
mtsrr1 r0 #Write new contents to srr1
mfsprg0 r0 #Restore r0's original data
rfi #Leave the exception!
We see some instructions that are new to you. Let's cover those plus some more...
Move to Spare Register
mtsprgX rD #This copies rD's value to sprgX. X being anything from 0 thru 3.
Move from Spare Register
mfsprgX rD #This copies sprgX's value to rD. X being anything from 0 thru 3.
Move from srr1
mfsrr1 rD #This copies srr1's value to rD
Move to srr1
mtsrr1 rD #This copies rD's value to srr1
Return from Interrupt
rfi #This instruction is used to end any exception. When this instruction gets executed, srr1's value is copied to the MSR. Then the execution of the program will branch to the address in srr0.
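To see how sc, srr0, srr1 and rfi fit together, here is a deliberately simplified Python model. It is a sketch under loud assumptions: the whole MSR is copied to srr1 (real hardware copies only certain bits), only the EE bit is cleared on entry (real hardware clears more), and the `Cpu` class, the starting address 0x80004000 and the starting MSR value are all invented for the illustration. It relies on the fact that a System Call exception places the address of the instruction after sc into srr0.

```python
# Simplified model of taking a System Call exception and returning with rfi.
# NOT a faithful emulation; see the assumptions in the paragraph above.


def bit(n: int) -> int:
    """Value of a single bit n in PowerPC numbering (bit 0 = MSB)."""
    return 1 << (31 - n)


EE = bit(16)  # External Interrupt Enable
FP = bit(18)  # Floating Point enable
IP = bit(25)  # Exception handler base select
SC_OFFSET = 0x0C00


class Cpu:
    """Hypothetical minimal CPU state: program counter, MSR, srr0, srr1."""

    def __init__(self, pc: int, msr: int):
        self.pc = pc
        self.msr = msr
        self.srr0 = 0
        self.srr1 = 0

    def sc(self) -> None:
        self.srr0 = (self.pc + 4) & 0xFFFFFFFF  # instruction after sc
        self.srr1 = self.msr                    # assumption: full copy
        self.msr &= ~EE & 0xFFFFFFFF            # assumption: only EE cleared
        base = 0xFFF00000 if self.msr & IP else 0x00000000
        self.pc = base + SC_OFFSET              # jump to the handler

    def rfi(self) -> None:
        self.msr = self.srr1                    # srr1 goes back into the MSR
        self.pc = self.srr0 & 0xFFFFFFFC        # resume at the address in srr0


cpu = Cpu(pc=0x80004000, msr=EE | FP)  # made-up address of the sc instruction
cpu.sc()
print(hex(cpu.pc))             # 0xc00
cpu.srr1 &= ~FP & 0xFFFFFFFF   # what the handler's rlwinm on srr1 does
cpu.rfi()
print(hex(cpu.pc), hex(cpu.msr))  # 0x80004004 0x8000
```

Note that the handler edits srr1, not the MSR itself. The change only reaches the MSR when rfi copies srr1 back.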
Now in our snippet of code for the System Call exception handler, you may have already noticed that there are zero instructions which modify srr0. Why is this? This is because for System Call exceptions the address of the instruction BELOW sc gets placed into srr0 by default. If we simply want to return from the exception to where we left off, we can leave srr0 unaltered. However, for cases where you need to alter srr0, you have the following two instructions:
Move from srr0
mfsrr0 rD #This copies srr0's value to rD
Move to srr0
mtsrr0 rD #This copies rD's value to srr0
Let's pretend that we want our System Call exception handler code rewritten so that it branches to address 0x80001500 once the exception is over. Our source is now this...
#Disable FP instructions in the MSR for once the exception ends. Jump to 0x80001500 afterwards.
mtsprg0 r0 #Copy r0's original data to Spare Register 0
mfsrr1 r0 #Copy srr1 contents to r0
rlwinm r0, r0, 0, 19, 17 #Set FP bit low
mtsrr1 r0 #Write new contents to srr1
lis r0, 0x8000
ori r0, r0, 0x1500
mtsrr0 r0 #Write 0x80001500 to srr0
mfsprg0 r0 #Restore r0's original data. This must come after the last use of r0 as scratch.
rfi #Leave the exception! Execution will resume at address 0x80001500.
The above code examples aren't practical at all in the real world, but they should give you a beginner's understanding of the Exception Handlers and Exception specific registers/instructions.
QEMU cannot accurately execute/emulate PowerPC exceptions. In fact, you won't be able to step thru any of the Exception Handlers. If you try, QEMU will skip over them. Let's say you are about to step a sc instruction. Once you do, you will be two instructions below in your program. You won't be at the System Call exception handler.
I could dive into the reasons why, but you wouldn't understand most of it until we cover other aspects of PowerPC. The QEMU issues will be explained further in Chapter 26.