  PowerPC Page Tables Tutorial
Posted by: Vega - 12-01-2022, 08:47 PM - Forum: PowerPC Assembly - No Replies

Works for most 32-bit PPC chips including Broadway

Requirements:


I'm making this tutorial due to a recent uptick in conversation on PPC-related Discord servers about how to set these things up. There appears to be some confusion among some people. This tutorial should clear things up.

Under normal circumstances the BAT registers are enough to map out all the memory you need. However if that is not the case, then you will need to add in a Page Table.



Chapter 1: Fundamentals

What is a Page Table?

A page table is a region of memory made up of blocks called Page Table Entry Groups (PTEGs). Each PTEG contains 8 Page Table Entries (PTEs). Every PTEG address is 64-byte aligned. Each PTE within a PTEG is 64 bits in length (double-word) and contains the information required for a proper address translation (such as the upper bits of the physical address equivalent and the WIMG bits).

PTE bit breakdown~
Upper 32-bits
  • bit 0 = V aka Valid bit
  • bits 1 thru 24 = VSID
  • bit 25 = H (Hash Value 2 Used)
  • bits 26 thru 31 = API (Bits 4 thru 9 of the Virtual Effective Address)

Lower 32-bits
  • bits 0 thru 19 = RPN aka Real Page Number. It's Upper 20 bits of Physical Address to be used in the Translation
  • bits 20 thru 22 = Reserved (Must be Null)
  • bit 23 = R
  • bit 24 = C
  • bits 25 thru 28 = WIMG (what you would use in BATs)
  • bit 29 = Reserved (Must be Null)
  • bits 30 & 31 = PP (what you would use in BATs)

V (Valid) bit is simple enough to understand. If the PTE is invalid, it won't be used by Broadway. Broadway will try to look for another valid PTE. 

VSID (Virtual Segment ID) is a randomly generated identifier used as an input to calculate what is called Hash Value 1 (more on this in Chapter 6).

H (Hash Value 2) is set when Hash Value 1 could not be used and a 2nd hash (Hash Value 2) had to be computed to place the PTE. More on this in Chapter 6.

R (Referenced) and C (Changed) are bits that get updated by Broadway to keep history information of the PTE. You do not need to worry about how Broadway updates these.

WIMG and PP bits are what they would be in BAT Registers (write-through, cache-inhibit, memory-coherence, guarded)
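
To make the bit breakdown concrete, here is a minimal Python sketch (not part of the original tutorial) that packs the fields above into the two 32-bit words of a PTE. The function name make_pte and its default arguments are made up for illustration. Keep in mind PowerPC counts bits from the most significant end, so "bit 0" is 0x80000000.

```python
def make_pte(vsid, api, rpn, wimg=0b0000, pp=0b10, h=0, r=1, c=1):
    """Pack one PTE into (upper, lower) 32-bit words.

    Hypothetical helper for illustration; field positions follow the
    PTE bit breakdown in the text (PowerPC bit 0 = most significant bit).
    """
    assert 0 <= vsid < (1 << 24) and 0 <= api < (1 << 6)
    assert 0 <= rpn < (1 << 20) and 0 <= wimg < (1 << 4) and 0 <= pp < (1 << 2)
    # Upper word: V (bit 0), VSID (bits 1-24), H (bit 25), API (bits 26-31)
    upper = (1 << 31) | (vsid << 7) | (h << 6) | api
    # Lower word: RPN (bits 0-19), R (bit 23), C (bit 24), WIMG (25-28), PP (30-31)
    lower = (rpn << 12) | (r << 8) | (c << 7) | (wimg << 3) | pp
    return upper, lower


upper, lower = make_pte(vsid=0xCA701C, api=0, rpn=0x00001)
print(hex(upper), hex(lower))
```

The example call uses the VSID from Chapter 8's Gecko Code and a page whose physical address is 0x00001000, with R and C high, WIMG of 0000, and PP of 10.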

--------

Special purpose registers known as Segment Registers contain the VSIDs. Permission related bits are also present and will override the PP bits in a PTE. There are 16 segment registers (sr0 thru sr15).

SR bit breakdown~
  • bit 0 = Must be 0 or else the SR will be used for an I/O device
  • bit 1 = Supervisor protection bit (supervisor cannot read/write memory)
  • bit 2 = User protection bit (user cannot read/write memory)
  • bit 3 = No execute bit (instructions in memory will not be executed)
  • bits 4 thru 7 = Reserved (Must be Null)
  • bits 8 thru 31 = VSID

If multiple SR's are to be used, then each SR must have a unique randomly generated VSID. You can have software generate these from calling some rand function, or have them predefined (generated by a third party) in a source.
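
As a rough illustration of the SR layout, the Python sketch below builds a Segment Register value from a VSID plus the three protection bits. The helper name make_sr and its parameter names are assumptions made for this example, not anything defined by Broadway.

```python
def make_sr(vsid, supervisor_protect=0, user_protect=0, no_execute=0):
    """Build a 32-bit Segment Register value (hypothetical helper).

    Bit 0 stays 0 (ordinary memory segment), bits 1-3 are the protection
    bits from the breakdown above, bits 4-7 stay 0, bits 8-31 hold the VSID.
    """
    assert 0 <= vsid < (1 << 24), "VSID is 24 bits wide"
    return (supervisor_protect << 30) | (user_protect << 29) | (no_execute << 28) | vsid


# All protection bits low: the SR is simply the VSID
print(hex(make_sr(0xCA701C)))
```

With every protection bit low the SR value is just the VSID, which is why the assembly examples later on can load a VSID straight into an SR.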

To break it down very generally, address translation occurs as such....

  1. Portions of the Effective (Virtual) Address are broken apart
  2. Based on the value of the certain portions, a select SR will be used.
  3. The SR is then broken into portions
  4. SR Portions are used with what's in SDR1 Register to generate some hashes
  5. Special hashes will calculate the PTEG Physical address (where to navigate within the Page Table)
  6. Each PTEG has 8 PTEs. They are all checked to see if a Valid one exists
  7. Valid PTE contains the information on how to translate plus the memory properties (WIMG + PP bits)
  8. Translation (physical address that will be used) is performed by taking the RPN bits in the found PTE and concatenating that with the lower 12 bits of the original Effective Address
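
Steps 1, 2 and 8 of the list above can be sketched in a few lines of Python. This is only an illustration; the function names split_ea and physical_address are made up, and the hashing in between (steps 3 thru 7) is covered in Chapter 6.

```python
def split_ea(ea):
    """Break a 32-bit Effective Address into the pieces used for translation."""
    sr_index = (ea >> 28) & 0xF        # EA bits 0-3: which Segment Register is used
    page_index = (ea >> 12) & 0xFFFF   # EA bits 4-19: input to the hash
    api = (ea >> 22) & 0x3F            # EA bits 4-9: stored in the PTE as the API
    offset = ea & 0xFFF                # EA bits 20-31: byte offset within the 4KB page
    return sr_index, page_index, api, offset


def physical_address(rpn, ea):
    """Step 8: concatenate the PTE's RPN with the lower 12 bits of the EA."""
    return (rpn << 12) | (ea & 0xFFF)


print(split_ea(0xA0001500))
print(hex(physical_address(0x00001, 0xA0001500)))
```

For 0xA0001500 the selected SR is sr10, and a PTE with an RPN of 0x00001 translates it to physical 0x00001500.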

Here is a chart of minimum recommended attributes for all allowed Page Table sizes~

[Image: pagetable01.png]

Page Tables cannot cover less than 8MB or more than 4Gbyte of memory.

Each PTE covers translation for 4 KB of memory. A chunk of memory that is 16Mbytes in size would require 4,096 PTEs. Since each PTE is 8 bytes in size, that would mean 32Kbytes of memory are required for the PTEs alone. However due to collision possibilities, you would need at least 4 times this amount. In conclusion, to cover 16Mbytes of total memory, you would need to allocate 128Kbytes of memory for the page table.
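
The same arithmetic as a tiny Python sketch, in case you want to plug in your own covered size. The constant and function names are just for this example; the factor of 4 is the collision allowance mentioned above.

```python
PAGE_SIZE = 4 * 1024   # each PTE maps one 4KB page
PTE_SIZE = 8           # each PTE is 8 bytes
HEADROOM = 4           # "at least 4 times" allowance for collisions


def page_table_bytes(covered_bytes):
    """Minimum page table allocation for a given amount of covered memory."""
    pte_count = covered_bytes // PAGE_SIZE
    return pte_count * PTE_SIZE * HEADROOM


print(page_table_bytes(16 * 1024 * 1024) // 1024, "KB")  # 16MB covered
print(page_table_bytes(8 * 1024 * 1024) // 1024, "KB")   # 8MB covered
```

16MB of covered memory works out to 128KB, and 8MB works out to 64KB, which is the table size used by the Gecko Code in Chapter 8.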

SDR1 is a special purpose register that contains the very start address of the entire Page Table and the input values for the special hashes that are used to calculate the PTEG address.

SDR1 bit breakdown~
  • Bits 0 thru 15 = HTABORG
  • Bits 16 thru 22 = Reserved (must remain null)
  • Bits 23 thru 31 = HTABMASK

HTABORG is the upper 16 bits of the physical address where the very start of the Page Table resides. In the above chart, the x's are don't care bit values, meaning they have no restrictions on what they can be when setting the physical start address. The larger the covered Memory Size, the more right-justified zero bits are required in the physical start address.

Bits 7 thru 15 within HTABORG are known as the "Maskable Bits". However many right-justified zero bits were required in HTABORG is the amount of high/one bits that must be set in HTABMASK.
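
Here is a hedged Python sketch of how an SDR1 value could be put together. It assumes the usual rule that a table of 64KB x 2^n bytes needs n zero bits at the bottom of HTABORG and n one bits in HTABMASK; the helper name make_sdr1 is made up for this example.

```python
MIN_TABLE = 64 * 1024  # smallest allowed page table


def make_sdr1(table_phys_addr, table_bytes):
    """Build SDR1 from a physical table address and table size (illustrative only)."""
    is_power_of_two = table_bytes & (table_bytes - 1) == 0
    assert table_bytes >= MIN_TABLE and is_power_of_two, "size must be 64KB * 2^n"
    assert table_phys_addr % table_bytes == 0, "table must be aligned to its own size"
    htabmask = (table_bytes // MIN_TABLE) - 1   # n right-justified one bits
    assert htabmask < (1 << 9), "HTABMASK is only 9 bits wide"
    htaborg = table_phys_addr >> 16             # upper 16 bits of the physical address
    return (htaborg << 16) | htabmask


print(hex(make_sdr1(0x0F980000, 512 * 1024)))
```

A 512KB table at physical 0x0F980000 gives 0x0F980007, which is the SDR1 value used in the hand-generated example in Chapter 6.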

As an fyi, BATs are faster than Page Tables. They also take priority over Page Tables. If a virtual address translation falls under both BAT and Page Table translation, the BAT will be used. This means you can set up two different virtual addresses to translate to the same physical address (i.e. 0x80001500 -> 0x00001500 w/ BAT and also have 0xA0001500 -> 0x00001500 w/ Page Table).



Chapter 2: Allocating Memory

You will need quite a bit of memory for your page table entries, especially if you are planning to cover something such as 1+GB of virtual memory. For Mario Kart Wii, you can use something such as the Egg::Heap::Alloc function to get some memory for this (read note below)...

Example PAL (fill in mem_needed byte value)~

Code:
.set egg_alloc, 0x80229814
lis r12, egg_alloc@h
ori r12, r12, egg_alloc@l
mtlr r12
lis r3, mem_needed@h
ori r3, r3, mem_needed@l
lis r4, mem_needed@h
ori r4, r4, mem_needed@l #This is actually alignment. It needs to match r3
lwz r5, -0x5CA0 (r13) #PAL specific for Egg-Alloc
lwz r5, 0x0024 (r5)
blrl

In the above code, r3 returns pointer to Allocated Heap. Be sure to make this address physical before writing it to the HTABORG bits of SDR1.

IMPORTANT NOTE: The above code may not work for very large memory chunks due to natural function limitations. Function does work when asking for a 0x10000 chunk of memory with 0x10000 (64KB) required alignment



Chapter 3: SDR1 Configuration & TLB Invalidation

IMPORTANT NOTE: Be sure interrupts are masked (off) the entire time you are working on anything Page Table related (SDR1, SR, PTE construction, etc).

Before any page tables can be constructed, the TLB (Translation Lookaside Buffers) must be invalidated. TLBs are buffers in an on-chip unit that keep track of recently used PTEs. You cannot read/write to these directly. The only actions you can take are to invalidate a TLB by its index number, or issue a tlbsync instruction to wait for all/any TLB invalidations to complete.

SDR1 configuration must be done in real mode (reference: PPC PEM Book Page 2-42 footnotes for Table 2-22). Once SDR1 has been configured, you can invalidate the TLBs. There are a total of 64 TLBs. Each TLB is referenced by an index number that is contained in bits 14 thru 19 of the Effective Address used in the Register of the tlbie (TLB invalidate entry) instruction. The first TLB starts at index 0 and ends at index 63. The following snippet of code configures SDR1 and then invalidates TLBs. It assumes you went into Real Mode via rfi with EE, IR, and DR of the MSR set low

Code:
#Setup SDR1
lis r3, sdr1_value@h #Remember that the page table root/start address (HTABORG) needs to be physical
ori r3, r3, sdr1_value@l
sync #Required per page 4-43 table 2-23 of the PPC PEM Book
mtspr 25, r3 #SDR1's SPR number is 25
isync #Required per page 4-43 table 2-23 of the PPC PEM Book

#Invalidate TLBs
li r0, 64
li r3, 0
mtctr r0
inval_tlb:
tlbie r3
addi r3, r3, 0x1000
bdnz+ inval_tlb
tlbsync #Required per page 202 section 5.4.3.2 of the Broadway Manual
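
If it helps to see why the loop above steps by 0x1000, here is a small Python sketch (illustrative only, helper names made up) that lists the addresses handed to tlbie and recovers the index from bits 14 thru 19 of each one.

```python
def tlbie_addresses(entries=64):
    """EAs whose bits 14-19 (PowerPC bit numbering) walk through every TLB index."""
    return [index << 12 for index in range(entries)]


def tlb_index(ea):
    """Recover the TLB index from EA bits 14-19."""
    return (ea >> 12) & 0x3F


addresses = tlbie_addresses()
print(hex(addresses[0]), hex(addresses[1]), hex(addresses[-1]))
```

Bit 19 is the 0x1000 place, so adding 0x1000 per iteration bumps the index by one, and 64 iterations cover index 0 thru 63.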



Chapter 4: Segment Registers Configuration

After you invalidate the TLBs, you can set up the Segment Registers. The first 4 bits of an Effective Address choose which Segment Register will be used. Therefore, by design, the following occurs..

Effective Address --> Segment Register Chosen
  • 0x0XXXXXXX --> sr0
  • 0x1XXXXXXX --> sr1
  • 0x2XXXXXXX --> sr2
  • .. ..
  • 0xEXXXXXXX --> sr14
  • 0xFXXXXXXX --> sr15

Normally, a coder/dev may write a series of cmpwi/branch instructions to take an input Effective Address and know which SR to configure. There's no need for that. Broadway comes with the mtsrin instruction

Move to Segment Register Indirect~
mtsrin rS, rB
Upper 4 bits of rB selects the SR
rS is copied into the SR

Here's an example code that sets up every SR with all protection bits low (no restrictions on supervisor, user, or execute). It includes a lookup table where all 16 randomly generated VSIDs reside.

Code:
#Use a VSID lookup table
bl vsid_table
#Example VSID's listed in table. Ofc use your own, that are randomly generated.
.long 0x000
.long 0x111
.long 0x222
.long 0x333
.long 0x444
.long 0x555
.long 0x666
.long 0x777
.long 0x888
.long 0x999
.long 0xaaa
.long 0xbbb
.long 0xccc
.long 0xddd
.long 0xeee
.long 0xfff
vsid_table:
mflr r3

#Set loop count. 16 for 16 SR's
li r0, 16
mtctr r0

#Pre decrement for loop
addi r3, r3, -4

#Set r4 to 0x00000000, increment by 0x10000000 to select next SR for the mtsrin instruction
li r4, 0

#Loop. Write SR using mtsrin
write_sr_loop:
lwzu r0, 0x4 (r3) #Load VSID
mtsrin r0, r4 #Based on r4's upper 4 bits, select the SR and write to it with currently loaded VSID
addis r4, r4, 0x1000 #Increment r4 to use next SR for mtsrin instruction
bdnz+ write_sr_loop



Chapter 5: Clearing the Page Table

Before any page table is to be used, it should always be entirely zero'd. Here's a snippet of code that does that..


Code:
#r3 = *Physical* Start Address of the Page Table
#r4 = Size of entire Page Table in bytes
#Divide size by 4
srwi r4, r4, 2

#Pre Decrement Start Address for Loop
subi r3, r3, 4

#Set r0, to 0
li r0, 0

#Set Loop Amount
mtctr r4

#Loop
zero_table:
stwu r0, 0x4 (r3)
bdnz+ zero_table

Above code is for real mode use. Assumes r3 is physical. Please NOTE that you could use a double-float store mechanism or the dcbz instruction to clear the page table. Most general PPC applications use regular integer stores because page tables are set up very early in the program, and floats+cache may not have been enabled/configured yet.



Chapter 6: Algorithm, How PTEGs are Generated

In order for any Page Table to be constructed for use, it needs to be filled with PTEs at the correct spots within the Table. This is determined by an algorithm. This algorithm requires 2 inputs: the EA and what's in SDR1.

First, here is a very broad overview of how an Effective Address is translated to its Physical Equivalent

[Image: pagetable02.png]

As you can see portions of the EA are broken up. Then the selected SR is utilized with the EA portions to make a temporary 52-bit Address. The VPN portion (upper 40 bits) of this 52-bit Address then goes through a series of operations and hashing. Here's a diagram to display that...

[Image: pagetable03.png]

The above chart can be broken down into the following steps~

  1. Lower 12-bits of the 52-bit address is placed aside. VPN is broken up into the VSID and Page-Index. These two items are used as the inputs for the Hash function
  2. The output of this Hash function is 19 bits in size. (upper 13 bits always result in null)
  3. HTABMASK of SDR1 is logically AND'd with upper 9 bits of Hash1
  4. Result from step 3 is logically OR'd with HTABORG's Maskable Bits
  5. Upper 7 bits of PTEG is the upper 7 bits of HTABORG
  6. Next 9 bits of PTEG is the result of Step 4
  7. Next 10 bits of PTEG is the lower 10 bits of Hash1 (step 2 result)
  8. Lower 6 bits of PTEG is always set low (64-byte aligned)

The following chart demonstrates how you can hand-generate a PTEG using just the EA and SDR1. The chart uses the following inputs...

SDR1 = 0x0F980007
Virtual Addr/EA = 0x00FFA01B

[Image: pagetable04.png]

As you can see the PTEG result is 0x0F9FF980. It's important to understand that the amount of "1" bits in HTABMASK in SDR1 determines how many bits of Hash Value 1 are placed into the PTEG, starting at bit 15 and going leftward. The chart indicates this via the bracketed bit contents of the upper 9 bits of Hash Value 1 which in turn points to the bracketed bits in the PTEG.

The above chart showed how a Primary PTEG is generated. Sometimes (due to the result of the Hash Value 1) a PTEG can be generated which matches a previous PTEG from a different EA. If that PTEG has no free PTE left (all 8 are already valid), Hash Value 1 must be logically NOT'd (bitwise negated, also known as a 1's complement).

This new Hash Value is known as Hash Value 2. In the above chart, it would replace what's in Hash Value 1 (steps beforehand aren't required anymore). The following Chart shows what occurs once Hash Value 2 needs to be used...

[Image: pagetable05.png]

Therefore if Primary PTEG couldn't be used, the new (Secondary) PTEG would be 0x0F980640.
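
The two hand-generated results can be reproduced with a short Python sketch. One assumption to flag: the VSID used by the charts is not written out in the text, so the sketch borrows 0x00CA701C (the VSID from Chapter 8's Gecko Code), which does produce both PTEG addresses quoted above. Function names are made up for this example.

```python
def hash_value_1(vsid, ea):
    """19-bit hash: lower 19 bits of the VSID XOR'd with the 16-bit page index."""
    page_index = (ea >> 12) & 0xFFFF          # EA bits 4-19
    return ((vsid & 0x7FFFF) ^ page_index) & 0x7FFFF


def hash_value_2(hash1):
    """1's complement of Hash Value 1, kept to 19 bits."""
    return ~hash1 & 0x7FFFF


def pteg_address(sdr1, hash_value):
    """Steps 3 thru 8 from the list in this chapter."""
    htabmask = sdr1 & 0x1FF                   # SDR1 bits 23-31
    maskable = (sdr1 >> 16) & 0x1FF           # HTABORG "Maskable Bits" (SDR1 bits 7-15)
    upper7 = sdr1 & 0xFE000000                # SDR1 bits 0-6
    middle9 = ((hash_value >> 10) & htabmask) | maskable
    lower10 = hash_value & 0x3FF
    return upper7 | (middle9 << 16) | (lower10 << 6)   # low 6 bits stay 0


SDR1 = 0x0F980007
EA = 0x00FFA01B
VSID = 0x00CA701C  # assumed; taken from the Chapter 8 example

h1 = hash_value_1(VSID, EA)
print(hex(pteg_address(SDR1, h1)))                # primary PTEG
print(hex(pteg_address(SDR1, hash_value_2(h1))))  # secondary PTEG
```

This prints 0xf9ff980 for the Primary PTEG and 0xf980640 for the Secondary PTEG.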



Chapter 7: Constructing the Page Table

In order to construct a Page Table, you must write all PTEs for all possible PTEGs for your range of Covered Memory.

Summary of constructing part (or all) of the Page Table based on a single EA. Assumes you also have the SDR1 and the PA that you want to use for the EA translation.

1. Using EA, figure out which SR would be used
2. Grab SR data
3. Form Upper 32-bits of PTE by...
    a. Extract VSID from SR & API from EA
    b. Form temp upper PTE by inserting both VSID and API
    c. Finalize it by flipping bit 0 high (V/Valid bit)
4. Form Lower 32-bits of PTE by...
    a. Supply the PA (alternatively, you can supply one for identical translation by extracting the RPN bits from the EA)
    b. Insert desired WIMG bits
    c. Insert a high R bit, a high C bit, and desired PP bits
5. Generate PPC-Special Hash Value aka Hash Value 1
6. Generate PTEG Address by....
    a. Create a temp hash called tmp1 using Hash Value 1 & SDR1
    b. Create another temp hash called tmp2 using tmp1 and SDR1
    c. Create temp blank PTEG
    d. Insert blank PTEG, tmp2, & Hash Value 1
7. Using PTEG Address from Step 6, make sure there is an empty (invalid) PTE (out of 8)
8. If empty, write new PTE (will set it valid) that was formed from steps 3 and 4
9. If all 8 PTEs of PTEG are already valid, run a secondary special hash (Hash Value 2) to generate a different PTEG
10. Check 8 PTEs in Second PTEG, if none of those can be used, then halt
11. If one of the PTEs in the 2nd PTEG can be used, write new PTE but with H bit high to indicate Hash Value 2 was required
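
Before looking at the assembly, the flow of steps 5 thru 11 can be modeled in Python with a bytearray standing in for the zeroed page table. This is only a sketch under stated assumptions: insert_pte, its arguments, and the RuntimeError (in place of "halt") are all made up for illustration.

```python
import struct


def pteg_address(sdr1, hash_value):
    """PTEG address from SDR1 and a 19-bit hash (see Chapter 6)."""
    middle9 = ((hash_value >> 10) & (sdr1 & 0x1FF)) | ((sdr1 >> 16) & 0x1FF)
    return (sdr1 & 0xFE000000) | (middle9 << 16) | ((hash_value & 0x3FF) << 6)


def insert_pte(table, table_base, sdr1, vsid, ea, pa, wimg=0, pp=0b10):
    """Write one PTE into the first free slot; returns (PTE address, H bit)."""
    hash1 = ((vsid & 0x7FFFF) ^ ((ea >> 12) & 0xFFFF)) & 0x7FFFF
    # Lower word: RPN from the PA, R and C high (0x180), WIMG, PP
    lower = (pa & 0xFFFFF000) | 0x180 | (wimg << 3) | pp
    for h_bit in (0, 1):
        hash_value = hash1 if h_bit == 0 else ~hash1 & 0x7FFFF
        pteg_offset = pteg_address(sdr1, hash_value) - table_base
        for slot in range(8):                      # 8 PTEs per PTEG
            offset = pteg_offset + slot * 8
            (existing_upper,) = struct.unpack_from(">I", table, offset)
            if existing_upper & 0x80000000:        # V bit set, slot already in use
                continue
            # Upper word: V, VSID, H, API (EA bits 4-9)
            upper = 0x80000000 | (vsid << 7) | (h_bit << 6) | ((ea >> 22) & 0x3F)
            struct.pack_into(">II", table, offset, upper, lower)
            return table_base + offset, h_bit
    raise RuntimeError("both PTEGs are full")      # step 10: halt


table = bytearray(512 * 1024)                      # zeroed 512KB table
where, h = insert_pte(table, 0x0F980000, 0x0F980007, 0x00CA701C, 0x00FFA000, 0x00FFA000)
print(hex(where), h)
```

The example reuses the Chapter 6 inputs, so the first PTE lands in slot 0 of the Primary PTEG. Only once all 8 slots there are valid does the sketch fall back to the Secondary PTEG and set the H bit.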

The above must be done for every 4KB aligned virtual address that you plan to use. So for example, let's say you want to setup the following translation scheme...

Effective/Virtual Address Range | Physical Address Range
0xA0000000 thru 0xA07FFFFF | 0x00000000 thru 0x007FFFFF

The above would be for 8MB of covered memory. To construct all the PTEs, you would first need to construct the PTE for virtual address 0xA0000000, then 0xA0001000, then 0xA0002000, etc etc until the last address of 0xA07FF000. When constructing the PTEs be sure the correct physical address is used for each new 4KB aligned virtual address you are utilizing.
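
To double check the page count for a range like this, a quick Python sketch (helper name made up) can list every virtual/physical pair that needs a PTE.

```python
PAGE = 0x1000  # 4KB


def page_pairs(ea_start, pa_start, length):
    """Yield (virtual, physical) for every 4KB page in the range."""
    assert ea_start % PAGE == 0 and pa_start % PAGE == 0 and length % PAGE == 0
    for offset in range(0, length, PAGE):
        yield ea_start + offset, pa_start + offset


pairs = list(page_pairs(0xA0000000, 0x00000000, 0x00800000))
print(hex(len(pairs)))                       # number of PTEs needed
print([hex(x) for x in pairs[0]], [hex(x) for x in pairs[-1]])
```

For the 8MB range above this gives 0x800 pairs, starting at 0xA0000000 -> 0x00000000 and ending at 0xA07FF000 -> 0x007FF000.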

Example snippet of code for a single PTE construction~
Assumes all SR's are configured, TLB's invalidated, SDR1 configured, and you are in real mode with IR+DR low.

Code:
#r3 = Virtual/Effective EA (assumed to be 4KB aligned)
#r4 = SDR1
#r5 = Physical EA equivalent desired (assumed to be 4KB aligned)

#Figure out which SR (r6) to use
#We could use a list of cmpwi/beq's by grabbing the data from every individual SR, but......
#...Broadway has the mfsrin instruction HOORAY!
#mfsrin rD, rB
#The upper 4 bits in rB selects the SR to use in the instruction
#SR is then copied into rD
mfsrin r6, r3

#SR found. Create Upper PTE except H bit (r0)
#Use VSID from SR, API from EA. Flip V high.
rlwinm r0, r6, 7, 1, 24 #Extract VSID from Segment Register
rlwimi r0, r3, 10, 26, 31 #Extract API from Virtual EA and insert VSID+API
oris r0, r0, 0x8000 #Set valid (V) bit

#Create Lower PTE (r7)
#Use RPN bits from desired Physical Address
#Use desired WIMG and PP bits
#Set R and C high
#!!NOTE!! Example here uses WIMG of 0000 and PP of 10
clrrwi r7, r5, 12 #Extract RPN bits from desired PA
li r8, 0 #Set WIMG
rlwimi r7, r8, 3, 25, 28 #Insert WIMG
ori r7, r7, 0x0182 #R high, C high, PP set to 0b10; change according to your needs

#Generate 19-bit Hash Value 1 (r8)
rlwinm r8, r3, 20, 16, 31 #Extract EA bits 4 thru 19, right justify it
clrlwi r9, r6, 13  #Extract the lower 19 bits (bits 13 thru 31 of the SR) of the VSID
xor r8, r8, r9

#Preset r12 to hold absent H bit
li r12, 0

#Calculate PTEG Address (r11)
#tmp1 = Hash Value[13-21] & HTABMASK
#tmp2 = tmp1 | HTABORG-Maskable
#PTEG = SDR1[0-6], tmp2 [rol'd 16], Hash Value[22-31 rol'd 6]
calc_pteg:
rlwinm r9, r8, 22, 23, 31 #Extract bits 13 thru 21 of the Hash Value & right justify it
and r10, r9, r4 #Create tmp1 Hash; Logically AND r9 with HTABMASK (note there's no need to extract HTABMASK out of SDR1 beforehand because the ANDing will never be affected by the HTABORG bits)
rlwinm r11, r4, 16, 23, 31 #Extract "Maskable" bits of HTABORG of SDR1 & right justify it
or r10, r10, r11 #Create tmp2 Hash; tmp1 | HTABORG-Maskable
li r11, 0 #Set r11 to 0 for a fresh PTEG Address
rlwimi r11, r4, 0, 0, 6 #Insert SDR1 bits 0 thru 6 (HTABORG non-maskable) into  upper 7 bits of PTEG
rlwimi r11, r10, 16, 7, 15 #Insert tmp2 into PTEG bits 7 thru 15
rlwimi r11, r8, 6, 16, 25 #Insert Hash Value bits 22 thru 31 into PTEG bits 16 thru 25

#Set H bit in upper PTE
or r0, r0, r12 #Logically OR in possible H-bit into upper PTE

#Check if at least 1 out of 8 PTEs are not already in use. First non-valid one will be constructed
li r10, 8 #r10 safe to use now
subi r11, r11, 8 #Pre decrement for loop
mtctr r10
pte_valid_check:
lwzu r10, 0x8 (r11)
andis. r9, r10, 0x8000 #r9 safe to use now
beq- construct_pte
bdnz+ pte_valid_check

#Check if we are on Hash Value 1 or 2
cmpwi r12, 0x0040
beq- ERROR #If equal we already used both hashes!

#Hash Value 2 not used yet. Hash Value 2 is simply a 1's complement of Hash Value 1. All we need is a bitwise-negate, better known as a Logical-NOT
not r8, r8
li r12, 0x0040 #Set r12 to have H bit high next time we run calc_pteg
b calc_pteg #Re-do PTEG address calculation

#ERROR; currently configured to do a basic infinite loop, adjust this to your needs
ERROR:
nop
b ERROR

#We can construct the PTE. Do it!
construct_pte:
stw r0, 0 (r11)
stw r7, 0x4 (r11)

Alright, well that was a doozy. As you can see in the above source code, the mfsrin instruction was used to know which SR data to grab based on the EA. This is much more efficient than using a list of compare+branch instructions.

Move from Segment Register Indirect~
mfsrin rD, rB
Upper 4 bits of rB selects the SR 
SR is copied into rD



Chapter 8: Wrapping Things Up; Example Gecko Code

When exiting real mode, be sure that IR and DR will be set high in the MSR after the rfi instruction has been executed. Also make sure EE is back to its original state.

Here is a Gecko Code that uses a Page Table for 0xA0000000 thru 0xA07XXXXX (physical 0x00000000 thru 0x007XXXXX) translation. Once the Page Table has been fully constructed, a simple store instruction using the address of 0xA0001500 is completed. Obviously this works or else an exception (page fault) would occur.

-----

0xA0000000+ 8MB Page Table Example [Vega]

PAL
C200A42C 00000032
9421FFE0 BF810008
3D808022 618C9814
7D8803A6 3C600001
3C800001 80ADA360
80A50024 4E800021
38804000 38A3FFFC
38000000 7C8903A6
94050004 4200FFFC
48000005 7FE802A6
3BDF0024 57DE007E
7FDA03A6 7FC000A6
57C0045E 54000732
7C1B03A6 4C000064
6C638000 5464843E
38A4FFFF 7CA52078
7CA50034 20A50020
7C642B78 7C0004AC
7C9903A6 4C00012C
38000040 38600000
7C0903A6 7C001A64
38631000 4200FFF8
7C00046C 3CC000CA
60C6701C 7CCA01A4
3FA0A000 3B800800
7FA3EB78 546500FE
54C03870 506056BE
64008000 54A70026
39000000 51071E78
60E70182 5468A43E
54C9037E 7D084A78
39800000 5509B5FE
7D2A2038 548B85FE
7D4A5B78 39600000
508B000C 514B81DE
510B3432 7C006378
39400008 396BFFF8
7D4903A6 854B0008
75498000 41820024
4200FFF4 2C0C0040
41820010 7D0840F8
39800040 4BFFFFB0
60000000 4BFFFFFC
900B0000 90EB0004
379CFFFF 3BBD1000
4082FF60 7FDB03A6
3BFF0130 7FFA03A6
4C000064 38000007
3C60A000 90031500
BB810008 38210020
38600000 00000000

Code:
#Address Ports
#PAL = 8000A42C

#Assembler Directives
.set egg_alloc, 0x80229814
.set page_table_size_bytes, 0x00010000
.set page_table_size_words, 0x4000
.set VSID, 0x00CA701C

#Inline Style Stack Frame to backup 4 Registers
#No need to save LR or r0
stwu sp, -0x0020 (sp)
stmw r28, 0x8 (sp)

#Call Egg Alloc
#Using 8MB of Covered Memory. Page Table will be 0x00010000 bytes (64Kbytes)
lis r12, egg_alloc@h
ori r12, r12, egg_alloc@l
mtlr r12
lis r3, page_table_size_bytes@h
lis r4, 0x0001 #LOL it works
lwz r5, -0x5CA0 (r13) #PAL specific for Egg-Alloc
lwz r5, 0x0024 (r5)
blrl

#Clear Table just in case allocated memory has junk in it
#r3 = Start Address of the Page Table
#r4 = Size of entire Page Table in words
li r4, page_table_size_words

#Pre Decrement Start Address for Loop
subi r5, r3, 4

#Set r0, to 0
li r0, 0

#Set Loop Amount
mtctr r4

#Loop
zero_table:
stwu r0, 0x4 (r5)
bdnz+ zero_table

#Go into Real Mode with EE, IR, and DR set low
bl get_pc #Get Program counter
get_pc:
mflr r31 #Will need this value later to get back to Virtual Mode

margin = real_mode - get_pc

addi r30, r31, margin #This points to instruction right after rfi, keep r31 intact for later
clrlwi r30, r30, 1 #Change address to physical
mtspr srr0, r30 #Place physical address into srr0
mfmsr r30 #Get MSR, keep it in r30 for later
rlwinm r0, r30, 0, 17, 15 #Flip EE low
rlwinm r0, r0, 0, 28, 25 #Flip IR, DR low
mtspr srr1, r0 #Place updated MSR into srr1
rfi #Go into real mode

#Setup SDR1; use 64KB aligned address (r3) returned from Egg_Alloc
#Address could be beyond 64KB aligned, we'll need to count trailing zeroes in the upper 16 bits to be sure
real_mode:
xoris r3, r3, 0x8000 #Make page table root address physical
srwi r4, r3, 16 #Temp shift to the right by 16 bits

#Count trailing zeros
subi r5, r4, 1
andc r5, r5, r4
cntlzw r5, r5
subfic r5, r5, 32

#Or in Physical Heap Address with Trailing Zero value
or r4, r3, r5

#Write it to SDR1
sync #Required per page 4-43 table 2-23 of the PPC PEM Book
mtspr 25, r4 #SDR1's SPR number is 25
isync #Required per page 4-43 table 2-23 of the PPC PEM Book

#Invalidate TLBs
li r0, 64
li r3, 0
mtctr r0
inval_tlb:
tlbie r3
addi r3, r3, 0x1000
bdnz+ inval_tlb
tlbsync #Required per page 202 section 5.4.3.2 of the Broadway Manual

#Setup Segment Register 10 for 0xA0000XXX Virtual Memory
#Set VSID, keep all other bits low. SR will simply be just the VSID then
lis r6, VSID@h
ori r6, r6, VSID@l #Using r6 because the upcoming code is designed to have the SR in r6
mtsr 10, r6

#Set Initial EA in r29
lis r29, 0xA000

#Set PTE Construction Mega Loop Amount
#(0xA07FF - 0xA0000) + 1 = 0x800 amount of 4KB aligned addresses for PTE construction
li r28, 0x0800

#LOOP
mega_loop:
mr r3, r29
clrlwi r5, r3, 3 #Change 0xA to 0x0

#r3 = Virtual/Effective EA (assumed to be 4KB aligned)
#r4 = SDR1
#r5 = Physical EA equivalent desired (assumed to be 4KB aligned)
#r6 = Segment Register contents (VSID, written to sr10 above)

#Use VSID from SR, API from EA. Flip V high.
rlwinm r0, r6, 7, 1, 24 #Extract VSID from Segment Register
rlwimi r0, r3, 10, 26, 31 #Extract API from Virtual EA and insert VSID+API
oris r0, r0, 0x8000 #Set valid (V) bit

#Create Lower PTE (r7)
#Use RPN bits from desired Physical Address
#Use desired WIMG and PP bits
#Set R and C high
#!!NOTE!! Example here uses WIMG of 0000 and PP of 10
clrrwi r7, r5, 12 #Extract RPN bits from desired PA
li r8, 0 #Set WIMG
rlwimi r7, r8, 3, 25, 28 #Insert WIMG
ori r7, r7, 0x0182 #R high, C high, PP set to 0b10; change according to your needs

#Generate 19-bit Hash Value 1 (r8)
rlwinm r8, r3, 20, 16, 31 #Extract EA bits 4 thru 19, right justify it
clrlwi r9, r6, 13  #Extract the lower 19 bits (bits 13 thru 31 of the SR) of the VSID
xor r8, r8, r9

#Preset r12 to hold absent H bit
li r12, 0

#Calculate PTEG Address (r11)
#tmp1 = Hash Value[13-21] & HTABMASK
#tmp2 = tmp1 | HTABORG-Maskable
#PTEG = SDR1[0-6], tmp2 [rol'd 16], Hash Value[22-31 rol'd 6]
calc_pteg:
rlwinm r9, r8, 22, 23, 31 #Extract bits 13 thru 21 of the Hash Value & right justify it
and r10, r9, r4 #Create tmp1 Hash; Logically AND r9 with HTABMASK (note there's no need to extract HTABMASK out of SDR1 beforehand because the ANDing will never be affected by the HTABORG bits)
rlwinm r11, r4, 16, 23, 31 #Extract "Maskable" bits of HTABORG of SDR1 & right justify it
or r10, r10, r11 #Create tmp2 Hash; tmp1 | HTABORG-Maskable
li r11, 0 #Set r11 to 0 for a fresh PTEG Address
rlwimi r11, r4, 0, 0, 6 #Insert SDR1 bits 0 thru 6 (HTABORG non-maskable) into  upper 7 bits of PTEG
rlwimi r11, r10, 16, 7, 15 #Insert tmp2 into PTEG bits 7 thru 15
rlwimi r11, r8, 6, 16, 25 #Insert Hash Value bits 22 thru 31 into PTEG bits 16 thru 25

#Set H bit in upper PTE
or r0, r0, r12 #Logically OR in possible H-bit into upper PTE

#Check if at least 1 out of 8 PTEs are not already in use. First non-valid one will be constructed
li r10, 8 #r10 safe to use now
subi r11, r11, 8 #Pre decrement for loop
mtctr r10
pte_valid_check:
lwzu r10, 0x8 (r11)
andis. r9, r10, 0x8000 #r9 safe to use now
beq- construct_pte
bdnz+ pte_valid_check

#Check if we are on Hash Value 1 or 2
cmpwi r12, 0x0040
beq- ERROR #If equal we already used both hashes!

#Hash Value 2 not used yet. Hash Value 2 is simply a 1's complement of Hash Value 1. All we need is a bitwise-negate, better known as a Logical-NOT
not r8, r8
li r12, 0x0040 #Set r12 to have H bit high next time we run calc_pteg
b calc_pteg #Re-do PTEG address calculation

#ERROR; currently configured to do a basic infinite loop, adjust this to your needs
ERROR:
nop
b ERROR

#We can construct the PTE. Do it!
construct_pte:
stw r0, 0 (r11)
stw r7, 0x4 (r11)

#Decrement Mega Loop, update r29
subic. r28, r28, 1
addi r29, r29, 0x1000
bne+ mega_loop

#Leave Real Mode
virt_margin = virtual_mode - get_pc

#Restore very first original MSR (r30)
mtspr srr1, r30

#Original get_pc value still in r31, simply add virt_margin to it
addi r31, r31, virt_margin
mtspr srr0, r31
rfi

#Test the page table!
virtual_mode:
li r0, 7
lis r3, 0xA000
stw r0, 0x1500 (r3)

#Pop Inline Style Frame
lmw r28, 0x8 (sp)
addi sp, sp, 0x0020

#Original Instruction
li r3, 0



Chapter 9: Credits, Resources
  • NXP AN2791 PDF (Diagrams, Chapter 7 source with slight modifications)
  • PowerPC Microprocessor Family: The Programming Environments PDF (Diagrams, SR breakdown, SDR1 real mode and syncing rules)
  • Broadway User Manual (tlb invalidation and syncing)
  • PowerPC Compiler Writer's Guide (count trailing zero's source used in Gecko Code)

  Allow invalid ghosts [jawa]
Posted by: jawa - 11-29-2022, 02:53 PM - Forum: Time Trials & Battle - No Replies

NTSC-U:
04517DAC 38600001

PAL:
0451C220 38600001

NTSC-J:
0451BBA0 38600001

NTSC-K:
0450A240 38600001

Source:
write "li r3, 1"

  Ultimate Stockpile Items [Deez Nuts, Luis, Ro]
Posted by: Deez Nutz - 11-28-2022, 02:43 PM - Forum: Incomplete & Outdated Codes - No Replies

Ultimate Stockpile Items NTSC U

2834XXXX YYYYZZZZ ? Button activator/deactivator
047964A4 48000050
04796790 60000420
CC000000 00000000
047964A4 48003101
04796790 60000020
E0000000 80008000


Ultimate Stockpile Items PAL (untested)
2834XXXX YYYYZZZZ ? Button activator/deactivator
0479F4B0 48000050
0479F79C 60000420
CC000000 00000000
0479F4B0 48003101
0479F79C 60000020
E0000000 80008000


Ultimate Stockpile Items NTSC J (untested)
2834XXXX YYYYZZZZ ? Button activator/deactivator
0479EB1C 48000050
0479EE08 60000420
CC000000 00000000
0479EB1C 48003101
0479EE08 60000020
E0000000 80008000


This is a remake of Luis's private code compiled with Ro's Miniature Items and a freeze items code by Luis himself.
This code allows you to drop frozen mini items when activated. Some items still keep their effects when hit and some don't.

  Hey all!
Posted by: vabold - 11-20-2022, 12:14 PM - Forum: Introductions - Replies (1)

I'm realizing that I never made a proper introduction thread to the site, so I figure I'll do that now.

Hi there! My name is Aiden, and I go online by vabold. I'm a college student aiming to major in CS, and I occasionally create cheat codes. My ultimate goals are to help new people get better at modifying Mario Kart Wii by providing resources and opportunities for them. I initially entered the code-making scene with no knowledge of C/C++ or PowerPC assembly, and stebler was very kind to teach me the basics of cheat code creation.

My first notable project was the now-abandoned character extension code. stebler guided me through the very basics of PowerPC and using Ghidra to understand the game's classes. The code works as intended, but with some caveats (notably, the minimap and reused character sounds). You can find a demonstration video with the last progress I made here: https://www.youtube.com/watch?v=cpIk-u_OGPE

Nowadays, I have two major focuses: helping implement client/server netcode into MKW-SP, and decompiling the physics engine for machine learning teams. You can find current progress on the former here: https://www.youtube.com/watch?v=qSby3NLISkc

Your best bet to contact me is via Discord, vabold#7613.

  Remove invisible walls [jawa]
Posted by: jawa - 11-19-2022, 07:37 PM - Forum: Visual & Sound Effects - Replies (5)

Removes invisible walls.
WARNING! Halfpipes do not work as intended.

NTSC-U
C251508C 0000001C
7D2000A6 552C045E
7D800124 38210120
9421FF60 BDC10008
7E4802A6 2C030000
418200A4 48000011
636F7572 73652E6B
636C0000 7E2802A6
7F95E378 3A600000
7E138A14 89F00000
7E13AA14 89D00000
7E8F7214 2C140000
41820018 7C0F7000
41820008 48000060
3A730001 4BFFFFD4
7C701B78 82300008
82B0000C 7E318214
7EB58214 7E348B78
7C14A800 41800008
48000034 A274000E
39E0001F 7E737838
2C13001F 41810020
2C13000D 41820008
4800000C 39E00018
91F4000E 3A940010
4BFFFFC8 7E4803A6
B9C10008 382100A0
7D8000A6 512C0420
7D800124 00000000

PAL
C2519500 0000001C
7D2000A6 552C045E
7D800124 38210120
9421FF60 BDC10008
7E4802A6 2C030000
418200A4 48000011
636F7572 73652E6B
636C0000 7E2802A6
7F95E378 3A600000
7E138A14 89F00000
7E13AA14 89D00000
7E8F7214 2C140000
41820018 7C0F7000
41820008 48000060
3A730001 4BFFFFD4
7C701B78 82300008
82B0000C 7E318214
7EB58214 7E348B78
7C14A800 41800008
48000034 A274000E
39E0001F 7E737838
2C13001F 41810020
2C13000D 41820008
4800000C 39E00018
91F4000E 3A940010
4BFFFFC8 7E4803A6
B9C10008 382100A0
7D8000A6 512C0420
7D800124 00000000

NTSC-J
C2518E80 0000001C
7D2000A6 552C045E
7D800124 38210120
9421FF60 BDC10008
7E4802A6 2C030000
418200A4 48000011
636F7572 73652E6B
636C0000 7E2802A6
7F95E378 3A600000
7E138A14 89F00000
7E13AA14 89D00000
7E8F7214 2C140000
41820018 7C0F7000
41820008 48000060
3A730001 4BFFFFD4
7C701B78 82300008
82B0000C 7E318214
7EB58214 7E348B78
7C14A800 41800008
48000034 A274000E
39E0001F 7E737838
2C13001F 41810020
2C13000D 41820008
4800000C 39E00018
91F4000E 3A940010
4BFFFFC8 7E4803A6
B9C10008 382100A0
7D8000A6 512C0420
7D800124 00000000

NTSC-K
C2507520 0000001C
7D2000A6 552C045E
7D800124 38210120
9421FF60 BDC10008
7E4802A6 2C030000
418200A4 48000011
636F7572 73652E6B
636C0000 7E2802A6
7F95E378 3A600000
7E138A14 89F00000
7E13AA14 89D00000
7E8F7214 2C140000
41820018 7C0F7000
41820008 48000060
3A730001 4BFFFFD4
7C701B78 82300008
82B0000C 7E318214
7EB58214 7E348B78
7C14A800 41800008
48000034 A274000E
39E0001F 7E737838
2C13001F 41810020
2C13000D 41820008
4800000C 39E00018
91F4000E 3A940010
4BFFFFC8 7E4803A6
B9C10008 382100A0
7D8000A6 512C0420
7D800124 00000000


Code:
# System::DVDArchive::getFile([DVDArchive* d_arc], char const*, unsigned int*)
# returns file buffer (r3 = void* outbuf)
.set INVISIBLE_WALL, 0x0D
.set SOUND_TRIGGER, 0x18
.set STACK, 0xA0

.macro disable_interrupts
    mfmsr r9
    rlwinm r12, r9, 0, 17, 15
    mtmsr r12
.endm
.macro enable_interrupts
    mfmsr r12
    rlwimi r12, r9, 0, 16, 16
    mtmsr r12
    .endm
disable_interrupts

#default
addi sp, sp, 288
# push stack
stwu sp, -STACK (sp)
stmw r14, 0x8 (sp)

mflr r18
cmpwi r3, 0
beq end

check:

    bl course_kcl_string
        .string "course.kcl"
        .align 2
    course_kcl_string:
    mflr r17

    course_kcl:
        # r17 = &"course.kcl", string 1
        mr r21, r28 # r21 = filename, string 2
        # li r20, 0 # r21 = strings_are_equal
        li r19, 0 # r19 = strcmp counter
        loop:
            add r16, r19, r17
            lbz r15, 0 (r16) # str1 + off
            add r16, r19, r21
            lbz r14, 0 (r16) # str2 + off
            add r20, r15, r14
            cmpwi r20, 0 # if chr == \0
            beq success
            cmpw r15, r14 # (str1 + off) == (str2 + off)
            beq cont_add
            b end
        cont_add:
            addi r19, r19, 1
            b loop
        success:
            # course.kcl is loaded!
            # r3 = void* buf
            mr r16, r3
            lwz r17, 0x08 (r16) # r17 = SEC3.start
            lwz r21, 0x0C (r16) # r21 = SEC4.start
            add r17, r17, r16 # SEC3* sec3 = file_start + SEC3.start
            add r21, r21, r16 # SEC4* sec4 = file_start + SEC4.start
            mr r20, r17 # r20 = i = SEC3.start
        loop2:
            cmpw r20, r21 # i, SEC4
            blt loop2_2
            b end
        loop2_2:
            lhz r19, 0xE (r20) # kcl flag
            li r15, 31 # and mask value
            and r19, r19, r15 # kcl_type = kcl_flag & mask [0x1F] (isolate 5 LSB bits)
            cmpwi r19, 0x1F
            bgt- end
            cmpwi r19, INVISIBLE_WALL
            beq invis
            b inc
        invis:
            li r15, SOUND_TRIGGER
            stw r15, 0xE (r20) # *kcl_flag_ptr = SOUND_TRIGGER
        inc:
            addi r20, r20, 0x10 # i += 0x10
            b loop2

end:
    mtlr r18
    lmw r14, 0x8 (sp)
    # pop stack
    addi sp, sp, STACK
    enable_interrupts
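
For anyone who finds C easier to read, the logic of the source above can be sketched roughly like this. It is only an illustration and not the code that gets injected: the function name patch_invisible_walls, the size parameter, and the bounds checks are assumptions added for the sketch, and the file is treated as a big-endian byte buffer so the sketch behaves the same on any host. The sketch stores the flag as a 16-bit value, following the source comment "*kcl_flag_ptr = SOUND_TRIGGER", whereas the assembly itself uses stw.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    KCL_TYPE_MASK  = 0x1F, /* low 5 bits of the flag hold the collision type */
    INVISIBLE_WALL = 0x0D,
    SOUND_TRIGGER  = 0x18,
    ENTRY_SIZE     = 0x10, /* stride used by the assembly loop */
    FLAG_OFFSET    = 0x0E  /* offset of the 16-bit flag inside an entry */
};

/* Read a big-endian 32-bit value from the file buffer. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical helper: returns how many entries were changed. */
size_t patch_invisible_walls(const char *filename, uint8_t *buf, size_t size)
{
    if (buf == NULL || filename == NULL || size < 0x10)
        return 0;
    if (strcmp(filename, "course.kcl") != 0)
        return 0; /* only course.kcl is patched */

    uint32_t start = read_be32(buf + 0x08); /* SEC3.start */
    uint32_t end   = read_be32(buf + 0x0C); /* SEC4.start */
    if (start > end || end > size)
        return 0; /* sanity check, not present in the assembly */

    size_t patched = 0;
    for (uint32_t i = start; i + ENTRY_SIZE <= end; i += ENTRY_SIZE) {
        uint8_t *flag = buf + i + FLAG_OFFSET;
        uint16_t value = (uint16_t)((flag[0] << 8) | flag[1]);
        if ((value & KCL_TYPE_MASK) == INVISIBLE_WALL) {
            flag[0] = 0x00;          /* flag = SOUND_TRIGGER, stored big-endian */
            flag[1] = SOUND_TRIGGER;
            patched++;
        }
    }
    return patched;
}
```

Calling patch_invisible_walls("course.kcl", buf, size) on a loaded file buffer returns the number of entries whose type was INVISIBLE_WALL. Any other filename leaves the buffer untouched and returns 0.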


  Fast Race Music Modifier [Zeraora]
Posted by: Zeraora - 11-16-2022, 11:24 PM - Forum: Visual & Sound Effects - No Replies

This code allows the user to change all fast race music to any respective BRSTM. 

NTSC-U:
0470A5D0 388000XX

PAL:
04712074 388000XX

NTSC-J:
047116E0 388000XX

NTSC-K:
0470041C 388000XX

XX = BRSTM Identifier

A list of the identifiers can be found on the wiiki.
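
For reference, the value 388000XX is the PowerPC instruction "li r4, XX", and the leading 04 makes the line a 32-bit write. Here is a small sketch of how a finished line could be generated. The function names li_r4_word and format_music_line are made up for this example, and the ID 0x75 mentioned below is only a placeholder, not a real BRSTM identifier.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Encode "li r4, id", i.e. addi r4, 0, id: opcode 14, rD = 4, immediate = id. */
uint32_t li_r4_word(uint8_t id)
{
    return (14u << 26) | (4u << 21) | id;
}

/* Hypothetical helper: formats one code line such as "0470A5D0 388000XX".
   Returns the number of characters written (17 on success). */
int format_music_line(uint32_t address_word, uint8_t id, char *out, size_t out_size)
{
    return snprintf(out, out_size, "%08X %08X",
                    (unsigned)address_word, (unsigned)li_r4_word(id));
}
```

With the NTSC-U address word 0x0470A5D0 and the placeholder ID 0x75, the buffer ends up holding "0470A5D0 38800075".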


  Normal Race Music Modifier [Zeraora]
Posted by: Zeraora - 11-16-2022, 11:22 PM - Forum: Visual & Sound Effects - No Replies

This code allows the user to change all normal race music to any respective BRSTM. 

NTSC-U:
0470A53C 388000XX

PAL:
04711FE0 388000XX

NTSC-J:
0471164C 388000XX

NTSC-K:
04700388 388000XX

XX = BRSTM Identifier

A list of the identifiers can be found on the wiiki.


  QEMU + GNU Debugger Basic Tutorial
Posted by: Vega - 11-14-2022, 01:24 AM - Forum: Other - No Replies

QEMU + GNU Debugger Basic Tutorial

Editor's NOTE: I'm very new to QEMU, GDB, and general assembling/linking. So if anybody has any improvements or corrections for this tutorial, please share them. Thank you.



Chapter 1: Intro

NOTE: Guide is for Linux only. Verified to work on Debian 10 & Debian 11.

Instead of using something like Dolphin to rig up an environment to test or simulate PowerPC code/instructions, you can use the QEMU emulator with the GNU Debugger.

The QEMU and GNU programs support a wide variety of architectures, so you can use those programs for the following...
  • ARM 64-bit aka AArch64 (for Nintendo Switch)
  • ARM 32-bit (for Starlet on Nintendo Wii)
  • PPC 64-bit (for Xbox360 & PS3)
  • PPC 32-bit (for Nintendo Gamecube, Broadway on Wii, & Wii U)

Keep in mind that these architectures are "generic" for the most part. The Xbox360, PS3, Wii U, Wii, and Gamecube all use unique CPUs that have special/additional instructions & registers compared to their conventional counterparts. The Switch uses a "generic" Cortex-A57 CPU which implements the ARMv8-A (8.0) architecture.

The GNU tools let you specify some CPUs. For example, you can specify the PowerPC 32-bit 750cl CPU to mimic Broadway as closely as possible. Regarding Nintendo Switch, you can specify its exact CPU (cortex-a57).

The great thing about QEMU+GDB is that you can test C code, not just basic Assembly files. The following guide will cover debugging a basic Hello World source written in C. Later on, a quick overview of debugging bare-bones Assembly files will also be covered.



Chapter 2: Software Installation

Update & Upgrade your System then Reboot

Code:
sudo apt-get update
sudo apt-get upgrade
sudo reboot


Install the GNU Compiler Software for the desired architecture

ARM 64 bit...
Code:
sudo apt-get install gcc-aarch64-linux-gnu binutils-aarch64-linux-gnu binutils-aarch64-linux-gnu-dbg 


ARM 32 bit...
Code:
sudo apt-get install gcc-arm-linux-gnueabihf binutils-arm-linux-gnueabihf binutils-arm-linux-gnueabihf-dbg


PPC 64 bit...
Code:
sudo apt-get install gcc-powerpc64-linux-gnu binutils-powerpc64-linux-gnu binutils-powerpc64-linux-gnu-dbg


PPC 32 bit...
Code:
sudo apt-get install gcc-powerpc-linux-gnu binutils-powerpc-linux-gnu binutils-powerpc-linux-gnu-dbg


Install QEMU Emulator & GNU Debugger
Code:
sudo apt-get install qemu-user qemu-user-static gdb-multiarch build-essential


NOTE: There are other qemu packages such as qemu and qemu-system, we only need user and user-static.



Chapter 3: C file creation and compilation

Create the following Hello World C file. Save it as hello_world.c

Code:
#include <stdio.h>
#include <stdlib.h>

int main() {
    puts("hello world");
    return EXIT_SUCCESS;
}

Compile an executable file from your C source

ARM64:
Code:
aarch64-linux-gnu-gcc -ggdb3 -o hello_world hello_world.c -static -mcpu=cortex-a57


ARM32:
Code:
arm-linux-gnueabihf-gcc -ggdb3 -o hello_world hello_world.c -static


NOTE: The tags "-mbig-endian -march=armv5te -mcpu=arm926ej-s" should be included, but I can't get the file to compile when these tags are applied. So if anyone is very familiar with GCC, please let me know how to remedy this.

PPC64:
Code:
powerpc64-linux-gnu-gcc -ggdb3 -o hello_world hello_world.c -static


PPC32:
Code:
powerpc-linux-gnu-gcc -ggdb3 -o hello_world hello_world.c -static -mcpu=750


NOTE: PPC64 and PPC32 default to big endian. Extra command tags for endianness are not required.
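
If you want to confirm the byte order of a build yourself, a small runtime check like the one below can be added to hello_world.c. This is just a sketch; the function name is_big_endian is made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 when the CPU running the program stores the most
   significant byte at the lowest address, otherwise 0. */
int is_big_endian(void)
{
    const uint32_t probe = 0x01020304u;
    uint8_t first;
    memcpy(&first, &probe, 1); /* copy the byte at the lowest address */
    return first == 0x01;
}
```

Call it from main and print the result. A big-endian PPC build should report 1, and a little-endian ARM build should report 0.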

About command tags:
-o = Name of the output file (here, the executable)
-ggdb3 = Generate debugging info in GDB's format at the highest level (3)
-static = Link statically (no shared libraries needed at run time)



Chapter 4: Run C file

Launch the file on QEMU

ARM64:
Code:
qemu-aarch64 -L /usr/aarch64-linux-gnu -g 1234 ./hello_world


ARM32:
Code:
qemu-arm -L /usr/arm-linux-gnueabihf -g 1234 ./hello_world


PPC64:
Code:
qemu-ppc64 -L /usr/powerpc64-linux-gnu -g 1234 ./hello_world


PPC32:
Code:
qemu-ppc -L /usr/powerpc-linux-gnu -g 1234 ./hello_world


About command tags:
-L /usr/xxxxx = Set the path prefix where QEMU looks for the elf interpreter and target libraries
-g xxxxx = Set port number for GDB connection

QEMU & GDB need to run on a port. You can have multiple instances of QEMU+GDB programs running, but they cannot all use the same port.

At this moment you will notice that the terminal command to run QEMU is not doing anything...

[Image: gdb00.png]

This is exactly what you want to see. QEMU is waiting on the GNU Debugger to be launched. On a second terminal, launch the Debugger using the following terminal command. Do NOT close/exit the first terminal!

ARM64:

Code:
gdb-multiarch -q --nh \
  -ex 'set architecture aarch64' \
  -ex 'set sysroot /usr/aarch64-linux-gnu' \
  -ex 'file hello_world' \
  -ex 'target remote localhost:1234' \
  -ex 'break main' \
  -ex continue \
  -ex 'layout split' \
  -ex 'layout next' \
  -ex 'layout regs'

 
ARM32:

Code:
gdb-multiarch -q --nh \
  -ex 'set architecture arm' \
  -ex 'set sysroot /usr/arm-linux-gnueabihf' \
  -ex 'file hello_world' \
  -ex 'target remote localhost:1234' \
  -ex 'break main' \
  -ex continue \
  -ex 'layout split' \
  -ex 'layout next' \
  -ex 'layout regs'

 
PPC64:

Code:
gdb-multiarch -q --nh \
  -ex 'set architecture ppc64' \
  -ex 'set sysroot /usr/powerpc64-linux-gnu' \
  -ex 'file hello_world' \
  -ex 'target remote localhost:1234' \
  -ex 'break main' \
  -ex continue \
  -ex 'layout split' \
  -ex 'layout next' \
  -ex 'layout regs'

 
PPC32:

Code:
gdb-multiarch -q --nh \
  -ex 'set architecture ppc' \
  -ex 'set sysroot /usr/powerpc-linux-gnu' \
  -ex 'file hello_world' \
  -ex 'target remote localhost:1234' \
  -ex 'break main' \
  -ex continue \
  -ex 'layout split' \
  -ex 'layout next' \
  -ex 'layout regs'

 
About ex tags:
  • set architecture tells gdb which CPU architecture the program was built for
  • set sysroot sets the directory that contains the targeted libraries. This must match the -L path you used in the qemu command
  • file tells gdb which executable to load
  • target remote machine:portnumber tells gdb what machine and port QEMU is running on. Of course, the port number used here must match what was used in the qemu terminal command
  • break main sets a breakpoint on the main function
  • continue tells gdb to go ahead and run the program instead of staying stopped at the very first assembly instruction
  • layout split splits the terminal into two halves so you can see more information simultaneously
  • layout next swaps to the next view
  • layout regs tells gdb to place GPRs + some SPRs into the upper half of the split layout

Since break main and continue are both applied, GDB runs the program and stops at the first instruction of the main function.
 
Notice how the port number in the GDB terminal command matches what was used in the QEMU terminal command.



Chapter 5: Basic GNU Debugging Commands, Stepping Thru the C File

At this point your C program is paused at 'main' waiting for further actions. Your GDB should look like this. Registers may or may not be available when you first boot GDB (we will address this shortly). For the picture below in my example, you will see our registers aren't available yet. You will also see we are at the start of our C program.

[Image: gdb01.png]

GDB comes with a large set of Debugging Commands, here's a quick list of useful ones~
 
GNU Debugging Commands:
  • step = step C program by 1 line, entering any function that gets called (for assembly files, this will do the same as stepi)
  • stepi = step execution by 1 instruction (for Assembly view only)
  • next = step C program by 1 line, but run through function calls instead of entering them
  • nexti = step execution by 1 instruction, but run through function calls instead of entering them
  • continue = allow program to run til next instruction breakpoint
  • break [function name] = set an instruction breakpoint at a function
  • delete = delete all instruction breakpoints
  • info vector = list FPRs as Vector data first, then as Float data
  • layout next = swap to different view (C vs Assembly)
  • layout prev = swap to your previous view
  • quit = quit
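
The Hello World source only has one call inside main, so there is not much to try break [function name] or step on. If you want something extra to practice with, a small function like the one below could be added to hello_world.c and called from main. The name sum_to is made up for this example and is not part of the original source.

```c
/* Hypothetical practice function: adds the integers 1..n in a loop,
   giving the debugger several lines to step through. */
int sum_to(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}
```

After recompiling, "break sum_to" sets a breakpoint on it, and using step on the line in main that calls it takes you inside the function.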


Okay so we have GDB running, let's practice some commands. We can switch to Assembly view of our C file by using this command..

Code:
layout next


[Image: gdb02.png]

Great. Let's swap back to C view using this command...

Code:
layout prev


[Image: gdb03.png]

If registers aren't available upon boot, this can always be remedied with just 1 step (or 1 stepi when in Assembly view). After that, the registers can be seen. We can use the step command to step one line of C code, like this...

[Image: gdb04.png]

We can keep using the step command until the C source becomes unavailable. This is because our program has completed all execution. We can now use quit to exit GDB. Press Y when prompted.

[Image: gdb05.png]

When you have finally exited GDB, take a look at your terminal that was running QEMU. You will see that the QEMU process has been terminated.

[Image: gdb06.png]

NOTE: Be sure to properly exit GDB (via the quit command) or else you will need to use a different port next time you run QEMU.



Chapter 6: GDB Memory Commands

Before we dive into debugging an Assembly file, let's go over how to view memory on the GNU Debugger. Viewing memory is a bit complicated. You cannot (afaik) view memory live on a separate layout or terminal.

Memory command template:
x/nfu addr

x = all gdb memory commands must start with a lower case x.

n = The count of how many units to display. Default value is 1.

f = Display format. Default is x (for hex). d is for signed decimal. u is for unsigned decimal. o is for octal. f is for float.

u = unit type. b for bytes. h for halfwords. w for words. g for doublewords. Default value is w (words).

addr = memory address

If n, f, and u are all left at their default values (which is also the case if they are all omitted), then the slash (/) is NOT needed in the memory command.

Example memory command showing 4 hex words at address 0x1234C:
x/4xw 0x1234C

Instead of having to fill in an address, you can instead reference a register using the "$" symbol.

Example memory command showing 2 hex words located at the Stack Pointer:
x/2xw $sp
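
To get a feel for how much memory a command covers, multiply the count (n) by the size of the unit (u). A byte is 1 byte, a halfword is 2, a word is 4, and a doubleword is 8. Here is that arithmetic as a small sketch. The function names x_unit_size and x_span_bytes are made up for this example and are not part of GDB.

```c
#include <stddef.h>

/* Size in bytes of one unit letter (b, h, w, g). Returns 0 for anything else. */
size_t x_unit_size(char unit)
{
    switch (unit) {
    case 'b': return 1;
    case 'h': return 2;
    case 'w': return 4;
    case 'g': return 8;
    default:  return 0;
    }
}

/* Number of bytes covered by a command with the given count and unit. */
size_t x_span_bytes(size_t count, char unit)
{
    return count * x_unit_size(unit);
}
```

So x/4xw 0x1234C covers 4 * 4 = 16 bytes (0x1234C thru 0x1235B), and x/2xw $sp covers the first 8 bytes at the Stack Pointer.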



Chapter 7: Create Executable file from Bare-bones Assembly

Instead of writing the file in C, we can use an Assembly file without any standard libraries. Delete your original hello_world executable, so we can make a new one via Assembly.

Choose the assembly source below that you want to use and save it as hello_world.s. Be sure the "s" is lowercase. The source will use the emulated computer's system calls (via QEMU) to print a message.

ARM 64bit:
Code:
.section .text
    .global _start

_start:
/* syscall write(int fd, const void *buf, size_t count) */
    mov x0, #1    
    ldr x1, =msg
    ldr x2, =len
    mov w8, #64 /*Syscall number for write for ARM64*/
    svc #0

/* syscall exit(int status) */
    mov x0, #0
    mov w8, #93 /*Syscall number for exit for ARM64*/
    svc #0

msg:
    .ascii "Hello, ARM64!\n"
    len = . - msg

ARM 32bit:
Code:
.section .text
    .global _start

_start:
/* syscall write(int fd, const void *buf, size_t count) */
    mov r0, #1
    ldr r1, =msg
    ldr r2, =len
    mov r7, #4 /*Syscall number for write for ARM32*/
    svc #0

/* syscall exit(int status) */
    mov r0, #0
    mov r7, #1 /*Syscall number for exit for ARM32*/
    svc #0

msg:
    .ascii "Hello, ARM32!\n"
    len = . - msg

PPC 64bit:
Code:
.section .text
    .global _start
.section ".opd","aw"
    .align 3
_start:
    .quad   ._start,.TOC.@tocbase,0
    .previous
    .global  ._start
._start:
/* syscall write(int fd, const void *buf, size_t count) */
    li 3, 1      
    lis 4, msg@highest
    ori 4,4, msg@higher
    rldicr 4, 4, 32, 31
    oris 4, 4, msg@h
    ori 4, 4, msg@l
    li 5, len
    li 0, 4 /*Syscall number for write for PPC64*/
    sc
/* syscall exit(int status) */
    li 3, 0
    li 0, 1 /*Syscall number for exit for PPC64*/
    sc
   
msg:
    .ascii "Hello, PPC64!\n"
    len = . - msg

PPC 32bit:
Code:
.section .text
    .global _start

_start:
/* syscall write(int fd, const void *buf, size_t count) */
    li 3, 1
    lis 4, msg@ha
    addi 4, 4, msg@l
    li 5, len
    li 0, 4 /*Syscall number for write for PPC32*/
    sc

/* syscall exit(int status) */
    li 3, 0
    li 0, 1 /*Syscall number for exit for PPC32*/
    sc

msg:
    .ascii "Hello, PPC32!\n"
    len = . - msg

--

Side note: View Chapter 9 for more info about syscalls
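
For comparison, here is roughly what the write half of those sources looks like from C. The C library's write function sets up the same three arguments (fd, buf, count) and issues the system call for you. This is only a sketch; the function name hello_write and the message text are made up for this example.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical message, mirroring the assembly examples. */
static const char HELLO_MSG[] = "Hello from C!\n";

/* Same call the assembly makes by hand: write(fd, buf, count).
   Returns the number of bytes written, or -1 on error. */
ssize_t hello_write(int fd)
{
    return write(fd, HELLO_MSG, strlen(HELLO_MSG));
}
```

Passing 1 as the fd writes to standard output, just like the value 1 placed in the first argument register in the assembly sources.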

To assemble the source into an executable, the 2 following terminal commands are required...

ARM64:

Code:
aarch64-linux-gnu-as -mcpu=cortex-a57 hello_world.s -o hello_world.o
aarch64-linux-gnu-ld hello_world.o -o hello_world


ARM32:

Code:
arm-linux-gnueabihf-as -march=armv5te -mcpu=arm926ej-s -mbig-endian hello_world.s -o hello_world.o
arm-linux-gnueabihf-ld -EB hello_world.o -o hello_world


PPC64:

Code:
powerpc64-linux-gnu-as -mregnames hello_world.s -o hello_world.o
powerpc64-linux-gnu-ld hello_world.o -o hello_world


PPC32:

Code:
powerpc-linux-gnu-as -mregnames -m750cl hello_world.s -o hello_world.o
powerpc-linux-gnu-ld hello_world.o -o hello_world


NOTE: PPC64 and PPC32 default to big endian. Extra tags for endianness are not required.



Chapter 7 (continued): Launch file, Stepping Instructions

You will notice that the QEMU and GDB terminal commands have been tweaked since we are now using an Assembly file.

Launch QEMU~

ARM 64-bit:
Code:
qemu-aarch64 -g 1234 ./hello_world


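ARM 32-bit:
Code:
qemu-arm -g 1234 ./hello_world

NOTE: qemu-arm only runs little-endian executables. If you assembled with -mbig-endian and linked with -EB (as in the commands above), use qemu-armeb instead.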


PPC 64-bit:
Code:
qemu-ppc64 -g 1234 ./hello_world


PPC 32-bit:
Code:
qemu-ppc -g 1234 ./hello_world


Launch GNU Debugger in a second terminal~

ARM64:

Code:
gdb-multiarch -q --nh \
  -ex 'set architecture aarch64' \
  -ex 'file hello_world' \
  -ex 'target remote localhost:1234' \
  -ex 'layout split' \
  -ex 'layout regs'


ARM32:

Code:
gdb-multiarch -q --nh \
  -ex 'set architecture arm' \
  -ex 'file hello_world' \
  -ex 'target remote localhost:1234' \
  -ex 'layout split' \
  -ex 'layout regs'


PPC64:

Code:
gdb-multiarch -q --nh \
  -ex 'set architecture ppc64' \
  -ex 'file hello_world' \
  -ex 'target remote localhost:1234' \
  -ex 'layout split' \
  -ex 'layout regs'


PPC32:

Code:
gdb-multiarch -q --nh \
  -ex 'set architecture ppc' \
  -ex 'file hello_world' \
  -ex 'target remote localhost:1234' \
  -ex 'layout split' \
  -ex 'layout regs'

 
Registers may or may not be available at this moment.
 
At this point you can start instruction stepping (via the stepi command). Let's step just one instruction...

[Image: gdb07.png]

Fyi, when you step, any register(s) that are changed by the stepped instruction will become highlighted (except in this case of initially making the Registers available). Similar concept to how registers will change to a red font in the Dolphin Emulator.

Stepping is cool enough, but let's view some memory. Let's take a look at the first 4 word values (as hexadecimal) of the Stack. Use the following command...

Code:
x/4xw $sp


[Image: gdb08.png]

Sweet! Feel free to play around with this to get a better feel.

Please NOTE that GDB will not allow you to step thru the actual system calls themselves. When you type stepi on sc/svc you will be navigated to the two instructions ahead of said sc/svc. System calls are emulated and cannot be customized. More on syscalls in Chapter 9.



Chapter 8: Alternative Method for Assembly Files

Alternatively, you can build Assembly files with just one terminal command. You will have to change the lowercase "s" in hello_world.s to be capitalized (hello_world.S). However, with this method you are more limited on cpu and architecture specification.

ARM 64-bit:
Code:
aarch64-linux-gnu-gcc -ggdb3 -nostdlib -o hello_world -static hello_world.S


ARM 32-bit:
Code:
arm-linux-gnueabihf-gcc -ggdb3 -nostdlib -o hello_world -static hello_world.S


PPC 64-bit:
Code:
powerpc64-linux-gnu-gcc -ggdb3 -nostdlib -o hello_world -static hello_world.S


PPC 32-bit:
Code:
powerpc-linux-gnu-gcc -ggdb3 -nostdlib -o hello_world -static hello_world.S


-nostdlib = Do not link the standard startup files or standard libraries. Only what is in your source file(s) gets linked.



Chapter 9: Syscall Tables

This chapter is present to address the use of syscalls in the barebones Assembly Source examples. Syscalls allow the user to handle tasks such as console input/output, memory allocation, and file management via bare bones assembly. QEMU cannot emulate custom syscalls. The syscalls used here are the Linux ones, and every CPU Architecture has its own syscall table (its own numbering) for them. QEMU will emulate these.

If you are familiar with tinkering with ISFS/IOS for Wii Files, then adapting to syscalls is very easy. They essentially operate the same (you use a file-open syscall to get an fd, and use the fd for future file based syscalls).

You can search around on Google and easily find the syscall table for your desired architecture.
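
As a starting point, the syscall numbers used by the Assembly sources in this tutorial can be collected into a tiny table like the one below. Only the values that appear in those sources are listed. The struct and function names are made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Values taken from the hello_world assembly sources in this tutorial. */
struct syscall_numbers {
    const char *arch;        /* label used in this tutorial */
    const char *nr_register; /* register that holds the syscall number */
    int write_nr;
    int exit_nr;
};

static const struct syscall_numbers SYSCALL_TABLE[] = {
    { "ARM64", "w8", 64, 93 },
    { "ARM32", "r7",  4,  1 },
    { "PPC64", "r0",  4,  1 },
    { "PPC32", "r0",  4,  1 },
};

/* Hypothetical lookup helper: returns NULL for an unknown label. */
const struct syscall_numbers *find_syscalls(const char *arch)
{
    size_t count = sizeof SYSCALL_TABLE / sizeof SYSCALL_TABLE[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(SYSCALL_TABLE[i].arch, arch) == 0)
            return &SYSCALL_TABLE[i];
    }
    return NULL;
}
```

For any other syscall, look up the number in the full table for your architecture and add it the same way.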

I may expand this chapter/section in the future with a quick tutorial on using these syscalls with various examples.



Chapter 10: Final Note

You may get unknown errors when quitting GDB. This will usually occur if you quit after you have called the sc/svc for the exit status when stepping. Simply press Y to quit the session and press N to deny core file creation on GDB.


  All Items Can Land V2.0 Lite [MrBean, CLF78]
Posted by: Deez Nutz - 11-13-2022, 11:41 PM - Forum: Incomplete & Outdated Codes - Replies (1)

This is a lite version of the original All Items Can Land by MrBean. Unlike the original, it will not freeze when used with future fly or stalking.
With this version, items such as megas/shocks/pow/bloopers will not work when dropped.

NTSC U
0479D6B8 60000000
0478DD24 38600000
04787EE4 39800001
04787EE8 39600001
04787EEC 39400001
04787EF0 39200001

PAL
047A66C4 60000000
0468000C 38600000
04790EF0 39800001
04790EF4 39600001
04790EF8 39400001
04790EFC 39200001


NTSC J
047A5D30 60000000
04680014 38600000
0479055C 39800001
04790560 39600001
04790564 39400001
04790568 39200001


  No Boundary Check V2.0
Posted by: Deez Nutz - 11-13-2022, 11:30 PM - Forum: Code Support / Help / Requests - Replies (3)

Lets you drive anywhere out of the map without respawning.
Unlike the longer version by Anarion, my 3 line version of this code does not cause the camera to glitch out of bounds when there are more than 2 players.

NTSC U
0056F033 00000000
C259728C 00000003
B27D0334 A01D0334

PAL
00573E83 00000000
C25A22C4 00000003
B27D0334 A01D0334

NTSC J
00573803 00000000
C25A1C44 00000003
B27D0334 A01D0334
