  All About BATs
Posted by: Vega - 05-23-2022, 02:11 PM - Forum: PowerPC Assembly - No Replies

For advanced ASM coders/devs.

This won't be useful for Wii Gecko Codes, but I've made this thread due to the Broadway Manual lacking detail on the BAT Registers and how to configure them. There are other PPC-based manuals that explain it, but said explanations are not all that "user-friendly".



Chapter 1: Intro

The BAT Registers are responsible for taking a Virtual Address and translating (converting) it to its Physical Address. As an Advanced ASM Coder/Dev, you should already know that mem1 and mem2 operate as Virtual Memory. Broadway needs to be 'told' how to convert any Virtual Memory Address to its Physical Address equivalent.

BAT stands for Block Address Translation. There are 16 total BAT registers. There are 8 BATs available for areas of memory containing instructions (known as IBATs), and 8 BATs available for memory containing data (known as DBATs). Each BAT is 64 bits (double-word) in length and is split into upper 32 bit and lower 32 bit portions.

Any area or region of memory that a BAT is responsible for is known as a Block. Most BATs are configured shortly after the game is booted, and a final BAT (DBAT3 for the Locked Cache) is configured after the bootstrap screen. From my limited testing, it appears all Wii games set up the BATs in the same universal way.

Here's a picture of the BATs while having the emulation randomly paused in MKWii at the Main Menu

[Image: bat.png]

NOTE: Older Dolphin Dev Versions (such as 11xxx series and earlier) will not show proper values for the BATs! Use a newer version!

Here's a list of SPR numbers for all the BATs. The list is formatted to directly be placed into an ASM Code compiler (i.e. PyiiASMH)

Code:
.set IBAT0U, 528
.set IBAT0L, 529
.set IBAT1U, 530
.set IBAT1L, 531
.set IBAT2U, 532
.set IBAT2L, 533
.set IBAT3U, 534
.set IBAT3L, 535
.set IBAT4U, 560
.set IBAT4L, 561
.set IBAT5U, 562
.set IBAT5L, 563
.set IBAT6U, 564
.set IBAT6L, 565
.set IBAT7U, 566
.set IBAT7L, 567
.set DBAT0U, 536
.set DBAT0L, 537
.set DBAT1U, 538
.set DBAT1L, 539
.set DBAT2U, 540
.set DBAT2L, 541
.set DBAT3U, 542
.set DBAT3L, 543
.set DBAT4U, 568
.set DBAT4L, 569
.set DBAT5U, 570
.set DBAT5L, 571
.set DBAT6U, 572
.set DBAT6L, 573
.set DBAT7U, 574
.set DBAT7L, 575
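
The SPR numbers follow a simple pattern, so a few lines of Python can regenerate the list as a sanity check. This is only a sketch, and the function name `bat_spr` is made up for illustration. BATs 0 thru 3 start at SPR 528 (IBATs) or 536 (DBATs), BATs 4 thru 7 start at SPR 560 (IBATs) or 568 (DBATs), and each lower half sits right after its upper half.

```python
def bat_spr(kind, index, half):
    """Return the SPR number for one half of a BAT register.

    kind:  "I" for an IBAT, "D" for a DBAT
    index: BAT number, 0 thru 7
    half:  "U" for the upper 32 bits, "L" for the lower 32 bits
    """
    if kind not in ("I", "D") or half not in ("U", "L") or not 0 <= index <= 7:
        raise ValueError("bad BAT selector")
    # BATs 0-3 start at SPR 528; the extra BATs 4-7 start at SPR 560.
    base = 528 if index < 4 else 560
    # The DBATs sit 8 SPR numbers above the matching IBATs.
    if kind == "D":
        base += 8
    # Two SPRs per BAT: upper half first, lower half second.
    return base + 2 * (index % 4) + (1 if half == "L" else 0)


# Print the whole list in the same ".set" format used above.
for kind in "ID":
    for index in range(8):
        for half in "UL":
            print(f".set {kind}BAT{index}{half}, {bat_spr(kind, index, half)}")
```

With this pattern the last entry, DBAT7L, comes out as 575.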



Chapter 2: BAT Structure Upper 32 bits

Upper Portion:
  • Bits 0 - 14: BEPI
  • Bits 15 - 18: Unused
  • Bits 19 - 29: BL
  • Bit 30: Vs
  • Bit 31: Vp

BEPI: Block Effective Page Index. This is simply the beginning Virtual Address that you choose to use for the translation. If desired, you can set this to exactly match the Physical Address (for Wii games that is only done for what is called the Locked Cache). The ending Virtual Address is determined by the memory block size in the BL field.

BL: Block Length. This sets the amount (region/block) of Memory that the BAT will cover using BEPI as the starting address. Use the bit map below to determine the size.

BL Bit Map:
  • 128KB = 000 0000 0000
  • 256KB = 000 0000 0001
  • 512KB = 000 0000 0011
  • 1MB = 000 0000 0111
  • 2MB = 000 0000 1111
  • 4MB = 000 0001 1111
  • 8MB = 000 0011 1111
  • 16MB = 000 0111 1111
  • 32MB = 000 1111 1111
  • 64MB = 001 1111 1111
  • 128MB = 011 1111 1111
  • 256MB = 111 1111 1111

Setting an invalid bit combo will corrupt the BAT and cause undefined behavior. If the exact size you want is not listed, then you need to round up, or split the size and use multiple BATs.

Vs: Supervisor Valid Bit. If you are in Supervisor Mode (Machine State Register PR Bit Low), this bit will determine whether or not you are allowed to access that memory.

Vp: User Valid Bit. If you are in User Mode (Machine State Register PR bit High), this bit will determine whether or not you are allowed to access that memory.

Cheat Sheet for Upper Portion of BAT if using Vs & Vp as both set high.
XXXXzzzz

XXXX = Upper 16 bits of the starting Virtual Address (i.e. 0x8000 represents 0x80000000)
zzzz = BL + Vs&Vp combo (whenever Vs and Vp are set high)

zzzz value map (assumes Vs and Vp are set high)
  • 0003 = 128KB
  • 0007 = 256KB
  • 000F = 512KB
  • 001F = 1MB
  • 003F = 2MB
  • 007F = 4MB
  • 00FF = 8MB
  • 01FF = 16MB
  • 03FF = 32MB
  • 07FF = 64MB
  • 0FFF = 128MB
  • 1FFF = 256MB

Here's an example BAT Upper 32-bit Value using 8MB and 0x81000000 as the virtual start address; Vs and Vp both set high
0x810000FF

Btw here's a handy memory size cheat sheet to help calculate the zzzz values. Take the end address minus the start address, then add 1. All calculations are in Hex, obviously.

Example: (0x80FFFFFF - 0x80000000) + 1 = 0x01000000  (size of 16MB; use zzzz value of 01FF)
  • 0x10000000 = 256MB
  • 0x08000000 = 128MB
  • 0x04000000 = 64MB
  • 0x02000000 = 32MB
  • 0x01000000 = 16MB
  • 0x00800000 = 8MB
  • 0x00400000 = 4MB
  • 0x00200000 = 2MB
  • 0x00100000 = 1MB (1,024KB actual)
  • 0x00080000 = 512KB
  • 0x00040000 = 256KB
  • 0x00020000 = 128KB
  • 0x00010000 = 64KB
  • 0x00008000 = 32KB
  • 0x00004000 = 16KB
  • 0x00002000 = 8KB
  • 0x00001000 = 4KB
  • 0x00000800 = 2KB
  • 0x00000400 = 1KB (1,024 bytes actual; 0x400 in hex is 1,024 in decimal)
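
If you would rather not do the bit math by hand, the upper 32 bits can be calculated with a short Python helper. This is a rough sketch, not an official tool: the name `bat_upper` and its parameters are made up for illustration. It follows the bit layout from this chapter (BEPI in bits 0 - 14, BL in bits 19 - 29, Vs in bit 30, Vp in bit 31). The alignment check is my own addition based on how BEPI and BL combine on PowerPC; the text above does not spell it out.

```python
KB128 = 128 * 1024  # smallest block a BAT can cover


def bat_upper(virtual_start, size_bytes, vs=True, vp=True):
    """Build the upper 32 bits of a BAT from a start address and block size."""
    blocks = size_bytes // KB128
    # Valid sizes are 128KB times a power of two, from 128KB up to 256MB.
    if size_bytes % KB128 or not 1 <= blocks <= 2048 or blocks & (blocks - 1):
        raise ValueError("size must be 128KB, 256KB, 512KB, ... up to 256MB")
    # Assumption: the start address has to be a multiple of the block size.
    if virtual_start % size_bytes:
        raise ValueError("start address must be aligned to the block size")
    bl = blocks - 1  # 128KB -> 0b000_0000_0000, 256MB -> 0b111_1111_1111
    return (virtual_start & 0xFFFE0000) | (bl << 2) | (int(vs) << 1) | int(vp)


# The 8MB example from above: virtual start 0x81000000, Vs and Vp high.
print(f"0x{bat_upper(0x81000000, 8 * 1024 * 1024):08X}")  # 0x810000FF

# Size taken straight from the cheat sheet: (end - start) + 1
size = (0x80FFFFFF - 0x80000000) + 1
print(f"0x{bat_upper(0x80000000, size):08X}")  # 0x800001FF
```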



Chapter 3: BAT Structure Lower 32 bits

Bits 0 - 14: BRPN
Bits 15 - 24: Unused
Bit 25: W
Bit 26: I
Bit 27: M
Bit 28: G
Bit 29: Unused
Bits 30 & 31: PP

BRPN: Block Register Page Number. This is the physical address that is used in conjunction with your Virtual Address. This must be a legit physical address.

W: Write Through. If this bit is high, any store operations to cached memory are also written to physical memory, think of this like a dcbst instruction. If the bit is low, this is known as 'write-back'. In Write-Back mode, when store operations update cached memory, they do not instantly update physical memory.

I: Cache-Inhibited. If this bit is high, no cache-ing of any kind shall occur for the specified block of memory. Blocks of memory that include access to something such as an I/O device should have the I bit set high.

M: Memory coherence: When this bit is high, other devices or processors will be notified whenever the specified region of memory is accessed. Considering Broadway runs on 'its own' and doesn't need to notify other processors, M bit is set low on every BAT by every Wii menu, channel, game, etc.

G: Guarded. This bit should be high if there are gaps (missing memory) in the specified memory block. This bit should also be set high for any memory that is not "well behaved", meaning the block includes access to something such as an I/O device. I'm not sure on this, but you may also need the G bit high if you had to round up the BL bits, since that would technically cause gaps in the memory block. The G bit should also be set high if you want to stop out-of-order operations (i.e. loads, fetches, etc). Fyi, stores on Broadway are never done out-of-order in relation to each other. Finally, Store Gathering is automatically disabled (regardless of the SGE bit value in HID0) whenever the G bit is high.

PP: Page Protection. Responsible for allowing access to the block of memory. Think of it like a firewall. See bit map below.

PP bit map:
  • 00 = No access (can't read or write)
  • x1 = Read only (x = don't care value)
  • 10 = Read & Write

There is no available bit combo for Write only!

If you are wanting a block of memory to allow full access in any 'situation', set both the Vs and Vp bits high in the upper portion of the BAT, and then set the PP bit combo to 10 for the lower portion.

Here's an example of a BAT with its lower portion using a physical start address 0x01000000 with WIMG all low, and PP set to 10. We will use the upper portion example from Chapter 2 that used Virtual Start Address 0x81000000. The BAT will have Vs+Vp high with PP as 10 which means both user and supervisor have read+write access to the block of memory

BAT value: 0x810000FF 01000002

Virtual Start Address: 0x81000000
Block Length: 8MB
Vs and Vp high
Physical Start Address: 0x01000000
WIMG all low
PP = 10 (0x2); Read & Write

IMPORTANT NOTEs: The IBATs can NEVER have the W and/or G bit set high. Doing so will corrupt the IBAT. Undefined behavior will occur. Also, regarding DBATs, whenever I bit is high, there is zero need to set W bit high. Having both W and I bit high (only applicable to DBATs) is considered as an invalid combo and should not be used. If you want to set any DBAT to simply be a representative of Physical Memory (but with the ability to use it Virtually), set I and G bits high, with W low.
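
The lower 32 bits can be built the same way. Below is a small Python sketch (the name `bat_lower` and its keyword arguments are made up for illustration) that follows the bit layout from this chapter: BRPN in bits 0 - 14, WIMG in bits 25 - 28, PP in bits 30 & 31. It also refuses the W + I combo described in the notes above. It does not know whether the value is meant for an IBAT or a DBAT, so the "no W or G on IBATs" rule is still on you.

```python
def bat_lower(physical_start, w=False, i=False, m=False, g=False, pp=0b10):
    """Build the lower 32 bits of a BAT (BRPN + WIMG + PP)."""
    if physical_start & 0x1FFFF or not 0 <= physical_start <= 0xFFFE0000:
        raise ValueError("physical start must be a 32-bit, 128KB-aligned address")
    if pp not in (0b00, 0b01, 0b10, 0b11):
        raise ValueError("PP is a 2-bit field")
    if w and i:
        raise ValueError("W and I both high is an invalid combo")
    # W = 0x40, I = 0x20, M = 0x10, G = 0x08 within the lower 32 bits.
    wimg = (int(w) << 3) | (int(i) << 2) | (int(m) << 1) | int(g)
    return physical_start | (wimg << 3) | pp


# Example from this chapter: physical 0x01000000, WIMG all low, PP = 10
print(f"0x{bat_lower(0x01000000):08X}")  # 0x01000002

# Cache-Inhibited + Guarded, PP = 10, physical 0x00000000
print(f"0x{bat_lower(0x00000000, i=True, g=True):08X}")  # 0x0000002A
```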



Chapter 4: Getting In & Out of Real Mode

You must be in Real Mode when doing any invalidations/modifications to the BATs. Real Mode simply means you are executing in Physical Memory that is NOT part of an Exception. However, you do NOT need to be in Real Mode if you are simply copying BAT Register Data to the GPRs (maybe for something such as a Debug Report type code/source). Also, interrupts need to be disabled the entire time you are invalidating and/or modifying the BATs.

Example Routine to get into Physical Mode~

Code:
mfmsr r3 #Backup original MSR to some safe place if necessary
rlwinm r4, r3, 0, 17, 15
mtmsr r4

bl get_pc #Get Program counter
get_pc:
mflr r3

margin = real_mode - get_pc

addi r3, r3, margin #This points to instruction right after rfi
clrlwi r3, r3, 1 #Change address to physical
mtspr srr0, r3 #Place physical address into srr0
rlwinm r4, r4, 0, 28, 25 #Flip off IR and DR bits
mtspr srr1, r4 #Place updated MSR (held in r4) into srr1
rfi #Go into real mode

real_mode:


Example Routine to get back into Virtual Mode~

Code:
bl get_pc #Get Program Counter
get_pc:
mflr r4

margin = virtual_mode - get_pc

mfmsr r3 #Get MSR
ori r3, r3, 0x8030 #Flip IR and DR bits high, turn back on Interrupts
mtspr srr1, r3 #Place updated MSR into srr1
addi r3, r4, margin #Points to instruction right after rfi
oris r3, r3, 0x8000 #Change address to Virtual, mem80 used for this example
mtspr srr0, r3 #Place virtual address into srr0
rfi #End real mode

virtual_mode:

Alternatively instead of placing the address in srr0 and MSR into srr1, you could do something like this (do NOT use this for Exceptions btw)...

Code:
lis r0, 0xXXXX #Will be your physical or virtual address
ori r0, r0, 0xXXXX
mtlr r0 #or mtctr if you want to use the CTR
mfmsr r0
rlwinm / ori using r0 #Depends on if you are setting IR+DR high or low
mtmsr r0
isync** #This is NEEDED!
blr #or bctr if you chose to use the CTR

**isync is needed to update the instruction context. If isync isn't used (remember Broadway is an out-of-order execution-type CPU), Broadway may fetch and execute instructions meant for virtual mode as real mode, and vice versa. The earlier two sources that use rfi do not need an isync because rfi, by design, is instruction synchronizing.
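
The rlwinm and clrlwi instructions in the routines above are just AND masks with no rotation. If the MB/ME numbers look cryptic, this Python sketch shows which bits each one keeps (the function name `rlwinm_mask` is made up for illustration). Bits are numbered the PowerPC way, so bit 0 is the most significant bit. The MSR values line up with the 0x8030 used in the ori instruction above: 0x8000 is EE, 0x0020 is IR, 0x0010 is DR.

```python
def rlwinm_mask(mb, me):
    """Return the 32-bit mask that rlwinm builds from MB and ME.

    When MB is greater than ME the mask wraps around the end of the word,
    which is how these routines clear one or two bits in the middle.
    """
    if not (0 <= mb <= 31 and 0 <= me <= 31):
        raise ValueError("MB and ME must be 0 thru 31")
    mask = 0
    n = mb
    while True:
        mask |= 1 << (31 - n)  # PowerPC bit n, counted from the left
        if n == me:
            break
        n = (n + 1) % 32
    return mask


# rlwinm rD, rA, 0, 17, 15 keeps everything except bit 16 (EE)
print(f"{rlwinm_mask(17, 15):08X}")  # FFFF7FFF

# rlwinm rD, rA, 0, 28, 25 keeps everything except bits 26 & 27 (IR, DR)
print(f"{rlwinm_mask(28, 25):08X}")  # FFFFFFCF

# clrlwi rD, rA, 1 is the same as rlwinm rD, rA, 0, 1, 31
print(f"{0x80001234 & rlwinm_mask(1, 31):08X}")  # 00001234
```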



Chapter 5: Invalidating BATs

To modify a BAT, it MUST be invalidated first. Also when Broadway is powered on, the BATs are in an unknown state. All BATs should be marked invalid before configuring them (necessary for something such as writing a boot sequence code).

To invalidate a BAT, both Vs and Vp bits of the Upper Portion must be set low. Super simple. The quickest way to invalidate a BAT is this...

Code:
li rX, 0
mtspr ZZZU, rX

rX = Safe GPR to use for your code/source
ZZZ = the SPR number of the BAT
U = reminding you the write needs to be done to the UPPER portion

Any modification/invalidation to an IBAT requires an isync instruction AFTERWARDS (Reference: PowerPC Microprocessor Family: The Programming Environments, table 2-22 (page 2-42))

Any modification/invalidation to a DBAT requires isync instructions BEFORE and AFTER (Reference: PowerPC Microprocessor Family: The Programming Environments, table 2-23 (page 2-43))

If you are invalidating a group of IBATS or DBATS, you do not need an isync after every single individual BAT invalidation.

Example: Invalidate DBATS 2 thru 4~

Code:
li r0, 0
isync
mtspr DBAT2U, r0
mtspr DBAT3U, r0
mtspr DBAT4U, r0
isync

Since IBAT invalidation doesn't require an isync beforehand, if you are writing a boot sequence code, you can invalidate all BATs simply like this...

Code:
li r0, 0
mtspr IBAT0U, r0
mtspr IBAT1U, r0
mtspr IBAT2U, r0
mtspr IBAT3U, r0
mtspr IBAT4U, r0
mtspr IBAT5U, r0
mtspr IBAT6U, r0
mtspr IBAT7U, r0
isync
mtspr DBAT0U, r0
mtspr DBAT1U, r0
mtspr DBAT2U, r0
mtspr DBAT3U, r0
mtspr DBAT4U, r0
mtspr DBAT5U, r0
mtspr DBAT6U, r0
mtspr DBAT7U, r0
isync



Chapter 6: Modifying BATs Correctly

Assuming you have invalidated a BAT correctly, you can now modify it. The Lower 32-bits of the BAT MUST ALWAYS BE WRITTEN FIRST! Another rule is that you can only modify one BAT at a time.

Example: Configure IBAT5 to have virtual address 0x80000000 represent physical address 0x00000000. 0x80000000 thru 0x80FFFFFF is 16MB in total size, which is a perfect size match for setting the BL bits. Vs, Vp both High. PP = 10 (0x2). WIMG is all low.

Code:
#Pretend the IBAT was properly invalidated some time ago. Fyi no isync required beforehand since this is an IBAT
#Set r0 as lower portion
li r0, 2

#Set r3 as upper portion
lis r3, 0x8000
ori r3, r3, 0x01FF

#Write lower portion
mtspr IBAT5L, r0

#Write upper portion
mtspr IBAT5U, r3

#isync required now
isync

One more example: set one BAT (DBAT2) to cover Virtual Uncached Mem2 (the uncached mirror of mem9, starting at 0xD0000000). Its full size is 64MB, which is once again a perfect size match for the BL bits. Assume DBAT2 was properly invalidated beforehand.

Code:
#W = low
#I = high! (cache must be blocked)
#M = low
#G = high! (prevent out-of-order operations)
#PP = 10 (read & write)
lis r0, 0x1000
ori r0, r0, 0x002A

#Upper Portion config
#Virtual Start Addr = 0xD0000000
#Block size = 64MB
#Vs and Vp both set high
lis r3, 0xD000
ori r3, r3, 0x07FF

#Modify the BAT
isync
mtspr DBAT2L, r0
mtspr DBAT2U, r3
isync
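
To double check a pair of values before writing them to a BAT, you can decode them back into their fields. Here's a small Python sketch for that (the name `decode_bat` and the dictionary keys are made up for illustration). It uses the bit layouts from Chapters 2 and 3 and does no validity checking of its own.

```python
def decode_bat(upper, lower):
    """Split a BAT's upper and lower 32-bit values back into their fields."""
    bl = (upper >> 2) & 0x7FF  # Block Length field, bits 19 - 29
    return {
        "virtual_start": upper & 0xFFFE0000,          # BEPI, bits 0 - 14
        "size_bytes": (bl + 1) * 128 * 1024,          # 128KB per BL step
        "vs": bool(upper & 0x2),                      # bit 30
        "vp": bool(upper & 0x1),                      # bit 31
        "physical_start": lower & 0xFFFE0000,         # BRPN, bits 0 - 14
        "wimg": format((lower >> 3) & 0xF, "04b"),    # bits 25 - 28
        "pp": format(lower & 0x3, "02b"),             # bits 30 & 31
    }


# Values built by the DBAT2 example above
fields = decode_bat(0xD00007FF, 0x1000002A)
print(f"virtual  0x{fields['virtual_start']:08X}")    # virtual  0xD0000000
print(f"physical 0x{fields['physical_start']:08X}")   # physical 0x10000000
print(f"size     {fields['size_bytes'] // (1024 * 1024)}MB")  # size     64MB
print(f"WIMG     {fields['wimg']}  PP {fields['pp']}")  # WIMG     0101  PP 10
```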



Chapter 7: More Rules to Follow

You cannot have multiple BATs (for the same BAT type: instruction or data) that do a translation to the same physical address.

Example: You have IBAT0 translate 0x80000000 to physical 0x00000000 and IBAT6 translate 0x70000000 to physical 0x00000000. That's a no-go.

Also, you cannot have memory blocks overlap from multiple BATs (for the same BAT type: instruction or data).

Example: You have DBAT2 for 0x90000000 using a block length of 64MB (so this covers all of mem9, which is 0x90000000 thru 0x93FFFFFF). You then also have DBAT4 for 0x93000000 using a block length of 16MB (so this covers just 0x93000000 thru 0x93FFFFFF). As a result, memory addresses 0x93000000 thru 0x93FFFFFF are part of both DBAT2 and DBAT4. That's also a no-go.
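
An overlap like the one in the example is easy to miss by eye. A quick Python check (just a sketch, the function name `blocks_overlap` is made up for illustration) compares two blocks by their virtual start address and size in bytes:

```python
def blocks_overlap(start_a, size_a, start_b, size_b):
    """Return True if two virtual memory blocks share at least one address."""
    return start_a < start_b + size_b and start_b < start_a + size_a


MB = 1024 * 1024

# DBAT2: 0x90000000, 64MB  vs  DBAT4: 0x93000000, 16MB (the bad example above)
print(blocks_overlap(0x90000000, 64 * MB, 0x93000000, 16 * MB))  # True

# 0x80000000, 16MB  vs  0x81000000, 8MB: back to back, no overlap
print(blocks_overlap(0x80000000, 16 * MB, 0x81000000, 8 * MB))  # False
```

Only compare BATs of the same type with this (IBAT vs IBAT, or DBAT vs DBAT).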

Whenever BATs are incorrectly invalidated and/or modified, the processor may enter into what is known as a Checkstop Condition. In simple terms, the processor halts without any Exception Routine being taken.



Chapter 8: Hardware Register Notes

By default (Broadway from a powered-on state), IBATs & DBATs 4 thru 7 are disabled. To enable these extra BATs, you need to set Bit 6 (SBE) of HID4 high. From my limited testing, it appears all Wii applications that make modifications to HID4 will always set/keep this bit high.

All BAT M bits (for instruction fetching) can be overridden by setting bit 23 (IFEM) low on HID0. From personal limited testing, it appears all Wii applications set/keep this bit low.

Please note that the I bit setting in a BAT will override the ICE, DCE, ILOCK, & DLOCK bits of HID0!!! (reference: Broadway User Manual pages 136 & 137)
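
Keep in mind that the bit numbers in this chapter use PowerPC numbering, where bit 0 is the most significant bit. If you need the actual mask value for an oris/ori/andi. type instruction, this tiny Python sketch does the conversion (the name `ppc_bit` and the constant names are made up for illustration):

```python
def ppc_bit(n, width=32):
    """Convert a PowerPC bit number (bit 0 = most significant) to a mask."""
    if not 0 <= n < width:
        raise ValueError("bit number out of range")
    return 1 << (width - 1 - n)


HID4_SBE = ppc_bit(6)    # Bit 6 of HID4, enables BATs 4 thru 7
HID0_IFEM = ppc_bit(23)  # Bit 23 of HID0

print(f"HID4 SBE  mask = 0x{HID4_SBE:08X}")   # HID4 SBE  mask = 0x02000000
print(f"HID0 IFEM mask = 0x{HID0_IFEM:08X}")  # HID0 IFEM mask = 0x00000100
```

The same conversion explains the BAT fields from earlier: Vs (bit 30) is 0x2 and Vp (bit 31) is 0x1.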



Chapter 9: Example Code of Messing Around with the BATs

Here is a code I've cobbled together really quick that will run a couple of instructions in 0x50XXXXXX memory. This will most likely crash on most versions of Dolphin, because Dolphin is Dolphin. The code runs fine on real Hardware.

If Dolphin does work for you then you will see (by stepping thru the code) that you will arrive at 0x50XXXXXX memory to execute two instructions. They set r4 to 0, then increment it by 1. In Dolphin's code view it will say something like.... "No RAM Contents Here". You won't be able to see the instructions in the Code View, but as you are stepping, you will see r4 get the proper updates in the Register tab/window.

Code is hooked at the old historical Shared Item Code Address (MKWii game ofc, credits to Guru for original Shared Item Code). Hit an item box (or let a CPU hit one) for the code to execute.

Btw to know that the code works on real Hardware, a Fib is set to be received from the Item Box when you (or the CPU) hits said item box.

PAL
C27BA164 0000001C
7D8000A6 5583045E
7C600124 48000005
7D6802A6 386B0020
5463007E 7C7A03A6
7C6000A6 54630732
7C7B03A6 4C000064
38000000 7C1083A6
4C00012C 38000002
3C605000 606301FF
7C1183A6 7C7083A6
4C00012C 7C6000A6
60630030 7C7B03A6
5563007E 64635000
38630064 7C7A03A6
4C000064 38800000
38840001 386B0088
5463013E 7C7A03A6
7C6000A6 54630732
7C7B03A6 4C000064
38000000 7C1083A6
4C00012C 38000002
3C608000 606301FF
7C1183A6 7C7083A6
4C00012C 7D9B03A6
5563013E 64638000
386300C4 7C7A03A6
4C000064 38600003
90770020 00000000

Code:
#START ASSEMBLY

#Address
#PAL = 807BA164

#Register Notes
#r0, r3, r4, & LR are safe

#Disable interrupts
mfmsr r12
rlwinm r3, r12, 0, 17, 15
mtmsr r3

#Go into real Mode
bl get_pc #Get Program counter
get_pc:
mflr r11 #Will need this value later to get back to Virtual Mode

margin = real_mode - get_pc

addi r3, r11, margin #This points to instruction right after rfi
clrlwi r3, r3, 1 #Change address to physical
mtspr srr0, r3 #Place physical address into srr0
mfmsr r3 #Get MSR
rlwinm r3, r3, 0, 28, 25 #Flip off IR and DR bits
mtspr srr1, r3 #Place updated MSR into srr1
rfi #Go into real mode

#############################

real_mode:

#Invalidate IBAT0 (Vs and Vp of Upper portion must be low)
li r0, 0
mtibatu 0, r0
isync

#Set r0 as lower portion
li r0, 2 #PP bits as 10, WIMG all low

#Set r3 as upper portion
lis r3, 0x5000 #Start block at EA 0x50000000
ori r3, r3, 0x01FF #Size of 16MB

#Modify the IBAT!, lower portion (r0) must be written first
mtibatl 0, r0
mtibatu 0, r3
isync

#Go to Virtual Mode
virt_margin = virtual_mode - get_pc

#Turn back on IR and DR, do not turn on EE just yet!
mfmsr r3
ori r3, r3, 0x0030
mtspr srr1, r3

#Clear bit 0 of r11
clrlwi r3, r11, 1

#OR it with 0x5000
oris r3, r3, 0x5000
addi r3, r3, virt_margin
mtspr srr0, r3
rfi

#############################

#We are now executing in 0x50xxxxxx memory!!!
virtual_mode:

#Execute random instructions to prove to Dolphin User this is working
li r4, 0
addi r4, r4, 1

#Go back into real mode again
second_phys_margin = second_real_mode - get_pc

addi r3, r11, second_phys_margin
clrlwi r3, r3, 4 #Change address to physical
mtspr srr0, r3 #Place physical address into srr0
mfmsr r3 #Get MSR
rlwinm r3, r3, 0, 28, 25 #Flip off IR and DR bits
mtspr srr1, r3 #Place updated MSR into srr1
rfi #Go into real mode

#############################

second_real_mode:

#Invalidate IBAT0 again
li r0, 0
mtibatu 0, r0
isync

#Set r0 as lower portion
li r0, 2

#Set r3 as upper portion
lis r3, 0x8000
ori r3, r3, 0x01FF

#Modify the IBAT!, lower portion (r0) must be written first
mtibatl 0, r0
mtibatu 0, r3
isync

#Go to Virtual Mode
second_virt_margin = second_virtual_mode - get_pc

#Restore very first original MSR (r12)
mtspr srr1, r12

#Clear bits 0 thru 3 of r11
clrlwi r3, r11, 4

#OR it with 0x8000
oris r3, r3, 0x8000
addi r3, r3, second_virt_margin
mtspr srr0, r3
rfi

#############################

second_virtual_mode:
#Set fib
li r3, 3

#Default/Original Instruction
stw r3, 0x0020 (r23)



Chapter 10: Example Source of Setting Up all BATs for Wii; Conclusion

Here is a template you can use to invalidate and then properly set up all BATs that the Wii needs to use. It assumes you are already in Real Mode and that r0 & r3 are safe to use.

Code:
#Invalidate all BATs
li r0, 0
mtspr IBAT0U, r0
mtspr IBAT1U, r0
mtspr IBAT2U, r0
mtspr IBAT3U, r0
mtspr IBAT4U, r0
mtspr IBAT5U, r0
mtspr IBAT6U, r0
mtspr IBAT7U, r0
isync
mtspr DBAT0U, r0
mtspr DBAT1U, r0
mtspr DBAT2U, r0
mtspr DBAT3U, r0
mtspr DBAT4U, r0
mtspr DBAT5U, r0
mtspr DBAT6U, r0
mtspr DBAT7U, r0
isync

#Setup BATs, all BATs will be enabled read+write for both user+supervisor

#Setup IBATs for...
#Cached Mem80; 16MB; WIMG 0000
#Cached Mem81; 8MB; WIMG 0000
#Cached Mem9; 64MB; WIMG 0000

#Setup DBATs for...
#Cached Mem80; 16MB; WIMG 0000
#Cached Mem81; 8MB; WIMG 0000
#Cached Mem9; 64MB; WIMG 0000
#Uncached Mem80, Mem81, & Hardware Memory (all of MemC); 256MB; WIMG 0101
#Uncached Mem9 (MemD); 64MB; WIMG 0101
#Locked Cache (MemE); 128KB (minimum BAT size); WIMG 0000

#First 16MB of Cache MEM1; WIMG 0000
lis r0, 0x8000
ori r0, r0, 0x01FF
li r3, 0x0002
mtspr IBAT0L, r3
mtspr IBAT0U, r0
isync
mtspr DBAT0L, r3
mtspr DBAT0U, r0
isync
   
#Second 8MB of Cache MEM1; WIMG 0000
lis r0, 0x8100
ori r0, r0, 0x00FF
#Change 0x00000002 to 0x01000002
oris r3, r3, 0x0100
mtspr IBAT2L, r3
mtspr IBAT2U, r0
isync
mtspr DBAT2L, r3
mtspr DBAT2U, r0
isync
   
#64MB of Cache MEM2; WIMG 0000
lis r0, 0x9000
ori r0, r0, 0x07FF
lis r3, 0x1000
ori r3, r3, 0x0002
mtspr IBAT4L, r3
mtspr IBAT4U, r0
isync
mtspr DBAT4L, r3
mtspr DBAT4U, r0
isync
   
#64MB of Uncache MEM2; WIMG 0101 (Cache Inhibited, Guarded)
#Change 0x900007FF to 0xD00007FF
oris r0, r0, 0xD000
ori r3, r3, 0x002A
mtspr DBAT5L, r3
mtspr DBAT5U, r0
isync
   
#Uncache MEM1 & Hardware Memory; WIMG 0101 (Cache Inhibited, Guarded)
#Lower 32-bits are 0x0000002A (physical start address 0x00000000)
lis r0, 0xC000
ori r0, r0, 0x1FFF
li r3, 0x002A
mtspr DBAT1L, r3
mtspr DBAT1U, r0
isync
   
#DBAT for the Locked Cache (identical translation used)
lis r0, 0xE000 #Virtual Addr is same as Physical Addr
ori r0, r0, 0x0003 #Locked Cache is 16KB in size; set usage by both User and Supervisor
lis r3, 0xE000
ori r3, r3, 0x0002
mtspr DBAT3L, r3
mtspr DBAT3U, r0
isync

The source uses certain specific BATs so it can be compatible with being 'injected' at any point in a Wii Game/Title or in a Boot Sequence. Keep in mind that Wii Games set up the Locked Cache BAT immediately *after* the Strap Screen, which will override the DBAT3 config of the above source if you inject it before that moment. That doesn't really matter, because the Locked Cache BAT (DBAT3) can be any size for 2 reasons: the 16KB Locked Cache is already covered by the bare minimum BAT size, and the size of this BAT can be anything due to the nature of how Locked Cache operates. Locked Cache is a very 'odd' feature of Broadway; for example, it uses a fake Physical Address in the BAT. It's hard to explain why. Maybe one day I'll make a tutorial about that to clear up some confusion.


  Basic GVR Usage in ASM Codes
Posted by: Vega - 05-21-2022, 11:04 PM - Forum: PowerPC Assembly - No Replies

For Beginner ASM Coders.



Chapter 1: Intro; Explaining more about C2 Codes

C2 'Insert ASM' Codes are executed by the Code Handler when its Hook Address is executed by Broadway. For a full refresher regarding this, reread Chapter 3 of this thread -> https://mariokartwii.com/showthread.php?tid=1383

A C2 ASM Code can be executed for a game-specific list/cycle of circumstances. What do I mean by that?

Well for Mario Kart Wii, many addresses that are used by C2 Codes will be executed in a cycle for each player/CPU. An address will be executed for Player 1. Then next time that the game executes said address, it will be for Player 2. Then Player 3, 4, 5, etc etc.

For a game like Dragon Ball Z Budokai Tenkaichi 3 (where a match is typically Player 1 vs Player 2/COM), an address can be executed first for Player 1, then next time it will be for Player 2/COM.

Understanding this is crucial to reaching that 'next level' of ASM Coding. This may not seem important at first, but it will open up a variety of possibilities for your future Codes.



Chapter 2: What is a GVR? Why do values in the GVRs matter?

GVR = Global Variable Register

It is any General Purpose Register that is r14 thru r31. GVRs will contain values such as...
  • Player Number/Slot
  • CPU Number/Slot
  • Course Number/Slot
  • Menu/Mode Type Number/Slot
  • etc

Therefore during the time when your ASM Code is executing, the GVRs may contain data that is relevant to improving your Code. By the way, the term "slot" is preferred by most Coders over the term "number". Regarding Player Slots, for the majority of cases, Slot value of 0 represents Player 1 (1st non-CPU Player).

By figuring out which GVR contains a specific slot value, you can modify your ASM Codes to only execute when said GVR has a certain value. A big issue for new Coders is creating a code and running into an issue where it applies to both the Human Player and the CPU (when the Coder only wants their Code to execute for the Human Player). At this point, if said Coder isn't familiar with GVR data, he/she will have literally zero idea on how to fix their code.



Chapter 3: Let's Make a Code using GVR Data; Part 1

We will use the old historical version of the Mario Kart Wii Shared Item Code (PAL) for demonstrating GVR fundamentals

Code:
C27BA164 00000002
386000WW 90770020
60000000 00000000

This code simply changes what item you receive from the item box when you pick it up. WW = Item to Receive, we will set this to 00 (Green Shell) for this GVR tutorial.

Code:
C27BA164 00000002
38600000 90770020
60000000 00000000

Now let's inspect the source...

Code:
li r3, 0 #Set Item (green shell)
stw r3, 0x0020 (r23) #Default/Original Instruction of Insertion/Hook Address. This stores the Item to Dynamic Memory.

At this moment we have no idea what data is represented in any of the GVRs. If you have MKWii, launch it on Dolphin. If not, be sure to examine the following screenshots closely to follow along. In MKWii, do a normal Offline Race. Simply pick Luigi Circuit (as that is the track I did for the exercise, and this plays an important role to the explanation of the final screenshot in this chapter). Character+Vehicle combo doesn't matter. Sometime before the Race begins, pause the emulation & set an Instruction BP on Address 807BA164 (do NOT have the Shared Item Code equipped!)

[Image: gvr01.png]

Once you have the Breakpoint set go ahead and start the race, but wait at the start line. When the first CPU hits the item box, the BP will break/hit. Note down the r14 thru r31 data that you see.

[Image: gvr02.png]

Once you have that all noted down, while keeping the Breakpoint ON, resume the emulation. It will pause again once another CPU hits an item box. Take a look at r14 thru r31. If any values changed from the previous Instruction BP hit, they will be in Red font. Note down the Red font GVR values.

[Image: gvr03.png]

While still keeping the Breakpoint ON, resume the emulation again. It will pause again after another Item Box hit. Take a look at the previous BP-hit Red font GVR values. How do they differ from the previous BP-hit? Note this all down (including their new current values).

[Image: gvr04.png]

In the above screen shot, there is something that sort of has a pattern to it. It's r25. We see that it has been incrementing (1 to 2 to 4). Fyi this WILL differ slightly on your own test. But you will see that r25 is indeed incrementing.

Go ahead and repeat the process until all CPUs have passed the first row of item boxes. Once they have passed that first row, go ahead and manually pause the emulation to prevent any further item box hits.

When I did this (for providing pics for this tutorial), here were the rest of the screenshots.

[Image: gvr05.png]

[Image: gvr06.png]

You can see that r25 is definitely incrementing. What could it be? Take a moment to think about it. It CANNOT be a slot value, because the odds of it incrementing that many times in a row are extremely low.

Well we know the first CPU to hit the box is VERY likely to be in first place. In fact, said CPU WILL be 1st place unless said CPU somehow missed picking up the very first box in the race.

Thus, we can conclude that r25 represents the Player's/CPU's Position in the Race (at the time when they picked up a box).

To further prove this point, here is the next screenshot (when I did this exercise). All CPUs have already gone past the first set of boxes (on Luigi Circuit), and now we can see from the screenshot that the 1st Place CPU has picked up the first box on the next item box row on the track.

[Image: gvr07.png]

And the screenshot proves to us that r25 is 1, as expected. So now we know for 100% certainty that r25 = Position. 1 = 1st Place, 2 = 2nd Place, etc etc til C = 12th (last) Place.



Chapter 4: Let's Make a Code using GVR Data; Part 2

Awesome, we have some GVR data with which we can make a code. Let's say you only wanted the Shared Item Code to execute when the Player/CPU is in last place (12th). Before anything else executes in this new code we are writing, the 'check' for r25's value needs to come first.

We will use a simple Compare Word Immediate instruction. Since we only want the Code to execute when the Player/CPU is in 12th place, we will check r25's value against the value of 0xC (12).

Code:
cmpwi r25, 0xC #12

We have a comparison instruction written out, now for the Branch instruction. We know that we do NOT want the code to execute if r25 isn't 12, so we will be using a Branch-If-Not-Equal instruction.

Code:
cmpwi r25, 0xC
bne

Since there are 12 different total positions, the odds of r25 NOT being 12 are high, thus we can add a + symbol to the branch instruction to give a small hint to Broadway that the Branch Instruction/Jump will most likely occur.

Code:
cmpwi r25, 0xC
bne+

Now we need a label name. We will simply call it jump_code.

Code:
cmpwi r25, 0xC
bne+ jump_code

Underneath the branch instruction, we will put the instructions from the original Shared Item Code. Therefore if r25 is equal to 0xC (12), it will NOT take the branch jump and execute the instructions directly below.

Code:
cmpwi r25, 0xC #Check if in Last Place
bne+ jump_code #If not, don't execute code

li r3, 0 #Set Green Shell
stw r3, 0x0020 (r23) #Default/original instruction

At this point, you may be thinking to place the branch's jump 'landing spot' label at the very end, and the code is complete. However that would be wrong, we will write it out below and I will explain why that is incorrect to do.

Code:
cmpwi r25, 0xC #Check if in Last Place
bne+ jump_code #If not, don't execute code

li r3, 0 #Set Green Shell
stw r3, 0x0020 (r23) #Default/original instruction

jump_code:

Why is this wrong? Well, what happens if r25 isn't 12 and the branch is taken? You will completely skip over the Default Instruction. We need that instruction, so the r3 value (Item value from Box) will always get stored to Dynamic Memory. With that in mind, we need the 'jump_code:' label right before the Default instruction. Like this...

Code:
cmpwi r25, 0xC #Check if in Last Place
bne+ jump_code #If not, don't execute code

li r3, 0 #Set Green Shell

jump_code:
stw r3, 0x0020 (r23) #Default/original instruction

Now compile that code using Insertion Address (PAL) 0x807BA164

Code:
C27AB704 00000003
2C19000C 40A20008
38600000 90770020
60000000 00000000

We know from earlier that the '38600000' line is the 'li r3, 0' instruction. We can change that back to 386000WW to allow the end user to set a custom Item Value for the Code.

The first instruction of our new code is the cmpwi r25, 0xC. We can also make that compiled line of code end-user fill-able. We will use a P value for that.

PAL Positional Shared Item Code
P = Position Required
WW = Item to Receive

C27AB704 00000003
2C19000P 40A20008
386000WW 90770020
60000000 00000000
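
For anyone who wants to script the fill-in step, here is a minimal Python sketch that substitutes the P and WW placeholders into the template above. The function name and the range checks are my own assumptions for illustration (P limited to 1 thru 0xC, WW limited to one byte); the code lines themselves are copied from the compiled code in this chapter.

```python
def positional_shared_item_code(position, item):
    """Fill the P and WW placeholders of the PAL template shown above.

    position: required place, 1..12 (written as a single hex digit P)
    item:     item value, one byte (written as two hex digits WW)
    """
    if not 0x1 <= position <= 0xC:
        raise ValueError("position must be 1..12 (0x1..0xC)")
    if not 0x00 <= item <= 0xFF:
        raise ValueError("item must fit in one byte")
    return [
        "C27AB704 00000003",
        f"2C19000{position:X} 40A20008",  # cmpwi r25, P ; bne+ over the li
        f"386000{item:02X} 90770020",     # li r3, WW ; stw r3, 0x0020 (r23)
        "60000000 00000000",
    ]


# Last place (0xC) receives item 0x00, same as the compiled code earlier.
for line in positional_shared_item_code(0xC, 0x00):
    print(line)
```

Calling it with 0xC and 0x00 reproduces the compiled code from before the placeholders were added.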


Congrats, you've just remade Star's Positional Shared Item Code! Link - https://mariokartwii.com/showthread.php?tid=536

  No DC after selecting char/vehicle/drift type
Posted by: dirtyfrikandel - 05-19-2022, 08:18 PM - Forum: Code Support / Help / Requests - No Replies

Hi,

When I run this code (https://mariokartwii.com/showthread.php?tid=154) in a friend room, after selecting character/vehicle/drift type, you get the "please wait a moment screen". If my friends are not fast enough, I get disconnected. This happens after roughly 3:30 (3 and a half minutes).
Does anyone know what causes this and if it can be fixed with a code?

Kind regards,
Dirtyfrikandel.

p.s. this forum is awesome! it has helped me a ton with setting up my lan party.

  Broadway SPR Rules + Notes + Mini Guides + Misc Random Stuff
Posted by: Vega - 05-15-2022, 02:37 PM - Forum: Resources and References - No Replies

Broadway SPR Rules + Notes + Mini Guides + Misc Random Stuff

I made this document when I needed a quick reference card for messing around with Broadway's special purpose registers. Might as well share it for anyone who needs it.



Sync and isync are synchronizing instructions. They play a huge role when modifying the Hardware Registers and certain other Special Purpose Registers.

Times when sync or isync are required~
1. sync is required BEFORE & AFTER clearing the L2E bit of the L2CR
2. isync is required BEFORE & AFTER modifying the ICE bit of HID0
3. isync is required BEFORE modifying ILOCK bit of HID0
4. sync is required BEFORE & AFTER modifying DCE bit of HID0**
5. sync is required BEFORE modifying DLOCK bit of HID0
6. isync is required AFTER any mtsr instruction when the segment register affects an Instruction EA
7. isync is required AFTER any mtsrin instruction when the segment register affects an Instruction EA
8. isync is required BEFORE & AFTER any mtsr instruction when the segment register affects a Data EA
9. isync is required BEFORE & AFTER any mtsrin instruction when the segment register affects a Data EA
10. sync is required BEFORE any modification to SDR1
11. isync is required AFTER any modification of SDR1
12. sync is required BEFORE modifying the POW bit of the MSR
13. isync is required AFTER modifying ANY of the following MSR bits....
POW, PR, ME, FP, SE, BE, IR, DR, RI, FE0, FE1
14. sync is required AFTER modifying the L2FM bit of HID4
15. sync is required AFTER reading the DMAQL bit(s) of HID2; you cannot write to these bits btw
16. sync is required AFTER writing the F bit high in the DMAL Register
17. isync is required AFTER writing to the IABR
18. sync is required AFTER writing to the DABR
19. isync is required AFTER modifying an IBAT Register***
20. isync is required BEFORE & AFTER modifying a DBAT Register****
21. isync is required BEFORE a tlbie instruction if the TLB affects Data
22. isync is required AFTER a tlbie instruction if the TLB affects Instructions
23. sync is required AFTER a tlbie instruction if the TLB affects Data
24. tlbsync is required AFTER a TLB invalidation

***You can have multiple IBAT Registers be modified consecutively, with a single isync at the end.
****You can have multiple DBAT Registers be modified consecutively, with a single isync at the beginning & end


References for above isync/sync rules~
1. Broadway User Manual page 319.
2. Broadway User Manual page 137 for the 'BEFORE'. Broadway User Manual page 319 for the 'AFTER'
3. Broadway User Manual pages 60 & 137.
4. Broadway User Manual page 136 for the 'BEFORE'. Regarding the 'AFTER', there is no reference. It is not mentioned in the Broadway manual, but it's a high probability since you are changing how Data Cache is executed.
5. Broadway User Manual pages 60 & 136.
6. PowerPC Microprocessor Family: The Programming Environments, table 2-23 (page 2-43)
7. PowerPC Microprocessor Family: The Programming Environments, table 2-23 (page 2-43)
8. PowerPC Microprocessor Family: The Programming Environments, table 2-22 (page 2-41)
9. PowerPC Microprocessor Family: The Programming Environments, table 2-22 (page 2-41)
10. PowerPC Microprocessor Family: The Programming Environments, table 2-22 (page 2-42)
11. PowerPC Microprocessor Family: The Programming Environments, table 2-22 (page 2-42)
12. Broadway User Manual page 330
13. POW Bit - Broadway User Manual page 330
13. PR Bit - Broadway User Manual page 89
13. All other bits: PowerPC Microprocessor Family: The Programming Environments, tables 2-22 &  2-23 (pages 2-41 & 2-43)
14. Broadway User Manual Page 66
15. Broadway User Manual Page 66
16. Broadway User Manual Page 324
17. Broadway User Manual Page 177
18. No reference. High probability. Since an isync is required for the IABR, it would make sense for a sync to be required for the DABR.
19. PowerPC Microprocessor Family: The Programming Environments, table 2-23 (page 2-43)
20. PowerPC Microprocessor Family: The Programming Environments, table 2-22 (page 2-42)
21. PowerPC Microprocessor Family: The Programming Environments, table 2-22 (page 2-42)
22. PowerPC Microprocessor Family: The Programming Environments, table 2-23 (pages 2-43 & 2-44)
23. PowerPC Microprocessor Family: The Programming Environments, table 2-23 (pages 2-43 & 2-44)
24. Broadway User Manual Page 180

NOTE: From everything I have read so far on the Broadway Manual & other PPC manuals, there is never a circumstance that ever requires a back-to-back sync nor a circumstance that ever requires a back-to-back isync.

Other HID0 bit rules:
1. Never have SGE bit on whenever the Write Pipe is enabled (reference: Broadway User Manual page 326)

HID2 bit rules:
1. The entire Instruction Cache must be disabled then invalidated before modifying the LSQE, PSE, and/or LCE bits. (reference: Broadway User Manual page 64)



HID4 rules:
You cannot modify any HID4 L2CR related bits while L2CR is on!

Bit 0 of HID4 must always be written as high (1), it will always read as 1 (reference: Broadway User Manual page 65).

Once bits 3 & 4 are set, their value cannot be lowered (reference: Broadway User Manual page 66)
Bit 3 & 4 settings:
00 = bus max depth of 2
01 = depth of 3
10 = depth of 4
11 = reserved/unused

Thus if the setting is 10, it cannot be changed at all.
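
As a quick illustration of the bit numbering, here is a small Python sketch that pulls bits 3 & 4 out of a 32-bit HID4 value and applies the "cannot be lowered" rule. It assumes the usual PowerPC convention that bit 0 is the leftmost (most significant) bit. The function names are made up for this example, and the rule check is modelled only from the notes above, not verified against hardware.

```python
# Field value -> bus max depth, from the table above. 0b11 is reserved/unused.
DEPTHS = {0b00: 2, 0b01: 3, 0b10: 4}


def hid4_depth_field(hid4):
    """Return bits 3 & 4 of a 32-bit HID4 value (bit 0 = most significant bit)."""
    return (hid4 >> 27) & 0b11


def depth_write_allowed(old_field, new_field):
    """Assumed rule from the notes: the field may only stay the same or go up."""
    return new_field in DEPTHS and new_field >= old_field


hid4 = 0x80000000 | (0b10 << 27)      # bit 0 high, bits 3 & 4 = 10
field = hid4_depth_field(hid4)
print(bin(field), DEPTHS[field])      # field and its bus max depth
print(depth_write_allowed(field, 0b01))
```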

The L2FM field (bits 1 & 2) follows this same rule. It also cannot be lowered once set (reference: Broadway User Manual page 66)





L2CR Mini Guides:
NOTE: Interrupts should always be masked (disabled) during any L2CR operations.

The following guides are referenced using Section 9.1.3 (Page 318) and Section 9.1.4 (Page 319) of the Broadway User Manual.

Guide for L2 Cache Global Invalidation:
*Interrupts must be masked (disabled) throughout this entire process
*DPM bit of HID0 must be low throughout this entire process
1. Disable the L2CR by setting the L2E bit low
2. Initiate the Global Invalidation by setting the L2I bit high #Steps 1 and 2 MUST be done separately!
3. Run a Loop that constantly checks the L2IP bit. Once that bit is low, the Invalidation has been completed.

Guide to Initialize L2CR:
*Interrupts must be masked (disabled) throughout this entire process
*DPM bit of HID0 must be low throughout this entire process
1. Globally Invalidate the L2 Cache (see above guide)
2. Disable L1 instruction cache of HID0
3. Turn on the L2CR by setting the L2E bit high
4. Restore L1 instruction cache of HID0

Guide to Configure L2CR:
*Interrupts must be masked (disabled) throughout this entire process
*DPM bit of HID0 must be low throughout this entire process
1. Turn on the L2CR (see above guide)
2. Disable L1 instruction cache of HID0
3. Set L2CR L2DO bit high #example bit, can be a diff config bit of your choice
4. Restore L1 instruction cache of HID0

Guide to Turn off L2CR:
*Interrupts must be masked (disabled) throughout this entire process
1. Simply set the L2E bit low
*If you plan on invalidating or re-enabling L2CR after this, DPM bit in HID0 must be low before you start the invalidation or before you re-enable.



Guide to go into Reduced Power Mode (doze, nap, sleep)
*Interrupts must be ON! for the entire guide
1. Set Desired Power Mode bit high on HID0.
2. Flip POW bit high in the MSR (don't forget your sync before and isync after!)
3. Broadway will enter new power mode in a few clock cycles

Reference: Chapter 10.2 of the Broadway User Manual (pages 327 thru 330)

Reduced Power Mode options:
Doze: Time Base & Decrementer still work
Nap: Time Base & Decrementer still work
Sleep: Time Base & Decrementer do NOT work

You can get out of Doze and Nap mode by setting the Decrementer to the desired value before going into said power mode. Once Decrementer goes below 0, the Decrementer exception will run. Be sure to write some custom code in the Decrementer Exception to get you back to full power mode and back to normal operations.



Miscellaneous fun fact~

While in supervisor mode (MSR: PR bit low) any mtspr instruction involving HID1 or PVR will execute as a nop (reference Broadway User Manual page 90).

  Utilizing the Condition Register
Posted by: Vega - 05-15-2022, 01:07 PM - Forum: PowerPC Assembly - No Replies

Utilizing the Condition Register

For Advanced ASM Coders



Chapter 1: Intro

Requirements:
  • Understand the basics of using compare and branch instructions
  • Understand binary/bits + Logical Operations

Are your codes ending up with "countless" branches and branch labels? Your codes are in need of some spring cleaning. Codes that have an excessive amount of branches can end up "unreadable", making it difficult for others to understand your code or help you debug any errors.



Chapter 2: Condition Register Fundamentals

When you execute any plain Jane comparison instruction such as....

Code:
cmpwi r0, 100

...you are actually telling Broadway to run a check and then place the result of said check in Condition Register Field 0.

What is Condition Register Field 0? First things first. The register you see in Dolphin that is named "CR" is the Condition Register. It contains the results of previously executed Compare Instructions. Conditional Branch Instructions (i.e. beq) read the data of the Condition Register to determine whether or not a branch route/jump is taken.

The Condition Register contains 8 fields. Field 0 (cr0) thru Field 7 (cr7).

STUVWXYZ

S = cr0
T = cr1
U = cr2
etc etc..

Each field (crF) takes up one DIGIT (half-byte) in the CR. Thus, each crF contains 4 bits of data. You can specify which crF to place the result of the compare instruction in. By default, if no crF is specified in your compare instruction, then cr0 will be used.

Code:
cmpwi r0, 100

is short for...

Code:
cmpwi cr0, r0, 100

If you wish to use cr7 instead of cr0, you would write the instruction like this...

Code:
cmpwi cr7, r0, 100

An important thing that you must keep in mind is that if you make a comparison that is NOT using cr0, you must also specify the crF in the subsequent branch instruction.

Like this...

Code:
cmpwi cr7, r0, 100
beq- cr7, store_data #Notice the specification of cr7 in the instruction

In conclusion, any crF that isn't cr0 must be specified in both compare and branch instructions.
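
To make the "one digit per field" layout concrete, here is a small Python sketch that extracts a given crF from a 32-bit CR value, like the one you would read in Dolphin. The function name and the sample CR value are made up for this example.

```python
def cr_field(cr, n):
    """Return the 4-bit value of crN (n = 0 thru 7) from a 32-bit CR value.

    cr0 is the leftmost hex digit, cr7 is the rightmost.
    """
    if not 0 <= n <= 7:
        raise ValueError("field number must be 0 thru 7")
    return (cr >> (28 - 4 * n)) & 0xF


cr = 0x20000008                 # sample value: cr0 = 2, cr7 = 8
print(hex(cr_field(cr, 0)))
print(hex(cr_field(cr, 7)))
```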



Chapter 3: Condition Register Field Bits and Examining Branch Instructions

Now that you know there are 8 crF's and how to use each one in your comparison + branch instructions, let's cover the crF bits and what each bit represents.

Each crF has 4 bits of data that uses the following structure.
  • bit 0 = Less-Than flag (LT)
  • bit 1 = Greater-Than flag (GT)
  • bit 2 = Equal flag (EQ)
  • bit 3 = Summary Overflow flag (SO)

CR Bit Table
LT GT EQ SO  crfX
0  1  2  3  crf0
4  5  6  7  crf1
8  9 10 11  crf2
12 13 14 15  crf3
16 17 18 19  crf4
20 21 22 23  crf5
24 25 26 27  crf6
28 29 30 31  crf7

Whenever a bit in the crF is high, the condition was true FROM THE MOST RECENT comparison instruction. Whenever a bit is low, the condition was false FROM THE MOST RECENT comparison.

Multiple bits can be flagged high and/or low from a comparison instruction. Now that you understand the crF bits, let's go over what branch instructions actually do.

Code:
beq (branch if equal) = checks bit 2, if bit is high, branch is taken
bge (branch if greater than or equal) = checks bits 1 and 2, if either bit is high, branch is taken
bgt (branch if greater than) = checks bit 1, if bit is high, branch is taken
ble (branch if less than or equal) = checks bits 0 and 2, if either bit is high, branch is taken
blt (branch if less than) = checks bit 0, if it is high, branch is taken
bne (branch if not equal) = checks bit 2, if bit is low, branch is taken
bng (branch if not greater than) = equivalent to ble
bnl (branch if not less than) = equivalent to bge
bns (branch if not summary overflow) = checks bit 3, if bit is low, branch is taken
bso (branch if summary overflow) = checks bit 3, if bit is high, branch is taken

The branch instruction checks the bits of the crF that is specified in the instruction.

Example: 'bge- cr7' checks GT and EQ bits of cr7
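
Here is a rough Python model of a signed compare filling one crF, plus the branch checks as described in the list above. All names are made up for illustration. One note: on real hardware bge is encoded as "branch if the LT bit is low", which gives the same outcome after a compare since exactly one of LT/GT/EQ ends up high.

```python
# One crF as a 4-bit value; bit 0 (LT) is the leftmost bit of the digit.
LT, GT, EQ, SO = 0b1000, 0b0100, 0b0010, 0b0001


def compare_signed(a, b, so=False):
    """Model of cmpw/cmpwi: set exactly one of LT/GT/EQ; SO is passed in separately."""
    if a < b:
        bits = LT
    elif a > b:
        bits = GT
    else:
        bits = EQ
    return bits | (SO if so else 0)


# Branch checks, following the descriptions in the list above.
BRANCH_TAKEN = {
    "beq": lambda f: bool(f & EQ),
    "bge": lambda f: bool(f & (GT | EQ)),
    "bgt": lambda f: bool(f & GT),
    "ble": lambda f: bool(f & (LT | EQ)),
    "blt": lambda f: bool(f & LT),
    "bne": lambda f: not (f & EQ),
    "bns": lambda f: not (f & SO),
    "bso": lambda f: bool(f & SO),
}

field = compare_signed(5, 100)        # like: cmpwi r0, 100 with r0 = 5
for name, taken in BRANCH_TAKEN.items():
    print(name, taken(field))
```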



Chapter 4: Condition Register specific instructions

Before going into the CR specific instructions, we need to go over its 'format'. The 'format' of a typical CR instruction is this..

crXXX B, B, B #XXX = and, or, andc, orc, nor, xor, eqv

Under this format, you need to specify the exact bit of the entire Condition Register. The problem with this is that it now becomes a memory game and you have to refer to the earlier CR bit table provided in Chapter 3. Instead of doing that nonsense, you can use this handy formula...

B = 4*crX+ZZ

X = field number (0 thru 7)
ZZ = lt, gt, eq, or so

With this formula, all you need to remember is which Field you want to use and which bit type. So now the easier-to-remember 'format' is this..

crXXX 4*crX+ZZ, 4*crX+ZZ, 4*crX+ZZ
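
The formula is plain arithmetic, so it can be checked against the CR Bit Table from Chapter 3. A tiny Python sketch (the helper name is made up for this example):

```python
# lt/gt/eq/so stand for the bit's position inside its field.
FLAG_OFFSET = {"lt": 0, "gt": 1, "eq": 2, "so": 3}


def cr_bit(field, flag):
    """Return the CR bit number for 4*crX+ZZ (field = X, flag = ZZ)."""
    if not 0 <= field <= 7:
        raise ValueError("field number must be 0 thru 7")
    return 4 * field + FLAG_OFFSET[flag]


print(cr_bit(0, "eq"))   # 4*cr0+eq
print(cr_bit(7, "eq"))   # 4*cr7+eq
```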

---

CR Based Instructions:
  • Condition Register Logical OR~
    cror crfD, crfA, crfB #crfA bit is logically OR'd with crfB bit. Result is written to crfD bit.
  • Condition Register Logical AND~
    crand crfD, crfA, crfB #crfA bit is logically AND'd with crfB bit. Result is written to crfD bit.
  • Condition Register Logical NOR~
    crnor crfD, crfA, crfB #crfA bit is logically NOR'd with crfB bit. Result is written to crfD bit.
  • Condition Register Logical XOR~
    crxor crfD, crfA, crfB #crfA bit is logically XOR'd with crfB bit. Result is written to crfD bit.
  • Condition Register Logical EQV (XNOR)~
    creqv crfD, crfA, crfB #crfA bit is logically XNOR'd with crfB bit. Result is written to crfD bit. Technically, the instruction does a XOR of crfA with crfB, then this temp result is complemented, then writes that result to crfD.
  • Condition Register Logical AND with Complement~
    crandc crfD, crfA, crfB #crfA bit is logically AND'd with the complemented crfB bit. Result is written to crfD bit.
  • Condition Register Logical OR with Complement~
    crorc crfD, crfA, crfB #crfA bit is logically OR'd with the complemented crfB bit. Result is written to crfD bit.
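
If it helps to see the list above as plain bit logic, here is a small Python model. It treats the CR as a 32-bit integer with bit 0 as the leftmost bit, applies one of the seven operations to two source bits, and writes the result to the destination bit. The function and table names are made up for this sketch.

```python
# Each operation takes two bits (0 or 1) and returns one bit.
CR_OPS = {
    "cror":   lambda a, b: a | b,
    "crand":  lambda a, b: a & b,
    "crnor":  lambda a, b: 1 - (a | b),
    "crxor":  lambda a, b: a ^ b,
    "creqv":  lambda a, b: 1 - (a ^ b),
    "crandc": lambda a, b: a & (1 - b),
    "crorc":  lambda a, b: a | (1 - b),
}


def cr_op(cr, op, d, a, b):
    """Apply op to CR bits a and b, write the result to bit d, return the new CR."""
    def bit(i):
        return (cr >> (31 - i)) & 1

    result = CR_OPS[op](bit(a), bit(b))
    mask = 1 << (31 - d)
    return (cr & ~mask) | (mask if result else 0)


# Truth table of all seven operations.
for name, fn in CR_OPS.items():
    print(name, [fn(a, b) for a in (0, 1) for b in (0, 1)])

# crand 4*cr0+eq, 4*cr0+eq, 4*cr7+eq with both EQ bits high (bits 2 and 30)
print(hex(cr_op(0x20000002, "crand", 2, 2, 30)))
```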

Simplified Mnemonics:
  • Setting a bit high (set cr0 EQ high)~
    crset 4*cr0+eq #creqv 4*cr0+eq, 4*cr0+eq, 4*cr0+eq; crF bit is XNOR'd with itself and result written to same bit spot
  • Setting a bit low (set cr0 EQ low)~
    crclr 4*cr0+eq #crxor 4*cr0+eq, 4*cr0+eq, 4*cr0+eq; crF bit is XOR'd with itself and result written to same bit spot
  • Copy-Pasting (Moving) a bit (copy cr0 EQ bit to cr7 EQ bit's spot)~
    crmove 4*cr7+eq, 4*cr0+eq #cror 4*cr7+eq, 4*cr0+eq, 4*cr0+eq; crF bit is OR'd with itself and result written to crfD
  • Flip a bit (flip cr0 EQ bit and place result in cr7 EQ bit's spot)~
    crnot 4*cr7+eq, 4*cr0+eq #crnor 4*cr7+eq, 4*cr0+eq, 4*cr0+eq; crF bit is NOR'd with itself and result written to crfD
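
Why do these simplified mnemonics work? Each one feeds the same bit into both source operands. A short Python check of the four identities (function names simply mirror the mnemonics, this is only a sketch of the bit logic):

```python
def crset(x):
    return 1 - (x ^ x)   # creqv bit, bit, bit: XNOR with itself is always 1


def crclr(x):
    return x ^ x         # crxor bit, bit, bit: XOR with itself is always 0


def crmove(x):
    return x | x         # cror: OR with itself leaves the bit unchanged


def crnot(x):
    return 1 - (x | x)   # crnor: NOR with itself flips the bit


for x in (0, 1):
    print(x, crset(x), crclr(x), crmove(x), crnot(x))
```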

Also, the following instructions may be handy for you...
  • mfcr rD #Contents of the CR is copied to rD
  • mtcr rD #Contents of rD is copied to the CR
  • mcrf crD, crA #Condition Field A is copied to Condition Field D



Chapter 5: Cleaning up some Code

Let's go over some basic examples of some "CR trickery" to help clean up code. Some examples below won't shorten the source at all (will be same compiled length), but the amount of branches (plus label names) are reduced. This is accomplished by using multiple crF's and using Condition Register specific instructions.

Scenario 1:
If r4 = 1 and r10 = r31, then go to 'store_data'. Otherwise, go to 'dont_store'.

Typical Source
Code:
cmpwi r4, 1
bne- dont_store
cmpw r10, r31
beq- store_data

New Source
Code:
cmpwi r4, 1
cmpw cr7, r10, r31
crand 4*cr0+eq, 4*cr0+eq, 4*cr7+eq
beq- store_data

Scenario 2:
If r4 = 1 or r10 = r31, then go to 'store_data'. Otherwise, go to 'dont_store'.

Typical Source
Code:
cmpwi r4, 1
beq- store_data
cmpw r10, r31
bne- dont_store

New Source
Code:
cmpwi r4, 1
cmpw cr7, r10, r31
cror 4*cr0+eq, 4*cr0+eq, 4*cr7+eq
beq- store_data

Scenario 3:
If r4 = 1 and r10 =/= r31, then go to 'store_data'. Otherwise, end_code

Typical Source
Code:
cmpwi r4, 1
bne- end_code
cmpw r10, r31
bne- store_data

New Source
Code:
cmpwi r4, 1
cmpw cr7, r10, r31
crandc 4*cr0+eq, 4*cr0+eq, 4*cr7+eq
beq- store_data

Scenario 4:
If r4 = 1 or r10 =/= r31, then go to 'store_data'.

Typical Source
Code:
cmpwi r4, 1
beq- store_data
cmpw r10, r31
bne- store_data

New Source
Code:
cmpwi r4, 1
cmpw cr7, r10, r31
crorc 4*cr0+eq, 4*cr0+eq, 4*cr7+eq
beq- store_data

Scenario 5:
If r4 = 1 then r10 must =/= r31, or if r4 =/= 1 then r10 must = r31. If all requirements are met, go to 'store_data'. If not, go to end_code.

Typical Source
Code:
cmpwi r4, 1
bne- make_sure_next_true

#r4 = 1, r10 must =/= r31
cmpw r10, r31
bne- store_data
b end_code

#r4 =/= 1, r10 must = r31
make_sure_next_true:
cmpw r10, r31
beq- store_data

New Source
Code:
cmpwi r4, 1
cmpw cr7, r10, r31
crxor 4*cr0+eq, 4*cr0+eq, 4*cr7+eq
beq- store_data

Scenario 6:
If r4 = 1, then r10 must = r31. However r4 can =/= 1 as long as r10 =/= r31. If all requirements are met go to 'store_data'. If not, go to end_code.

Typical source
Code:
cmpwi r4, 1
bne- make_sure_next_false

#r4 = 1, r10 must = r31
cmpw r10, r31
beq- store_data
b end_code

#r4 =/= 1, r10 must =/= r31
make_sure_next_false:
cmpw r10, r31
bne- store_data

New source
Code:
cmpwi r4, 1
cmpw cr7, r10, r31
creqv 4*cr0+eq, 4*cr0+eq, 4*cr7+eq
beq- store_data
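
A quick way to convince yourself that each "New Source" matches its scenario is to brute-force all four combinations of the two conditions. Below is a Python sketch where a stands for "r4 = 1" (cr0 EQ) and b stands for "r10 = r31" (cr7 EQ). The table and function names are made up for this check.

```python
from itertools import product

# Bit-level behaviour of the CR instruction used in each New Source.
CR_OPS = {
    "crand":  lambda a, b: a & b,
    "cror":   lambda a, b: a | b,
    "crandc": lambda a, b: a & (1 - b),
    "crorc":  lambda a, b: a | (1 - b),
    "crxor":  lambda a, b: a ^ b,
    "creqv":  lambda a, b: 1 - (a ^ b),
}

# Scenario number -> (instruction used, requirement as written in the scenario).
SCENARIOS = {
    1: ("crand",  lambda a, b: a and b),
    2: ("cror",   lambda a, b: a or b),
    3: ("crandc", lambda a, b: a and not b),
    4: ("crorc",  lambda a, b: a or not b),
    5: ("crxor",  lambda a, b: a != b),
    6: ("creqv",  lambda a, b: a == b),
}


def scenarios_match():
    """True if 'beq- store_data' is taken exactly when the requirement holds."""
    for op, requirement in SCENARIOS.values():
        for a, b in product((0, 1), repeat=2):
            branch_taken = CR_OPS[op](a, b) == 1
            if branch_taken != bool(requirement(a, b)):
                return False
    return True


print(scenarios_match())
```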



Chapter 6: Final Example

Let's say you have a value in r3 and it must be a valid Memory Address. Meaning a valid mem80, mem81, or mem9 address. If the address is not valid in any way, branch to the LR. An efficient way to write it would be like this (pretend r4 thru r7 are safe)...

Code:
lis r4, 0x8000 #0x80000000
lis r5, 0x817F #0x817FFFFF
ori r5, r5, 0xFFFF
addis r6, r4, 0x1000 #0x90000000
addis r7, r5, 0x1280 #0x93FFFFFF

cmplw r3, r4
cmplw cr5, r3, r5
cmplw cr6, r3, r6
cmplw cr7, r3, r7
cror 4*cr0+eq, 4*cr0+lt, 4*cr7+gt #Check if less than 0x80000000 ***or*** greater than 0x93FFFFFF; place result in cr0
crand 4*cr5+eq, 4*cr5+gt, 4*cr6+lt #Now check if it's in between 0x817FFFFF ***and*** 0x90000000; place result in cr5
cror 4*cr0+eq, 4*cr0+eq, 4*cr5+eq #If *any* of the two above conditions (cr0 and cr5) were true, branch to LR
beqlr-
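
To double check the logic of the source above, here is a line-by-line Python model of it. The variable names mirror the registers and CR bits. r3 is treated as an unsigned 32-bit value since the source uses cmplw. The function name is made up for this sketch.

```python
def branches_to_lr(r3):
    """Return True when the beqlr- in the source above would be taken."""
    r4 = 0x80000000
    r5 = 0x817FFFFF
    r6 = (r4 + 0x10000000) & 0xFFFFFFFF   # addis r6, r4, 0x1000
    r7 = (r5 + 0x12800000) & 0xFFFFFFFF   # addis r7, r5, 0x1280
    assert r6 == 0x90000000 and r7 == 0x93FFFFFF

    cr0_lt = r3 < r4                      # cmplw r3, r4
    cr5_gt = r3 > r5                      # cmplw cr5, r3, r5
    cr6_lt = r3 < r6                      # cmplw cr6, r3, r6
    cr7_gt = r3 > r7                      # cmplw cr7, r3, r7

    cr0_eq = cr0_lt or cr7_gt             # cror: outside the whole range
    cr5_eq = cr5_gt and cr6_lt            # crand: in the gap between mem81 and mem9
    cr0_eq = cr0_eq or cr5_eq             # cror: either condition
    return cr0_eq


for address in (0x80001500, 0x81800000, 0x90000000, 0x94000000):
    print(hex(address), branches_to_lr(address))
```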

And that's pretty much it. Happy coding!

  All About Cache
Posted by: Vega - 05-15-2022, 12:59 PM - Forum: PowerPC Assembly - No Replies

All About Cache

This PPC tutorial will teach you the ins and outs of the Cache model of Broadway, its instruction set, and how some of these instructions may need to be used for Gecko ASM Codes. This is a lengthy read, but every PPC Coder/dev should have a decent understanding of Broadway's Cache model.



Chapter 1: Understanding some Basics about Memory

There are two types of memory, Virtual & Physical. When Broadway executes in Virtual Memory, this is called Virtual Mode. When Broadway executes in Physical Memory, this is called Real Mode.

Virtual Memory is split into two categories:
  • Virtual Cached Memory
  • Virtual Uncached Memory

Virtual Cached Memory is your typical 'normal' memory that you are familiar with (i.e. 0x80000000 thru 0x817FFFFF & 0x90000000 thru 0x93FFFFFF).

Virtual Cached Memory is a representation of Physical Memory but it includes any cached content. Cached content may be 'old' or may be 'too new'. Therefore, what you see in Virtual Cached Memory may not be what is actually present in Physical Memory. Virtual Uncached Memory is a simple representation (copy) of Physical Memory.

Virtual Memory has to be split into Cached & Uncached so software always has the option to bypass cache.

Wii games won't run entirely in Real Mode due to lack of 'security'.

In Real Mode, all of memory has the same properties, and those properties cannot be adjusted from the Broadway default settings. With Virtual Mode, you can set different regions of memory to have a variety of different properties, and adjust said properties whenever you want.

Here's a list of Physical, Virtual Cached, and Virtual Uncached memory ranges for most Wii games.
  • 0x00000000 thru 0x017FFFFF Physical Mem1
  • 0x10000000 thru 0x13FFFFFF Physical Mem2
  • 0x80000000 thru 0x817FFFFF Virtual Cached Mem1 (known as mem80 and mem81)
  • 0x90000000 thru 0x93FFFFFF Virtual Cached Mem2 (known as mem9)
  • 0xC0000000 thru 0xC17FFFFF Virtual Uncached Mem1
  • 0xD0000000 thru 0xD3FFFFFF Virtual Uncached Mem2

The list doesn't include everything (like Hardware Memory), just the most common stuff that's relevant to Gecko ASM Codes.
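
Going by the ranges in the list above, the three views of the same memory differ only in their upper address bits. Here is a small Python sketch of that relationship. It is only arithmetic that reproduces the listed ranges, not a model of how the hardware actually translates addresses, and the function names are made up.

```python
def to_physical(virtual):
    """Map a listed Virtual Cached/Uncached address to its Physical address."""
    return virtual & 0x1FFFFFFF


def cached_to_uncached(cached):
    """Map a Virtual Cached address (0x8.../0x9...) to its Uncached twin (0xC.../0xD...)."""
    if not (0x80000000 <= cached <= 0x817FFFFF or 0x90000000 <= cached <= 0x93FFFFFF):
        raise ValueError("not a Virtual Cached Mem1/Mem2 address")
    return cached | 0x40000000


print(hex(to_physical(0x80001500)))
print(hex(to_physical(0x93FFFFFF)))
print(hex(cached_to_uncached(0x80001500)))
```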



Chapter 2: Structure of Cache Organization

There are two different cache systems in Broadway: L1 (Level 1) and L2 (Level 2). The L2 cache operates in a similar fashion but is larger. There's no need to deep dive into the intricacies of the L2 cache. The L1 cache will be the only cache unit covered in this thread. The L1 Cache is split into two categories:
  • Data Cache
  • Instruction Cache

Instruction Cache is for anything that contains executable instructions, simple enough. Data Cache is for any data that are part of any load/store mechanism. Executable instructions can also be included in the Data Cache. For example, if you write (i.e. store) a new instruction to memory, it will be utilized by both the Instruction and Data cache.

Here's the layout of a Data Cache set/page (each row is known as a 'way')

Way0 | 32-byte Aligned Physical Address | StateBits | 8 Words
Way1 | 32-byte Aligned Physical Address | StateBits | 8 Words
Way2 | 32-byte Aligned Physical Address | StateBits | 8 Words
.. ..
Way6 | 32-byte Aligned Physical Address | StateBits | 8 Words
Way7 | 32-byte Aligned Physical Address | StateBits | 8 Words

The Instruction Cache implements the same layout, but it uses a single "Valid" Bit in place of the State Bits. Each Way will contain a 32-byte aligned physical address. Even though the address is physical, it is always translated to its Virtual Address for usage. "8 words" means the 8 words of data/instructions that are at the 32-byte aligned address. 8 words = 32 byte block. This block is known as a Cache Block. Since every address has to be 32-byte aligned, this means nothing smaller than a 32-byte aligned block of memory can have unique State/Valid Bits.

8 ways (Way0 thru 7) make up one 'Set'. There are a total of 128 Sets for both the Instruction and Data Cache. Both Caches are 32KB in size (32 bytes x 8 ways x 128 = 32,768 bytes = 32KB).

Since every cache block is 32-byte aligned, this means that if you make a modification to the cache of, let's say, address 0x80001504, the cache for the words of addresses 0x80001500 thru 0x8000151C will all be affected simultaneously.
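
The alignment and the cache size math can be checked with a few lines of Python. The names below are made up for this sketch; the numbers come from this chapter.

```python
BLOCK_SIZE = 32   # bytes per Cache Block (8 words)
WAYS = 8          # ways per Set
SETS = 128        # Sets per cache


def cache_block_words(address):
    """Return the 8 word addresses of the Cache Block that contains 'address'."""
    base = address & ~(BLOCK_SIZE - 1)    # round down to 32-byte alignment
    return [base + 4 * i for i in range(BLOCK_SIZE // 4)]


words = cache_block_words(0x80001504)
print(hex(words[0]), "thru", hex(words[-1]))
print(BLOCK_SIZE * WAYS * SETS, "bytes per cache")
```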



Chapter 3: Cache Hits and Misses

It's crucial to understand that the Data Cache can only have new content added to it by store instructions. This includes any typical store instruction, but it also includes the dcbi, dcbz and dcbz_l instructions (these are treated as store instructions by Broadway). Content in the Data Cache is managed by a pseudo least-recently-used algorithm (aka PLRU).

The Instruction Cache gets content added to it by Broadway's Instruction Fetching mechanism only. It is impossible to control the Fetching mechanism directly. Therefore we cannot, at will, add in new content to the Instruction Cache. Just like the Data Cache, content in the Instruction Cache has its own PLRU.

The inner workings of the PLRU are not a concern for us Gecko Code creators. However, we do need to cover Cache Hits and Misses. Over time, the PLRU will fill instructions/data in the cache and later remove them so new data can use the Cache. The filling of the cache by the PLRU is usually referred to as 'pushing a block(s) onto the Cache'. We cannot change how the PLRU itself functions, but there are specific instructions we can use to forcefully edit Cache Blocks or push new blocks onto the Cache. This is covered in Chapter 5.

Whenever instructions/data is processed by Broadway, Broadway will check the L1 Cache (then the L2 Cache) to see if the specific memory address is in the Cache. If the address is present, this is known as a Cache Hit. If not, this is known as a Cache Miss. Cache misses severely degrade performance.



Chapter 4: State and Valid Bits

Each cache block (with its 32-byte aligned physical address) in the Data Cache will have one of the following state bits with it.

State bits--
  • M = Modified
  • E = Exclusive
  • I = Invalid

Modified = Present in Virtual Cached Memory but not yet present on Physical Memory; will be written to physical memory sooner or later. When new blocks are placed into the Cache by the PLRU, they are tagged with the M bit. Please note that PPC Manuals will sometimes refer to a Data Cache block as "dirty" if it's tagged with the Modified (M) bit.
Exclusive = What's in Virtual Cached Memory is what's in Physical Memory. Please note that PPC Manuals will sometimes refer to a Data Cache block as "clean" if it's tagged with the Exclusive (E) bit.
Invalid = Old data that is now invalid, you can freely erase/modify this block w/o affecting anything. When the PLRU updates Data Cache, only blocks that are tagged with the I bit qualify to be removed from the Cache.

Each physical address in the instruction cache has a valid bit associated with it
  • V = Valid
  • I = Invalid

Valid = next time this address is used by an instruction, the value here is what will be used
Invalid = old data that is now invalid, will not be used, can be tossed whenever. When the PLRU updates Instruction Cache, only blocks that are tagged with the I bit qualify to be removed from the Cache.



Chapter 5: List of Cache Instructions

Broadway comes with the following cache instructions~

dcbf rD, rA = Data Cache Block Flush
dcbi rD, rA = Data Cache Block Invalidate
dcbst rD, rA = Data Cache Block Store
dcbt rD, rA = Data Cache Block Touch
dcbtst rD, rA = Data Cache Block Touch for Store
dcbz rD, rA = Data Cache Block Zero
dcbz_l rD, rA = Data Cache Block Zero then Lock
icbi rD, rA = Instruction Cache Block Invalidate

All instructions treat their values as Signed.
rD + rA = The address (aka Effective Address aka EA)

Note that in all instructions, if rD = r0, it will be treated as literal zero.
  • dcbf For cache hits, if the block has a M bit, the data in the block is now written to physical memory and an I bit replaces the M bit. If the block has an E bit, the bit is simply changed to I. For cache misses, no action is taken. Therefore you can use dcbf as a way to "erase" the Cache but make sure memory gets updated before Cache is erased. For "erasing" Cache without updating memory, refer to dcbi.
  • dcbst For cache hits, if the block has an E bit or I bit, no action is taken. If the block has a M bit, the data in the block is written to physical memory and the bit is changed to E. For cache misses, no action is taken. Pro-Tip: If you are familiar with BAT Registers and the region of memory is in a BAT that is marked as 'Write-Through' (W bit high), you will never need the dcbst instruction for that region of memory, but performance of Broadway will be degraded.
  • dcbi For cache hits, the state bit is always changed to I, regardless of what it was before. If the state bit was M, data that was going to be written to physical memory is now discarded. For cache misses, no action is taken. Therefore, dcbi can be used to "erase" the Cache and prevent the Cache from updating physical memory.
  • dcbt This is used to give the Cache system a hint that an upcoming Load instruction needs to have its 32-byte aligned Address pushed onto the cache. Thus, this is only useful if you know the Load instruction will end up as a Cache Miss. For cache hits, no action is taken. Improper usage of this instruction (too many Cache hits) will degrade performance.
  • dcbtst This is used to give the Cache system a hint that an upcoming Store instruction needs to have its 32-byte aligned Address pushed onto the cache. Thus, this is only useful if you know the Store instruction will end up as a Cache Miss. For cache hits, no action is taken. Improper usage of this instruction (too many Cache hits) will degrade performance.
  • dcbz For cache hits, the contents of the Block (virtual memory) are zero'd, and state bit changed to M. For cache misses, a new block using the address referenced by the dcbz is pushed onto the Cache, then the Cache Block is zero'd (regardless of what is present in Physical Memory) & tagged with the M bit. 
  • dcbz_l does the same as above, but will then lock the cache where it can't be modified. This instruction is only legal when the Locked Cache (via HID2) is enabled, otherwise an exception will occur.
  • icbi is the only instruction you have available to modify the instruction cache. For cache hits, the block is set to Invalid. For cache misses, no action is taken.
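
The state changes described in the list above can be summed up as a small table. Here is a Python sketch of it for dcbf, dcbst, dcbi and dcbz, following only what this chapter says. The function name and the (new state, wrote to memory) return shape are my own choices for illustration, and a cache miss is represented as None.

```python
def data_cache_op(instruction, state):
    """Model the state bit change of one Data Cache block.

    state: 'M', 'E', 'I', or None for a cache miss.
    Returns (new_state, wrote_to_physical_memory).
    """
    if instruction not in ("dcbf", "dcbst", "dcbi", "dcbz"):
        raise ValueError("instruction not covered by this sketch")
    if instruction == "dcbz":
        return ("M", False)               # hit or miss: block zero'd, tagged M
    if state is None:
        return (None, False)              # miss: no action taken
    if instruction == "dcbf":
        return ("I", state == "M")        # M is written out first, then I
    if instruction == "dcbst":
        if state == "M":
            return ("E", True)            # written out, now clean
        return (state, False)             # E or I: no action taken
    return ("I", False)                   # dcbi: pending write is discarded


for instruction in ("dcbf", "dcbst", "dcbi", "dcbz"):
    for state in ("M", "E", "I", None):
        print(instruction, state, data_cache_op(instruction, state))
```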

As mentioned in Chapter 3, dcbi, dcbz and dcbz_l are treated as store instructions. All other cache-related instructions are treated as load instructions. In conclusion, there are no cache-related instructions to force any updates (add new Blocks) to the Instruction Cache.



Chapter 6: Overwriting Executable Instructions

For Gecko ASM Codes, the only instance where we really need to worry about cache is if your code involves writing/re-writing new instructions that will be executed later on.

When overwriting instructions, you need to ensure they get updated in physical memory before Broadway fetches them for execution. Or else there's a chance the instructions fetched will be the old instructions.

Here's a template for updating cache for writing in new executable instructions

Code:
#rX = points to memory address of newly written executable instruction
dcbst 0, rX
icbi 0, rX
isync
  • dcbst 0, rX = This will force the block (if M bit tagged) to be written to physical memory. State bit changed to E.
  • icbi 0, rX = The old instruction may still be present (and marked Valid) in the Instruction Cache. Therefore, we tag it as Invalid.
  • isync = Broadway is an out-of-order execution CPU like any other modern CPU. Even with the icbi instruction, it's possible Broadway still fetched the older instruction. This isync instruction will force Broadway to purge its currently fetched instructions and refetch, thus forcing the new instruction to be fetched.

You do **NOT** need to 32-byte align the address (i.e. 0x8000151C -> 0x80001500) for rX when using the above example source. Broadway will handle that for you.
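As a small illustration of what "Broadway will handle that for you" means: the cache instructions ignore the lower 5 bits of the address, which gives the same result as doing the alignment below by hand. This is only a sketch to show the arithmetic. r3 and r4 are arbitrary registers picked for the example.

```asm
#Assume r3 = 0x8000151C (example address from above)
#Manual 32-byte alignment, shown for illustration only
#You do NOT need this before dcbst/icbi
rlwinm r4, r3, 0, 0, 26 #Clear the lower 5 bits; r4 = 0x80001500
```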

You also do **NOT** need to include the isync if the first newly written instruction is at least 5 will-be-executed instructions ahead of the icbi. This is because Broadway can only fetch up to 4 instructions at a time.

Fyi: If using the above snippet in a loop mechanism, you only need an isync at the end. Do not place it inside the loop. Also remember that Cache Blocks are 32-byte aligned. Therefore, the address you hand to dcbst and icbi (in your loop) should be incremented by 32 on each pass.
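Here's a rough sketch of what such a loop could look like. The register usage (r3, r4, r5), the label name, and the block-count math are my own assumptions for the example and are not part of the template above. It clobbers r3, r4, r5, and the CTR, so back those up first if your code needs them.

```asm
#Assumptions (hypothetical register usage):
#r3 = address of the first newly written instruction
#r4 = number of bytes that were written (must be greater than 0)

#Count how many 32-byte cache blocks the written range touches
clrlwi r5, r3, 27 #r5 = offset of r3 within its cache block (0 thru 31)
add r4, r4, r5 #Measure the length from the start of the first block
addi r4, r4, 31 #Round up...
srwi r4, r4, 5 #...to a whole number of blocks
mtctr r4

flush_loop:
dcbst 0, r3 #Write the block to physical memory (if M bit tagged)
icbi 0, r3 #Tag the matching Instruction Cache block as Invalid
addi r3, r3, 32 #Step to the next cache block
bdnz+ flush_loop

#Only one isync is needed, placed after the loop
isync
```

For example, 8 bytes written at 0x8000151C span two blocks (0x80001500 and 0x80001520), so the loop runs twice.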



Chapter 7: In-Depth Explanation

To explain the entirety of why we need..

dcbst
icbi
isync

...for the case of rewriting in new instructions, we need to cover some complex aspects of Broadway that you may not be familiar with.

First understand that all Wii games configure virtual regions of memory via a mechanism called BAT registers.

We don't need to worry what the BAT registers are exactly and how to use them. Just understand that all of usable physical memory is mapped twice virtually, once for data and once for instructions. (For more info on BATs, read this thread HERE)

Thus we have two virtual copies of the same physical memory. It's important to understand that there is no 'built-in' mechanism by Broadway that ensures these two copies of memory always match each other. That is required by software (the program/game/codes/whatever).

The virtual memory that is used for Data is configured as "Write-Back" and "Cache-Enabled".
  • Write-Back = store operations update the cached memory, but do not instantly update physical memory
  • Write-Through = store operations update cached memory and also update physical memory right away. Performance is degraded.

Since the virtual memory for Data is also cache-enabled, it is referred to as Virtual Cached Memory. Therefore this memory includes all contents of the Data Cache.

The virtual memory for Instructions is also configured as Cache-Enabled (Write-Back/Through is not applicable here).

Anyway since Virtual Cached Memory, for the use of Data, is in Write-Back Mode, this presents a problem for Instruction execution. It can create scenarios where the Instruction Cache is "seeing" a different virtual memory copy than the Data Cache is "seeing".

It's important to understand how Broadway fetches instructions for execution. The fetching mechanism will hit a virtual address, translate it to its physical address equivalent, and then search various units for the address's instruction. Broadway searches the following places...
  • L1 Instruction Cache
  • L2 Instruction Cache
  • Physical Memory (may also be called System or Main memory in various manuals/websites)

Broadway checks the L1 Cache first. If the address isn't present there, it will then check the L2 Cache. If not present in the L2 Cache, physical memory is finally checked.

For Cache hits, Broadway will then check the address's valid bit in the cache. If the valid bit is set, Broadway will use the instruction that is currently present in Virtual Cached Memory (the memory that the Instruction Cache "sees"). If the invalid bit is instead set, Broadway will directly go to physical memory for the instruction, bypassing the L2 check if necessary.

Keep in mind that L1 and L2 cache are 'synced', whatever is in the L1 Cache is ALWAYS present in the L2 Cache. This is possible due to L2 being larger than L1.

Now that you understand how instruction fetching works, we need to cover the 'under the hood' stuff of store and load instructions via virtual cached memory.

So let's say you have any plain Jane basic store instruction (i.e. stw) that stores to plain Jane virtual cached memory. Welp, after that store has executed, the physical address will be pushed onto the Data cache and the data itself is written at the virtual cache memory address.

Now let's say you then execute a load instruction (i.e. lwz) as the very next instruction. Obviously, what you just stored using stw is what will be loaded via lwz. That's because the previous store updated the Data Cache (with a new Block), therefore the load instruction will receive a Cache Hit and the contents to load are retrieved from the Data Cache (virtual cached memory).

Now let's say you store over an instruction. The only change that instantly occurs is in the Data Cache, which would be the Virtual Cached Memory that the Data Cache "sees". Physical memory doesn't update instantly since the memory in question is under Write-Back mode. Thus, the next time the new instruction is fetched, the old instruction will most likely be used instead.

Why is this?

This is because the newly written instruction won't be in the Instruction Cache's L1 + L2, meaning it's not present in the virtual memory that the Instruction Cache "sees". It will also not be present in Physical Memory.

The utilization of the dcbst instruction will force the newly written instruction to also be written to Physical Memory. However, this instruction alone isn't enough. It is possible the old instruction is currently in the Instruction L1/L2 Cache and marked as Valid, meaning the instruction fetching mechanism won't even bother checking Physical Memory since the L1/L2 cache is basically saying "Hey we have the instruction! And it's valid! No need to check physical memory!"

Therefore to alleviate this possible problem, we use the icbi instruction to mark the old instruction in the L1/L2 cache as invalid. If the old instruction isn't in the Instruction Cache, then the icbi has zero effect (like a nop). The isync is needed just in case the old instruction was already fetched. It will cause Broadway to re-fetch instructions so that the new instruction is guaranteed to be fetched. As mentioned earlier, an isync is not required if the modified new instruction is at least 5 would-be-executed instructions ahead of the icbi instruction.

In conclusion, these three instructions (dcbst, icbi, isync) will always ensure that your newly written instructions are always executed.
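Putting it all together, here's a minimal sketch of a full instruction re-write. The target address (0x80001500) and the choice of a nop (0x60000000) as the new instruction are made up for this example. Swap in whatever address and instruction your code actually needs.

```asm
#Hypothetical example: overwrite the instruction at 0x80001500 with a nop
lis r12, 0x8000
ori r12, r12, 0x1500 #r12 = address of the instruction to replace
lis r11, 0x6000 #r11 = 0x60000000 (nop)

stw r11, 0 (r12) #New instruction now sits in the Data Cache (M bit tagged)
dcbst 0, r12 #Force the block out to physical memory
icbi 0, r12 #Tag any stale copy in the Instruction Cache as Invalid
isync #Discard anything Broadway has already fetched
```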

Still confused? Here's a picture:

[Image: cache.png]

rX = New Instruction to write
rY = Address in question

Yellow font shows the changes invoked by the respective instruction.

Instruction is the instruction that HAS executed. Regarding Fetcher Status, it gives you a basic summary of what is happening in regards to the Fetcher and the Instruction Queue. When Instructions are placed into the Queue, they are also placed into the I-Cache (if not present beforehand), and then marked as Valid.

'Not Present most likely' for rY D-Cache means that it's very unlikely that rY (with its cache block data) is already present in the D-Cache. Even if it is, we would have no idea (given the information from the diagram) of what its state bits would be.

'0x38000000 possibly' for rY I-Cache means we have no idea (given the information) if rY (with its cache block data) is present (regardless of Valid vs Invalid) in the I-Cache.

In conclusion, these three instructions (dcbst, icbi, isync) will always ensure your newly written instructions are visible to the instruction fetching mechanism.

Final Note on this Chapter~

What about the usage of the sync instruction (after using dcbst)?

It is not needed. The usage of sync is only required if we are sharing memory with another processor, or I/O chip (i.e. Starlet). Our instruction re-writes only need to be 'seen' by Broadway, that's it. Therefore, sync is not required.



Chapter 8: Some neat tricks with Cache instructions

This chapter will contain some snippets of code to show some neat tricks you can do with the Cache. All tricks are sources meant to be compiled as C0 Gecko Codes. Fyi, these tricks will only work on a regular Wii Console.

---

Trick #1: Write word 1 to memory, load it back, and it will be a different value (0) than what was just stored

Summary:
Write null word to virtual address 0x80001500
Flush the block, so we know it's written to physical 0x00001500, and therefore the block is now left invalid
Write 1 to virtual address 0x80001500
Load word from physical (0xC0001500 which is direct physical copy of 0x00001500)
If value is *NOT* 1 (aka 0), game will light up disc drive to show success

Code:
#Disable INTs; not done correctly! Do not copy this for your regular cheat codes!
mfmsr r3
rlwinm r12, r3, 0, 17, 15
mtmsr r12

#Set r12 to 0x80000000
lis r12, 0x8000

#Make sure value at 0x80001500 is null beforehand
li r11, 0
stwu r11, 0x1500 (r12)

#Make sure null is also written to the physical memory
dcbf 0, r12

#Set value of 1
li r11, 1

#Set r10 as pointer to start of uncached memory
lis r10, 0xC000

#Store 1 to Virtual Cache memory
#This store will now push r12 on to the Cache assigned with the M bit.
#Fyi: Anything that has the M bit has not been sent to physical memory yet
stw r11, 0 (r12)

#Load up from uncached memory (exact copy of physical)
lwz r11, 0x1500 (r10)

#Check if r11 = 1. If not, light up disc drive
cmpwi r11, 1
beq- restore_ints

#Disc Drive
lis r12, 0xCD00
lwz r0, 0x00C0 (r12)
ori r0, r0, 0x0020
stw r0, 0x00C0 (r12)

#Restore INTs. #Copy current MSR into r12
#Not done correctly! Do not copy this for your regular cheat codes!
restore_ints:
mfmsr r12

#Insert r3's EE bit into r12, overwriting r12's EE bit
rlwimi r12, r3, 0, 16, 16

#Update MSR
mtmsr r12

#End C0
#blr

---

Trick #2: Write zero to memory without using regular store instructions (this will actually write zero to an entire 32-byte aligned block). This isn't really a 'trick' per se since the dcbz instruction is supposed to behave in such a manner, but you get the idea.

Summary:
Write 1 to virtual address 0x80001500
Make block exclusive (via dcbst) to force update to physical memory
Do a temp load to prove the word value at the physical address is 1
Zero the cache block
Force block to be written to physical (via dcbst)
Load value from physical memory
It will equal 0 (disc drive lights up)

Code:
#Disable INTs; not done correctly! Do not copy this for your regular cheat codes!
mfmsr r3
rlwinm r12, r3, 0, 17, 15
mtmsr r12

#Write 1 to 0x80001500
lis r12, 0x8000
li r11, 1
stwu r11, 0x1500 (r12)

#Make sure updates are in physical memory, keep block exclusive
dcbst 0, r12

#At this moment, physical addr 0x00001500 = 1. We will verify this. If not true, skip lighting up disc drive
lis r10, 0xC000
lwz r11, 0x1500 (r10)
cmpwi r11, 1
bne- restore_ints

#Zero out cache block now, block now tagged with M
#This instruction zero's out the data for the block in cached memory
dcbz 0, r12

#However, let's make sure the changes also go to physical memory
dcbst 0, r12

#Now load value from physical, if null disc drive will light up
lwz r11, 0x1500 (r10)
cmpwi r11, 0
bne- restore_ints

#Disc Drive
lis r12, 0xCD00
lwz r0, 0x00C0 (r12)
ori r0, r0, 0x0020
stw r0, 0x00C0 (r12)

#Restore INTs. #Copy current MSR into r12
#Not done correctly! Do not copy this for your regular cheat codes!
restore_ints:
mfmsr r12

#Insert r3's EE bit into r12, overwriting r12's EE bit
rlwimi r12, r3, 0, 16, 16

#Update MSR
mtmsr r12

#End C0
#blr

---

Trick #3: Write value of 1 to virtual, then immediately write 2 to physical afterwards. However with some cache trickery, when we load the word value from physical memory, it will be the stale value of 1.

Summary:
Flush block at 0x80001500
Write 1 to 0x80001500
Write 2 to 0xC0001500 immediately afterwards
dcbst on cache block to overwrite the 2 with the earlier value of 1
Load value from 0xC0001500
It will be 1 (not 2) and disc drive will light.

Code:
#Disable INTs; not done correctly! Do not copy this for your regular cheat codes!
mfmsr r3
rlwinm r12, r3, 0, 17, 15
mtmsr r12

#First flush the block to make sure it cannot be in the exclusive state after our write
lis r12, 0x8000
ori r12, r12, 0x1500
dcbf 0, r12

#Set r10 to 0xC000
lis r10, 0xC000

#Set values 1 and 2 in their registers
li r11, 1
li r9, 2

#Write 1 to 0x80001500 first!
#Then Write 2 to 0xC0001500 (physical)
stw r11, 0 (r12)
stw r9, 0x1500 (r10)

#Force cache block at 0x80001500 to update to physical memory now
#This will overwrite the newly written value of 2 present at 0x00001500/0xC0001500
dcbst 0, r12

#Now Load from physical & check. If 1 (old value), disc drive will light up
lwz r11, 0x1500 (r10)
cmpwi r11, 2
beq- restore_ints

#Disc Drive
lis r12, 0xCD00
lwz r0, 0x00C0 (r12)
ori r0, r0, 0x0020
stw r0, 0x00C0 (r12)

#Restore INTs. #Copy current MSR into r12
#Not done correctly! Do not copy this for your regular cheat codes!
restore_ints:
mfmsr r12

#Insert r3's EE bit into r12, overwriting r12's EE bit
rlwimi r12, r3, 0, 16, 16

#Update MSR
mtmsr r12

#End C0
#blr




And that's it for Cache, happy coding!


  Draggable items code
Posted by: dirtyfrikandel - 05-12-2022, 11:43 PM - Forum: Code Support / Help / Requests - Replies (1)

Hey. 

I need some help with creating a code that lets you drag any item, just like the draggable blueshell mod (https://mariokartwii.com/showthread.php?...=draggable)

For example, to drag a star I've tried the following:

1) Item Behaviour Modifier (https://mariokartwii.com/showthread.php?tid=386), but it also activates the power before dragging it.
048A61B8 00000002

2) I changed this code to the following (taking the PAL code for star from Item Behaviour Modifier):

068A61B8 00000008
00000002 00000000

This does not activate the power up when pressing the item button (nice!), and the item actually drags behind the vehicle. However, when pressing the button again, it drops the item on the ground (like a banana).


I want to drag things like POW, Shocks, Bloopers, Megas in order to simulate Mario Kart 8 Deluxe, where you have 2 item slots.

Is this even possible, given that these items have a standard behaviour value of 00 instead of 01 like the blueshell/red shell/green shell? (https://wiki.tockdom.com/wiki/Filesystem...r_Modifier)

Kind regards,
Dirtyfrikandel


  Gecko Region Code Porter
Posted by: dirtyfrikandel - 05-12-2022, 03:56 PM - Forum: Resources and References - Replies (1)

Hey,

Does anyone have either of the following software:

  • Bean's auto porter
  • Bully@WiiPlaza's Gecko Code Porter

I can't seem to find them anywhere online. Bully's website links are dead.

Kind regards,
Dirtyfrikandel


  setting layout message id at runtime
Posted by: jawa - 05-10-2022, 08:32 PM - Forum: Code Support / Help / Requests - Replies (2)

Is it possible to change a text box message ID at runtime? For example:
Luigi Circuit - ID 320
and I want it to change to
Another circuit - ID 321


  EVA Usage List
Posted by: Vega - 05-04-2022, 01:30 PM - Forum: Resources and References - Replies (1)

EVA Usage List

Some codes use the area adjacent to the Exception Vectors. This area is commonly referred to as simply the 'EVA'. It consists of segments of unused memory that are universal for every Wii game. Some code creators will utilize these regions of memory to transfer data between multiple codes.

Therefore, care must be taken so two different codes don't end up using the same spot(s) in the EVA. Further below is a list of codes that use the EVA along with their used EVA addresses.
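As a quick illustration of what transferring data between codes through the EVA looks like, here's a bare-bones sketch. The address 0x80001698 is a hypothetical spot picked only for this example, and r11/r12 are arbitrary registers. Always check the list before settling on an address for a real code.

```asm
#Code A: save a word for another code to read later
#0x80001698 is a hypothetical EVA address used only for illustration
lis r12, 0x8000
stw r11, 0x1698 (r12)

#Code B: read back the word that Code A saved
lis r12, 0x8000
lwz r11, 0x1698 (r12)
```

If two unrelated codes both picked 0x80001698, Code B would end up reading whatever the other code stored there, which is the kind of clash the list is meant to help you avoid.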

You will notice that there are some instances where multiple codes use the same EVA addresses. Regarding MKWii, there are other unused regions of memory that are used by codes as well (i.e. mem81). If such codes use those space(s), they are also included.

Some codes are listed more than once because they use 2 or more different regions of the EVA/Mem81/etc. There's a good chance I missed some codes that need to be on the list. Please let me know if you find any missing codes, missing EVA usage of code(s), or other mistakes. Thank you.



EVA:
0x80000298 thru 0x8000029A Auto Attempt to Use Dodge Item When Shocked -Online- [Vega] https://mariokartwii.com/showthread.php?tid=1409
0x800002CE thru 0x800002D3 FanCy Speedometer [JoshuaMK] https://mariokartwii.com/showthread.php?tid=1492
0x800002CE thru 0x800002D3 FanCy HUD [JoshuaMK] https://mariokartwii.com/showthread.php?tid=1496
0x800003B0 thru 0x800003BF Memory Editor [Vega] https://mariokartwii.com/showthread.php?tid=1346
0x800003FF XYZ Position Swapper [Vega] https://mariokartwii.com/showthread.php?tid=1728
0x800004C0 Random Character but w/ Best Bike For Every Race Online [Vega]  https://mariokartwii.com/showthread.php?tid=1412
0x800004C0 and 0x800004C1 Random Character+Vehicle Combo For Every Race -Online- [Vega] https://mariokartwii.com/showthread.php?tid=1410
0x800005FD Rapid Fire (GCN) [mdmwii] https://mariokartwii.com/showthread.php?tid=263
0x800007B4 thru 0x800007C8 Custom Item Probability Distribution Offline [Sioist] https://mariokartwii.com/showthread.php?tid=1882
0x800007C0 thru 0x800007C3 File Encryptor/Decryptor [Vega / 1superchip] https://mariokartwii.com/showthread.php?tid=1612
0x800007D0 thru 0x800007FF Change Mii Names In Between Online Races [Vega] https://mariokartwii.com/showthread.php?tid=1576
0x80000A20 thru 0x80000A23 Steal-Mii [Vega] https://mariokartwii.com/showthread.php?tid=1218
0x80000DC0 thru 0x80000DCF FanCy Speedometer [JoshuaMK] https://mariokartwii.com/showthread.php?tid=1492
0x80000DC0 thru 0x80000DCF FanCy HUD [JoshuaMK] https://mariokartwii.com/showthread.php?tid=1496
0x80000DD0 thru 0x80000E7A Memory Editor [Vega] https://mariokartwii.com/showthread.php?tid=1346
0x80000F98 thru 0x80000FBF Clock [Vega] https://mariokartwii.com/showthread.php?tid=1194
0x80000FA0 thru 0x800012FF FanCy Speedometer [JoshuaMK] https://mariokartwii.com/showthread.php?tid=1492
0x80000FA0 thru 0x800012FF FanCy HUD [JoshuaMK] https://mariokartwii.com/showthread.php?tid=1496
0x80000FC8 thru 0x80000FCF Luck Wheelie Bot [Vega] https://mariokartwii.com/showthread.php?tid=1861
0x80000FCC thru 0x80000FCF Rapid Fire/Hop [Vega] https://mariokartwii.com/showthread.php?tid=1862
0x80001100 thru 0x80001197 Vehicle Stats Modifier [JoshuaMK] https://mariokartwii.com/showthread.php?tid=1334
0x8000149E and 0x8000149F Random Track Selection For Offline [Vega] https://mariokartwii.com/showthread.php?tid=1118
0x800014B0 thru 0x80001503 Draw Text To Screen [SwareJonge] https://mariokartwii.com/showthread.php?tid=1714
0x800014B0 thru 0x8000152F Graphical Speedometer [SwareJonge] https://mariokartwii.com/showthread.php?tid=1715
0x80001500 thru 0x80001518 Graphical In-Game Item Spy Online [Vega] https://mariokartwii.com/showthread.php?tid=1116
0x80001500 thru 0x80001518 Graphical In-Game Item Spy Offline [Vega] https://mariokartwii.com/showthread.php?tid=1115
0x80001500 thru 0x8000152F USB Gecko Item Spy -Offline- [NoHack2Win] https://mariokartwii.com/showthread.php?tid=568
0x8000152C thru 0x8000152F Mii Cloner [mdmwii]  https://mariokartwii.com/showthread.php?tid=274
0x80001534 thru 0x80001537 Future Fly (Wii Chuck) [mdmwii]  https://mariokartwii.com/showthread.php?tid=1078
0x80001534 thru 0x80001537 Future Fly (GCN) [mdmwii] https://mariokartwii.com/showthread.php?...6&pid=1923
0x80001534 thru 0x8000153F Stalking [mdmwii]  https://mariokartwii.com/showthread.php?tid=558
0x80001550 thru 0x80001553 AutoPilot [mdmwii] https://mariokartwii.com/showthread.php?tid=768
0x80001550 thru 0x80001553 MarioBOT [mdmwii] https://mariokartwii.com/showthread.php?tid=54
0x80001570 thru 0x8000157B 1st Person Camera View [JoshuaMK, mdmwii] https://mariokartwii.com/showthread.php?tid=1331
0x80001570 thru 0x8000157F 1st Person Camera View [mdmwii] https://mariokartwii.com/showthread.php?tid=597
0x80001574 thru 0x80001577 Random Item From Item Box [Star] https://mariokartwii.com/showthread.php?tid=540
0x80001584 thru 0x80001587 Satellite Camera [mdmwii]  https://mariokartwii.com/showthread.php?tid=616
0x800015A0 thru 0x800015D3 Dump All Opponents' IP Address & Important USER Record Info to NAND [Vega] https://mariokartwii.com/showthread.php?tid=1086
0x80001600 thru 0x8000161B Future Fly (Wii Chuck) [mdmwii] https://mariokartwii.com/showthread.php?tid=1078
0x80001600 thru 0x8000161F Future Fly (GCN) [mdmwii] https://mariokartwii.com/showthread.php?...6&pid=1923
0x80001620 thru 0x80001627 DWC_Authdata NAND File Modifier [Vega] https://mariokartwii.com/showthread.php?tid=1075
0x80001624 thru 0x80001627 Item Wheel Hack [Bully] https://mariokartwii.com/showthread.php?tid=817
0x80001640 thru 0x80001643 Set Wii To Shutdown After Specific Time [Vega] https://mariokartwii.com/showthread.php?tid=1009
0x80001648 thru 0x8000164B Ultimate Region ID Cycler In Between Races [Vega] https://mariokartwii.com/showthread.php?tid=1070
0x80001660 thru 0x80001663 Speed-O-Meter; TTs Only [mdmwii] https://mariokartwii.com/showthread.php?tid=1040
0x80001660 thru 0x80001663 Graphical Speed/MTC/Air/Boost Meter [Vega] https://mariokartwii.com/showthread.php?tid=1112
0x80001660 thru 0x80001663 Graphical Speed-O-Meter [Vega] https://mariokartwii.com/showthread.php?tid=1111
0x80001660 thru 0x80001689 Graphical SpeedBar [Vega] https://mariokartwii.com/showthread.php?tid=1113
0x80001680 thru 0x80001693 Item Stalking [Bully] https://mariokartwii.com/showthread.php?tid=549



Mem80 Non-EVA:
0x80001808 thru 0x8000180B Copy Mii [Anarion] https://mariokartwii.com/showthread.php?tid=620



Mem81:
0x8140179A thru 0x8140179D Custom Laps [JoshuaMK] https://mariokartwii.com/showthread.php?tid=1169
0x81430000 thru 0x81430003 Call POW Function Anytime [Vega] https://mariokartwii.com/showthread.php?tid=1353
0x81430004 thru 0x81430007 Force Shock Damage [1superchip] https://mariokartwii.com/showthread.php?tid=1536
0x81490000 thru 0x81490003 Dump All Opponents' IP Address & Important USER Record Info to NAND [Vega] https://mariokartwii.com/showthread.php?tid=1086
0x814D0000 thru 0x814D0003 Race as Ghost on Ghost Replay [Vega] https://mariokartwii.com/showthread.php?tid=1372
0x81500000 thru 0x8150002F USB Gecko Item Spy Offline [Bully] https://mariokartwii.com/showthread.php?tid=628
0x81500000 thru 0x8150002F Graphical In-Game Item Spy Offline [Vega] https://mariokartwii.com/showthread.php?tid=1115
0x81500000 thru 0x8150002F Graphical Item Warning Offline [Vega] https://mariokartwii.com/showthread.php?tid=1068
0x81500000 thru 0x8150003F Universal Meter [SwareJonge] https://mariokartwii.com/showthread.php?tid=990
0x81510000 thru 0x81510003 Graphical In-Game Item Spy Offline [Vega] https://mariokartwii.com/showthread.php?tid=1115
0x81510000 thru 0x81510003 Graphical Item Warning Offline [Vega] https://mariokartwii.com/showthread.php?tid=1068
0x815F0000 thru 0x815F0003 Graphical SpeedBar [Vega] https://mariokartwii.com/showthread.php?tid=1113
0x815F0000 thru 0x815F0003 Graphical Speed/MTC/Air/Boost Meter [Vega] https://mariokartwii.com/showthread.php?tid=1112
0x815F0000 thru 0x815F0003 Graphical Speed-O-Meter [Vega] https://mariokartwii.com/showthread.php?tid=1111
0x81650000 thru 0x8165000B Stalking -Online- [Bully]  https://mariokartwii.com/showthread.php?tid=767
0x81650000 thru 0x8165002F In Game Item Spy -Online- [Bully] https://mariokartwii.com/showthread.php?tid=561
0x81660000 thru 0x8166000B Stalking -Online- [Bully]  https://mariokartwii.com/showthread.php?tid=767
0x81660000 thru 0x8166002F Graphical In-Game Item Spy Online [Vega] https://mariokartwii.com/showthread.php?tid=1116
0x81660000 thru 0x8166002F Graphical Item Warning Online [Vega] https://mariokartwii.com/showthread.php?tid=1069
0x81660088 thru 0x81660097 Stalking -Online- [Bully]  https://mariokartwii.com/showthread.php?tid=767
0x81670000 thru 0x81670003 Graphical In-Game Item Spy Online [Vega] https://mariokartwii.com/showthread.php?tid=1116
0x81670000 thru 0x81670003 Graphical Item Warning Online [Vega]  https://mariokartwii.com/showthread.php?tid=1069
0x81680192 thru 0x8168019D USB Gecko Item Spy -Online- [Star] https://mariokartwii.com/showthread.php?tid=541
0x81680193 thru 0x8168019E USB Gecko Item Spy Online [Bully] https://mariokartwii.com/showthread.php?tid=629
0x81700000 thru 0x8170002F USB Gecko Grab Everyone's Location Offline [Bully] https://mariokartwii.com/showthread.php?tid=765
0x81700040 thru 0x8170006F USB Gecko Grab All Item Locations Offline [Bully] https://mariokartwii.com/showthread.php?tid=766
0x81700070 thru 0x8170009F USB Gecko Item Spy Offline [Bully] https://mariokartwii.com/showthread.php?tid=627
0x81700F00 Camera Toggle [JoshuaMK] https://mariokartwii.com/showthread.php?tid=1159



Mem93:
0x9370053C and 0x9370053D Luck Wheelie Bot (Wheel/Chuck Only) [Bully] https://mariokartwii.com/showthread.php?tid=239
