Demonstrating the mishandling of FPRs by Dolphin

Below is a code that prints a result to your screen. Because Dolphin does not handle the FPRs exactly the way real hardware does, the code produces different results on console and on Dolphin. If you are familiar with floating point and paired singles in PPC ASM, read the included source below to see what the code does in technical terms.

The code is PAL only. To trigger it, start a race and pick up a box.

C27BA164 00000013
3FA08000 3C00BB0F
6000B824 9001FFFC
C001FFFC 3FC03FD5
63DE5555 3FE05555
63FF5555 BFC1FFF8
C821FFF8 FC000890
100004A0 D81D1500
80BD1500 80DD1504
48000011 25303858
20253038 58000000
7C8802A6 387D1540
4CC63182 3D808001
618C1A2C 7D8903A6
4E800421 4800000D
FFFFFFFF 000000FF
7C6802A6 38830004
38BD1540 3D80801A
618C4EC4 7D8903A6
4E800420 00000000

The code exploits a simple Dolphin bug: when an fmr instruction executes and the source register contains a double precision float, Dolphin leaves 'leftovers' in ps1 of the destination register.

Whenever the source register of an fmr instruction holds a double precision float, Broadway copies the entire FPR over to the destination register. Because Dolphin models each FPR as two 64-bit segments, this does not quite happen there. Thus we get 'leftovers' in ps1, which we can then manipulate.

This is nothing to be concerned about, as the bug can only break poorly handwritten assembly, so it won't affect how Wii games run on Dolphin. You can also reproduce it with some other instructions, such as lfd.
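The 'leftovers' behaviour described above can be sketched as a toy model in Python. This is only my reading of the description in this post, not Dolphin's actual implementation: the ToyFPR class and the helper functions are hypothetical names, and each instruction is reduced to the one effect that matters here (lfd and fmr only touch ps0 in this model). Running the same instruction sequence as the code reproduces the value Dolphin shows on screen.

```python
import struct

def single_bits_to_double_bits(word):
    """Widen a 32-bit single (raw bits) to the raw bits of the same value as a double."""
    value = struct.unpack(">f", struct.pack(">I", word))[0]
    return struct.unpack(">Q", struct.pack(">d", value))[0]

class ToyFPR:
    """Hypothetical model: one FPR held as two independent 64-bit slots."""
    def __init__(self):
        self.ps0 = 0
        self.ps1 = 0

def lfs(dst, word):
    # lfs: the widened single lands in both slots
    widened = single_bits_to_double_bits(word)
    dst.ps0 = widened
    dst.ps1 = widened

def lfd(dst, dword):
    # lfd in this model: only ps0 is written
    dst.ps0 = dword

def fmr(dst, src):
    # fmr in this model: only ps0 is copied, so ps1 of dst is left over
    dst.ps0 = src.ps0

def ps_merge10(dst, a, b):
    # ps_merge10: new ps0 = a.ps1, new ps1 = b.ps0
    dst.ps0, dst.ps1 = a.ps1, b.ps0

def stfd(src):
    # stfd: the 64 bits of ps0 go to memory
    return src.ps0

def run_sequence():
    f0, f1 = ToyFPR(), ToyFPR()
    lfs(f0, 0xBB0FB824)            # single goes into ps0 and ps1 of f0
    lfd(f1, 0x3FD5555555555555)    # double goes into ps0 of f1
    fmr(f0, f1)                    # ps1 of f0 still holds the lfs value
    ps_merge10(f0, f0, f0)         # swap, bringing the leftover into ps0
    stored = stfd(f0)
    return "%08X %08X" % (stored >> 32, stored & 0xFFFFFFFF)

print(run_sequence())  # BF61F704 80000000
```

The model says nothing about what real hardware does with ps1; it only shows where the Dolphin value comes from.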

#PAL = 807BA164

#BF61F70480000000 #Value being placed into f0
#3FD5555555555555 #Value being placed into f1

#Set r29 as 0x8000 for eva usage
lis r29, 0x8000

#Write the single precision float
lis r0, 0xBB0F
ori r0, r0, 0xB824
stw r0, -0x4 (sp)

#place it in f0 (lfs places it in both ps segments)
lfs f0, -0x4 (sp)

#Write the double precision float; use r30 so we can use stmw
lis r30, 0x3FD5
ori r30, r30, 0x5555
lis r31, 0x5555
ori r31, r31, 0x5555
stmw r30, -0x8 (sp)

#Load the double float into f1
lfd f1, -0x8 (sp)

#On real hardware the entire f1 simply copies over to f0, but on Dolphin (due to holding FPRs as two 64-bit segments) this does not occur: ps0 of f0 becomes the double float while ps1 of f0 is preserved from earlier (what the lfs instruction loaded)
fmr f0, f1

#Swap the ps segments to show that Dolphin uses the leftover ps1 of f0, which it shouldn't.
ps_merge10 f0, f0, f0

#Store the entire fpr to memory
stfd f0, 0x1500 (r29)

#Load the words into r5 and r6 for sprintf args
lwz r5, 0x1500 (r29)
lwz r6, 0x1504 (r29)

#Set r4 arg of sprintf
bl setupsprintf
.string "%08X %08X"
.align 2
setupsprintf:
mflr r4

#Set r3 arg of sprintf
addi r3, r29, 0x1540

#Clear cr1's eq bit to tell sprintf there are no args in the fprs to use
crclr 6

#Call sprintf
lis r12, 0x8001
ori r12, r12, 0x1A2C
mtctr r12
bctrl

#Setup OSFatal args
bl setupfatal
.long 0xFFFFFFFF #text color; white
.long 0x000000FF #bg color; black
setupfatal:
mflr r3
addi r4, r3, 4
addi r5, r29, 0x1540

#Call OSFatal
lis r12, 0x801A
ori r12, r12, 0x4EC4
mtctr r12
bctr

Screen results
What shows up on screen on Dolphin (incorrect) - BF61F704 80000000
What shows up on console (correct) - 3FE00000 00000000
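For reference, the two on-screen values can be decoded with a short Python sketch. The helper names are mine and not part of the code; the script only reinterprets the printed words as floats.

```python
import struct

def double_from_words(hi, lo):
    """Reinterpret two big-endian 32-bit words as one double."""
    return struct.unpack(">d", struct.pack(">II", hi, lo))[0]

def single_from_word(word):
    """Reinterpret one big-endian 32-bit word as a single."""
    return struct.unpack(">f", struct.pack(">I", word))[0]

console = double_from_words(0x3FE00000, 0x00000000)
dolphin = double_from_words(0xBF61F704, 0x80000000)

print(console)                                  # 0.5
print(dolphin == single_from_word(0xBB0FB824))  # True
```

So the Dolphin value is exactly the single that lfs loaded earlier, widened to a double, while the console value is the double 0.5.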

I have no idea why the console shows 0.5; I was expecting some crazy value or NaN, since a double float is being ps swapped. If anybody knows why, please let me know. I couldn't find anything with a quick glance through the Broadway manual. It's definitely some sort of deliberate mechanism in the Floating Point Unit (FPU).
Has this been reported to the Dolphin developers already? Who knows, there might still be obscure games using code like that. Or is it 100% guaranteed that no compiler will emit code like this?
This exact bug has not been reported, afaik, but the devs have been aware that Dolphin's quirky handling of the FPRs does not 100% replicate the Wii's hardware. I cannot think of a way a compiler could produce botched/improper assembly (like the source used in this code) that triggers the bug, which, in my opinion, is why the devs have never gotten rid of the FPR quirks.
