Mistake in the Broadway Manual regarding frsqrte

Vega
04-14-2024, 11:58 PM (This post was last modified: 04-15-2024, 10:00 PM by Vega.)

Most coders here are familiar with the Broadway Manual. If you've actually read through it, you'll have found it littered with diagram mistakes and typos. Considering it's a preliminary file that was never meant for the public, this is sort of expected. However, I came across a decent mistake (or rather, missing information) in the description of the frsqrte instruction.

I've been working on a Broadway PPC Instruction Simulator recently and have to make sure it's as accurate as possible, so I've been combing through the manual at times.

According to the Broadway Manual (page 426), the frsqrte instruction does this~

1. frB used as input (64-bit input)
2. Reciprocal of Square Root Estimate is performed (within 1/4096 accuracy)
3. frD gets result as 64-bit

What's odd is that there is no mention of how the instruction operates in regard to the HID2 PSE bit. I've tried reading other chapters, and there's nothing explicit.

As an fyi, for every single-float instruction (except for fabs, fneg, and fnabs), the operation of said instruction varies depending on whether the PSE bit is low or high.

For example, the fmr instruction.

When HID2 PSE is low
1. Entire value of frB is copied to frD

When HID2 PSE is high
1. ps0 of frB is copied to ps0 of frD. ps1 of frD is left UNCHANGED.

Another example (fdivs)

When HID2 PSE is low
1. Entire value of frA is divided by entire value of frB
2. Result placed in frD in 64-bit form, but with single precision ofc

When HID2 PSE is high
1. ps0 of frA is divided by ps0 of frB
2. Result placed into BOTH ps0 and ps1 of frD

Basically, in all the single-float math-type instructions, when HID2 PSE is high, ps0 is input and ps0 + ps1 is output. As you can see above, the fmr instruction is an odd ball, with frD ps1 being unchanged.
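For anyone who finds the PSE-high rules easier to follow as code, here is a rough Python sketch of the two examples above. It is only an illustration, not how any real simulator is written: the names PairedFPR, to_single, fmr_pse_high and fdivs_pse_high are made up for this post, only the PSE-high case is modeled, and special cases (divide by zero, NaN, FPSCR flags) are ignored.

```python
import struct
from dataclasses import dataclass


def to_single(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


@dataclass
class PairedFPR:
    """Hypothetical model of one FPR while HID2 PSE is high."""
    ps0: float
    ps1: float


def fmr_pse_high(frD: PairedFPR, frB: PairedFPR) -> None:
    # Only ps0 is copied; ps1 of frD is deliberately left unchanged.
    frD.ps0 = frB.ps0


def fdivs_pse_high(frD: PairedFPR, frA: PairedFPR, frB: PairedFPR) -> None:
    # ps0 is the only input; the single-precision result goes to BOTH slots.
    # No exception or divide-by-zero handling in this sketch.
    result = to_single(frA.ps0 / frB.ps0)
    frD.ps0 = result
    frD.ps1 = result


# fmr: destination ps1 survives the move
moved = PairedFPR(1.25, 1.25)
fmr_pse_high(moved, PairedFPR(4.0, 0.0))

# fdivs: ps1 of the inputs is ignored, result is duplicated
quotient = PairedFPR(7.0, 7.0)
fdivs_pse_high(quotient, PairedFPR(4.0, 0.0), PairedFPR(2.0, 9.0))

print(moved, quotient)
```

After running this, moved holds (4.0, 1.25) and quotient holds (2.0, 2.0), which matches the fmr and fdivs descriptions above.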
The frsp instruction is another odd ball, in that frD's ps1 is left undefined (junk).

The Broadway Manual has zero information about what occurs for frsqrte in regard to HID2 PSE. Nothing. So anyway, I did some tests on real hardware and this is what occurs~

When HID2 PSE is low
1. Entire 64-bit frB used for input
2. frD gets result as 64-bit

When HID2 PSE is high
1. ps0 of frB is used for input
2. ps0 of frD gets result. ps1 of frD is left UNCHANGED

If anyone cares, the code I used for confirmation is below. f1 gets loaded as 4.0, 0.0. f2 gets loaded as 1.25, 1.25. I then perform a frsqrte (with HID2 PSE high) where f1 is frB and f2 is frD. f2 resulted as 0x3EFFF400 3FA00000 (~0.4999, 1.25). ps0 of f2 didn't come out as exactly 0.5; this is expected due to the 1/4096 accuracy.

In conclusion, this information is missing from the 1.0 version of the Broadway Manual, which is the latest version afaik.

Code:
```
#C2 Address 807BA164
#Pick up box, see result on screen

.set HID2, 920
.set PSE, 0x2000

#Check PSE bit of HID2 (0x20000000, tested via the upper halfword)
mfspr r3, HID2
andis. r0, r3, PSE
bne+ good_to_go

#PSE is low; set r5 arg (message) for OSFatal
bl setbadfatal
.asciz "NOTE! PSE of HID2 is low. Try again or try a diff hook address."
.align 2
setbadfatal:
mflr r5
b goodfatal

#Set our single float constants: r3 = 4.0, r4 = 0.0, r5 = 1.25
good_to_go:
lis r3, 0x4080
ori r3, r3, 0x0000
li r4, 0
lis r5, 0x3FA0

#Load f1 as 4.0, 0.0 (ps1 null) and f2 as 1.25, 1.25
lis r31, 0x8000
ori r31, r31, 0x1500
stw r3, 0 (r31)
stw r4, 4 (r31)
stw r5, 8 (r31)
stw r5, 0xC (r31)
psq_l f1, 0 (r31), 0, 0
psq_l f2, 0x8 (r31), 0, 0

#Do the reciprocal square root estimate (frD = f2, frB = f1)
frsqrte f2, f1

#Store results (ps0 and ps1 of f2)
psq_st f2, 0x0 (r31), 0, 0

#Load both halves of the fpr into GPRs r5 and r6 for sprintf
lwz r5, 0 (r31)
lwz r6, 4 (r31)

#Set r4 arg (format string) for sprintf
bl setsprintf
.asciz "%08X %08X"
.align 2
setsprintf:
mflr r4

#Set r3 arg (output buffer) for sprintf
addi r3, r31, 0x40

#Clear cr1 eq bit cuz no floats for sprintf
crclr 6

#Call sprintf
lis r12, 0x8001
ori r12, r12, 0x1A2C
mtctr r12
bctrl

#r5 arg (message) for OSFatal is the sprintf output
addi r5, r31, 0x40

#Setup OSFatal args (r3 and r4 point at the two words below)
goodfatal:
bl setupfatal
.long 0xFFFFFFFF
.long 0
setupfatal:
mflr r3
addi r4, r3, 4

#Call OSFatal
lis r12, 0x801A
ori r12, r12, 0x4EC4
mtctr r12
bctr
```
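To double-check the numbers from the test, here is a small Python sketch that decodes the two result words and compares ps0 against the exact answer. The helper word_to_single is just a name made up for this example, and reading "within 1/4096 accuracy" as a relative-error bound is my assumption about what the manual means.

```python
import struct


def word_to_single(word: int) -> float:
    """Reinterpret a 32-bit big-endian word as an IEEE-754 single."""
    return struct.unpack(">f", struct.pack(">I", word))[0]


# Input word stored by the test code (ps0 of f1)
input_ps0 = word_to_single(0x40800000)

# The two words OSFatal displayed for f2 after frsqrte
ps0 = word_to_single(0x3EFFF400)
ps1 = word_to_single(0x3FA00000)

# Exact reciprocal square root of the input
exact = 1.0 / input_ps0 ** 0.5

# Assumption: the manual's 1/4096 figure is a relative-error bound
rel_error = abs(ps0 - exact) / exact
within_bound = rel_error < 1.0 / 4096.0

print(input_ps0, ps0, ps1, rel_error, within_bound)
```

ps0 decodes to 0.499908447265625 and ps1 to 1.25. The relative error on ps0 works out to 3/16384 (about 0.000183), which is under 1/4096 (about 0.000244), so the hardware result is consistent with the stated accuracy.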
