Most coders here are familiar with the Broadway Manual. If you've actually read through it, you'll know it's littered with diagram mistakes and typos. Considering it's a preliminary document that was never meant for the public, that's to be expected.

However, I came across a notable mistake (or rather, missing information) in the description of the frsqrte instruction. I've been working on a Broadway PPC Instruction Simulator recently and need it to be as accurate as possible, so I've been combing through the manual.

According to the Broadway Manual (page 426), the frsqrte instruction does this~

1. frB used as input (64-bit input)

2. Reciprocal of Square Root Estimate is performed (within 1/4096 accuracy)

3. frD gets result as 64-bit

What's odd is that there is no mention of how the instruction behaves with respect to the HID2 PSE bit. I've tried reading other chapters, and there's nothing explicit. As an FYI, for every single-float instruction (except for fabs, fneg, and fnabs), the behavior of the instruction varies depending on whether the PSE bit is low or high.

For example, the fmr instruction.

When HID2 PSE is low

1. Entire Value of frB is copied to frD

When HID2 PSE is high

1. ps0 of frB copied to ps0 of frD. ps1 of frD is left UNCHANGED.

Another example (fdivs)

When HID2 PSE is low

1. Entire value of frA is divided by entire value of frB

2. Result placed in frD in 64-bit form, but with single precision of course

When HID2 PSE is high

1. ps0 of frA is divided by ps0 of frB

2. result placed into BOTH ps0 and ps1 of frD

Basically, in all the single-float math-type instructions, when HID2 PSE is high, ps0 is the input and both ps0 and ps1 are the output. As you can see above, the fmr instruction is an oddball, with frD's ps1 left unchanged. The frsp instruction is another oddball: frD's ps1 is left undefined (junk).
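
For a simulator, the PSE-high rules above boil down to a per-instruction policy for ps1 of frD. Here is a minimal Python sketch of that idea. The names `FPR`, `PS1_POLICY`, and `write_result` are made up for illustration (nothing here comes from the manual), it only covers the three instructions discussed so far, and it ignores exceptions, FPSCR updates, and division by zero.

```python
import struct
from dataclasses import dataclass
from typing import Optional


def to_single(x: float) -> float:
    """Round a Python float (a double) to single precision."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


@dataclass
class FPR:
    """Hypothetical model of one FPR as a (ps0, ps1) pair."""
    ps0: float = 0.0
    ps1: Optional[float] = 0.0  # None marks "undefined (junk)"


# What happens to ps1 of frD when HID2 PSE is high, per the text above.
PS1_POLICY = {
    "fmr": "unchanged",
    "fdivs": "duplicate",
    "frsp": "undefined",
}


def write_result(op: str, frd: FPR, result: float) -> None:
    """Write a PSE-high result into frD according to the op's ps1 policy."""
    frd.ps0 = result
    policy = PS1_POLICY[op]
    if policy == "duplicate":
        frd.ps1 = result
    elif policy == "undefined":
        frd.ps1 = None
    # "unchanged": ps1 of frD is deliberately left alone


def fmr(frd: FPR, frb: FPR) -> None:
    write_result("fmr", frd, frb.ps0)


def fdivs(frd: FPR, fra: FPR, frb: FPR) -> None:
    write_result("fdivs", frd, to_single(fra.ps0 / frb.ps0))


def frsp(frd: FPR, frb: FPR) -> None:
    write_result("frsp", frd, to_single(frb.ps0))
```

Keeping the ps1 rule in a table rather than inside each instruction handler makes it easy to add another oddball later without touching the arithmetic.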

The Broadway Manual has zero information about what occurs for frsqrte in regards to HID2 PSE. Nothing. So anyway I did some tests on Real Hardware and this is what occurs~

When HID2 PSE is low

1. Entire 64-bit frB used for input

2. frD gets result as 64-bits

When HID2 PSE is high

1. ps0 of frB is used for input

2. ps0 of frD gets result. ps1 of frD is left UNCHANGED
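
Put as code, the PSE-high behavior I observed looks roughly like this. It's a minimal Python sketch, not the real thing: `FPR` and `frsqrte_pse_high` are illustrative names, and the exact `1/sqrt(x)` is used as a stand-in for the hardware's estimate (real hardware is only accurate to within 1/4096, so the actual bits will differ). It also assumes a positive, finite input.

```python
import math
from dataclasses import dataclass


@dataclass
class FPR:
    """Hypothetical model of one FPR as a (ps0, ps1) pair."""
    ps0: float = 0.0
    ps1: float = 0.0


def frsqrte_pse_high(frd: FPR, frb: FPR) -> None:
    """Model of frsqrte with HID2 PSE high: ps0 in, ps0 out, ps1 untouched."""
    # Stand-in for the hardware estimate (assumption: exact value used here).
    frd.ps0 = 1.0 / math.sqrt(frb.ps0)
    # ps1 of frD is deliberately not written.


f1 = FPR(4.0, 0.0)    # frB
f2 = FPR(1.25, 1.25)  # frD
frsqrte_pse_high(f2, f1)
print(f2)
```

The point of the sketch is only the register-half bookkeeping: ps1 of frD keeps whatever it held before the instruction ran.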

If anyone cares, the code I used for confirmation is at the bottom of this post. f1 gets loaded as 4.0, 0.0 (ps0, ps1). f2 gets loaded as 1.25, 1.25. I then perform a frsqrte (with HID2 PSE high) where f1 is frB and f2 is frD.

f2 resulted as 0x3EFFF400 3FA00000 (~0.4999, 1.25). ps0 of f2 didn't come out as exactly 0.5; this is expected due to the 1/4096 accuracy of the estimate.
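
To double-check that result against the 1/4096 accuracy figure, the two words can be decoded as big-endian single floats. This is a small Python sketch for the arithmetic only; `bits_to_single` is a made-up helper name.

```python
import struct


def bits_to_single(word: int) -> float:
    """Interpret a 32-bit word as a big-endian IEEE 754 single."""
    return struct.unpack(">f", word.to_bytes(4, "big"))[0]


ps0 = bits_to_single(0x3EFFF400)  # frsqrte estimate
ps1 = bits_to_single(0x3FA00000)  # left over from the earlier load

exact = 0.5  # 1/sqrt(4.0)
rel_err = abs(ps0 - exact) / exact

print(ps0, ps1)
print(rel_err, rel_err < 1 / 4096)
```

The relative error works out to 3/16384, which is three quarters of the 1/4096 bound, so the estimate is within the documented accuracy.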

In conclusion, this information is missing from version 1.0 of the Broadway Manual, which is the latest version afaik.

Code:

#C2 Address 807BA164

#Pick up box, see result on screen

.set HID2, 920

.set PSE, 0x2000

#Check PSE bit of HID2

mfspr r3, HID2

andis. r0, r3, PSE

bne+ good_to_go

#Set r5 arg for OSFatal

bl setbadfatal

.asciz "NOTE! PSE of HID2 is low. Try again or try a diff hook address."

.align 2

setbadfatal:

mflr r5

b goodfatal

#Set our single float constants: 4.0 (r3), 0.0 (r4), 1.25 (r5)

good_to_go:

lis r3, 0x4080

ori r3, r3, 0x0000

li r4, 0

lis r5, 0x3FA0

#Write them to scratch memory, then load f1 = (4.0, 0.0) and f2 = (1.25, 1.25)

lis r31, 0x8000

ori r31, r31, 0x1500

stw r3, 0 (r31)

stw r4, 4 (r31)

stw r5, 8 (r31)

stw r5, 0xC (r31)

psq_l f1, 0 (r31), 0, 0

psq_l f2, 0x8 (r31), 0, 0

#Do reciprocal square root estimate (frD = f2, frB = f1)

frsqrte f2, f1

#Store results

psq_st f2, 0x0 (r31), 0, 0

#Load the fpr into GPRs r5 and r6 for sprintf

lwz r5, 0 (r31)

lwz r6, 4 (r31)

#Set r4 arg for sprintf

bl setsprintf

.asciz "%08X %08X"

.align 2

setsprintf:

mflr r4

#Set r3 arg for sprintf

addi r3, r31, 0x40

#Clear cr1 eq bit cuz no floats for sprintf

crclr 6

#Call sprintf

lis r12, 0x8001

ori r12, r12, 0x1A2C

mtctr r12

bctrl

addi r5, r31, 0x40

#Setup OSFatal args

goodfatal:

bl setupfatal

.long 0xFFFFFFFF

.long 0

setupfatal:

mflr r3

addi r4, r3, 4

#Call OSFatal

lis r12, 0x801A

ori r12, r12, 0x4EC4

mtctr r12

bctr
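
A quick sanity check on the PSE test at the top of the code: `andis.` ANDs the register with the 16-bit immediate shifted left by 16, so `PSE = 0x2000` tests the mask `0x20000000`, which is bit 2 in PowerPC's big-endian bit numbering (bit 0 being the most significant). The Python sketch below just spells out that arithmetic. The helper names are made up, and the sample HID2 values in the usage lines are arbitrary, not dumps from real hardware.

```python
PSE_IMMEDIATE = 0x2000  # same value as `.set PSE, 0x2000` in the code above


def andis_mask(imm16: int) -> int:
    """Effective 32-bit mask used by `andis.` for a 16-bit immediate."""
    return (imm16 & 0xFFFF) << 16


def ppc_bit(n: int) -> int:
    """Mask for bit n of a 32-bit register in PowerPC numbering (bit 0 = MSB)."""
    return 1 << (31 - n)


def pse_is_set(hid2: int) -> bool:
    """Hypothetical helper: True when the PSE bit is set in a HID2 value."""
    return (hid2 & andis_mask(PSE_IMMEDIATE)) != 0


print(hex(andis_mask(PSE_IMMEDIATE)), hex(ppc_bit(2)))
print(pse_is_set(0xA0000000), pse_is_set(0x80000000))
```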