All About Bit Rotation

For Advanced ASM Coders

Chapter 1: Overview

If you have been making codes, even for a tiny bit, you have definitely come across an instruction like this...

rlwinm r0, r5, 3, 0, 28

This is a bit rotation instruction. There are 3 major bit rotation instructions...

- rlwnm - Rotate Left Word Then And w/ Mask

- rlwinm - Rotate Left Word Immediate Then And w/ Mask

- rlwimi - Rotate Left Word Immediate Then Mask Insert

These rotation instructions are like PowerPC's Swiss army knife. They can do many tasks such as multiply, divide, shift, rotate, clear, clear then shift, etc. The instructions can also be used to check for negative values, check for alignment, etc. With these instructions being so useful, it's recommended you at least learn a little bit about bit rotation.

Even though there are only 3 major rotating instructions (aka standard mnemonics), there are many simplified mnemonics available.

Chapter 2: Basic Bit Clearing

There are two simplified mnemonics of rlwinm for clearing bits. Clearing a bit simply means setting it to 0.

clrlwi rD, rA, XX

clrrwi rD, rA, XX

clrlwi is Clear Left Word Immediate. This instruction will clear the upper (left-hand) bits (starting at bit 0 going right) of rA, via the XX value (XX value range is 1 thru 31), and the result is placed in rD.

clrrwi is Clear Right Word Immediate. This will clear the lower (right-hand) bits (starting at bit 31 going left) of rA, and place result in rD.

Let's say we have a value in r5 and that value is 0x1AAB000F. This in binary is...

0001 1010 1010 1011 0000 0000 0000 1111

If we execute a Clear Right instruction of "clrrwi r5, r5, 22", we will need to zero-out the lower/right-hand 22 bits of r5. The 10 upper/left-hand bits are left alone. r5's result in binary would now be..

0001 1010 1000 0000 0000 0000 0000 0000

The lower 22 bits (bits 10 thru 31) were the bits that were cleared by the instruction. Convert the result back to hex, and r5 is now 0x1A800000. Notice how the 3rd digit of r5 went from A to 8. This is because bits 8 thru 11 went from 1010 (0xA) to 1000 (0x8).
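
The clearing described above can be modeled in a few lines of Python. This is only a sketch for checking results by hand, not how the hardware does it; the function names simply reuse the mnemonics, and MASK32 is a helper constant made up for this example.

```python
MASK32 = 0xFFFFFFFF  # keeps Python integers inside 32 bits

def clrlwi(value, n):
    """Clear the n upper (left-hand) bits of a 32-bit value."""
    return value & (MASK32 >> n)

def clrrwi(value, n):
    """Clear the n lower (right-hand) bits of a 32-bit value."""
    return value & (MASK32 << n) & MASK32

# The example from this chapter: clrrwi r5, r5, 22 with r5 = 0x1AAB000F
print(hex(clrrwi(0x1AAB000F, 22)))  # 0x1a800000
```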

Chapter 3: Basic Bit Shifting

Clearing was easy enough, so let's dive into bit shifting. Instead of zeroing out bits, we will 'move/shift' them. We can shift them either left or right.

Here are two basic shifting instructions...

slwi rD, rA, XX

srwi rD, rA, XX

The XX value designates how many bits to shift by; XX can be anything from 1 to 31. There are also these shifting instructions...

slw rD, rA, rB

srw rD, rA, rB

It's the same thing as before but the amount to shift is in a source register instead of an immediate value.

Let's go over the execution of a srwi instruction. Let's say r5 has a value of 0x0000FE1F. This in binary is...

0000 0000 0000 0000 1111 1110 0001 1111

Take note of the leftmost '1' (bit 16). If we execute the instruction of "srwi r5, r5, 15", r5's result in binary is...

0000 0000 0000 0000 0000 0000 0000 0001

r5's result is 0x00000001. Notice where that '1' is now (bit 31). It was moved/shifted to the right by 15 bits. For rightward shifts, whatever bits go beyond bit 31 are thrown away. The same rule applies if you were shifting bits toward the left and they went beyond bit 0. Notice bits 0 thru 14. These zero bits were placed in because the bits on the left-hand side were vacated by the rightward shift. You can call the zero bits the "replacement bits". For slw, slwi, srw, and srwi, the replacement bits are always zero.

Let's take a look at an example of shifting left...

Example: "slwi r6, r6, 2" r6 starts off as 0x80000007. Binary form is...

1000 0000 0000 0000 0000 0000 0000 0111

After the instruction is executed, the result in binary is...

0000 0000 0000 0000 0000 0000 0001 1100

Hex result is 0x0000001C. The leftmost 1 (the original bit 0) is thrown away because it was shifted beyond bit 0. The three 1's on the right are shifted accordingly, and the two zero bits on the far right (bits 30 and 31) are the replacement bits.
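
Both shifting examples above can be double-checked with a small Python model. This is a minimal sketch; the function names reuse the mnemonics and MASK32 is a helper constant invented for the example.

```python
MASK32 = 0xFFFFFFFF  # keeps Python integers inside 32 bits

def slwi(value, n):
    """Shift left by n; bits pushed past bit 0 are thrown away."""
    return (value << n) & MASK32

def srwi(value, n):
    """Shift right by n; bits pushed past bit 31 are thrown away."""
    return (value & MASK32) >> n

print(hex(srwi(0x0000FE1F, 15)))  # 0x1
print(hex(slwi(0x80000007, 2)))   # 0x1c
```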

There's still another shifting instruction that is available to use. It is however not a simplified mnemonic of rlwinm, but it should be talked about here. It is a standard-mnemonic instruction.

srawi rD, rA, XX

This is Shift Right Algebraic Word Immediate. It operates just like srwi, however when the register bits are shifted to the right, bit 0's value (before the shift) will be copied and used as the replacement bits for any new bit slots that were opened up on the left-hand side due to the rightward shift. Confused? Let's go over an example:

srawi r0, r3, 3

Pretend r3 starts off as 0xFFFFFFF0.

r3 in binary is...

1111 1111 1111 1111 1111 1111 1111 0000

Take note of bit 0 (the leftmost bit). Its value will be used as the 'copy value' for the new bit slots that will be opened up by the rightward shift.

Now let's say we execute the instruction. r0 (the destination register) in binary form is...

1111 1111 1111 1111 1111 1111 1111 1110

Notice bits 0 thru 2. These three bits are the replacement bits and are copies of bit 0's value (what bit 0 was before the rightward shift).

r0 as a result = 0xFFFFFFFE. r3 itself is unchanged.
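
The copy-bit-0 behaviour can be sketched in Python as follows. It only models the result value (the carry bit that the real instruction also updates is left out), and the names are chosen just for this example.

```python
MASK32 = 0xFFFFFFFF  # keeps Python integers inside 32 bits

def srawi(value, n):
    """Shift right by n, filling the opened slots with copies of bit 0."""
    value &= MASK32
    sign = value >> 31                 # bit 0 in PowerPC numbering
    shifted = value >> n               # logical shift first (zeros come in)
    if sign:
        # Overwrite the n new left-hand slots with 1's
        shifted |= (MASK32 << (32 - n)) & MASK32
    return shifted

print(hex(srawi(0xFFFFFFF0, 3)))  # 0xfffffffe
```

When bit 0 is 0, the result is the same as srwi.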

Chapter 4: Multiply/Divide Conversions

Shifting left/right can be used as a tool for multiplying and dividing whenever the multiplier/divisor is a power of 2. The table below lists powers of 2 up to 256, but larger powers of 2 work the same way. It's suggested you use the shifting instructions instead of the multiply/divide equivalents, as shifts take fewer cycles than multiplies and (especially) divides on Broadway.

Multiply Conversion (XXX is a power of 2):

mulli rD, rA, XXX = slwi rD, rA, BB

Divide (unsigned/logical) Conversion (rB's value is a power of 2):

divwu rD, rA, rB = srwi rD, rA, BB

Divide (signed) Conversion (rB's value is a power of 2):

divw rD, rA, rB = srawi rD, rA, BB

Note: srawi by itself matches divw only when rA is non-negative or divides evenly. divw rounds toward zero, while srawi rounds toward negative infinity (-7 / 2 is -3 with divw, but srawi gives -4). To get divw's result for negative values, follow the srawi with "addze rD, rD".

XXX/rB value = BB value (BB is log2 of the XXX/rB value)

2 = 1

4 = 2

8 = 3

16 = 4

32 = 5

64 = 6

128 = 7

256 = 8
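
The table above is just log2 of the multiplier/divisor, which can be checked with a short Python sketch. The helper names (shift_amount, MASK32) are made up for this example. One thing worth checking while at it: for negative values, an arithmetic shift rounds down while a truncating divide (which is what divw does) rounds toward zero, so the two only agree when the value is non-negative or divides evenly.

```python
MASK32 = 0xFFFFFFFF  # keeps Python integers inside 32 bits

def shift_amount(power_of_two):
    """Return BB for a power-of-2 multiplier/divisor (2 -> 1, 256 -> 8)."""
    if power_of_two < 2 or power_of_two & (power_of_two - 1):
        raise ValueError("not a power of 2 (or less than 2)")
    return power_of_two.bit_length() - 1

def slwi(value, n):
    return (value << n) & MASK32

def srwi(value, n):
    return (value & MASK32) >> n

def srawi(value, n):
    value &= MASK32
    signed = value - (1 << 32) if value & 0x80000000 else value
    return (signed >> n) & MASK32  # Python's >> on a negative int is arithmetic

for factor in (2, 4, 8, 16, 32, 64, 128, 256):
    bb = shift_amount(factor)
    assert slwi(100, bb) == (100 * factor) & MASK32       # multiply
    assert srwi(0x12345678, bb) == 0x12345678 // factor   # unsigned divide

# Signed case: -7 is 0xFFFFFFF9. A truncating divide by 2 gives -3,
# but the arithmetic shift gives -4 (0xFFFFFFFC).
print(hex(srawi(0xFFFFFFF9, 1)))  # 0xfffffffc
```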

Chapter 5: Disassembling rlw Type Standard Mnemonics... (Pt 1: rlwinm and rlwnm)

Dolphin displays all bit-rotating instructions in their standard mnemonic form. So do many other PPC disassemblers. This can be frustrating as some bit clearing/shifting instructions are easier to read in their simplified mnemonic form.

rlwinm r0, r5, 3, 0, 28

At this point, you are probably thinking what the hell the values 3, 0, and 28 are used for.

The 3 is called the 'SH'. It's the number of bits to rotate to the left. Rotating is DIFFERENT than shifting. You still shift to the left, but whatever bits are shifted leftward past bit 0 are cycled back into the lower/right-hand bits instead of being thrown away. Thus, bits are rotated in a counter-clockwise manner. For example, if you rotate a value by 1 bit, bit 0 goes to bit 31, bit 1 goes to bit 0, bit 2 goes to bit 1, etc.

Here's a picture of shifting left vs rotating left - https://mariokartwii.com/pics/tut/shiftRot.gif

The 0 in our rlwinm instruction is the 'MB' (Mask Begin), and the 28 is the 'ME' (Mask End). The MB and ME create a string of 1's, aka a Mask. The MB is the bit where the mask (string of 1's) begins, and the ME is the bit where the mask (string of 1's) ends.

So an MB of 0 with an ME of 28 means that bits 0 thru 28 are all set to 1, and all other bits (29, 30, 31) are set to 0. In binary form our Mask is this...

1111 1111 1111 1111 1111 1111 1111 1000

In hex form, that value is 0xFFFFFFF8. FYI: instead of using decimal values to show MB/ME in the rlwinm instruction, you can also show the Mask in full hex form like this...

rlwinm r0, r5, 3, 0xFFFFFFF8

Let's say we have the value 0x81234567 in r5, and we executed the rlwinm from above. What we do first is ROTATE all the bits of r5 leftward/counter-clockwise by 3.

Before rotation (shown in groups of 4 bits to help you visualize the rotation)~

1000 0001 0010 0011 0100 0101 0110 0111

After rotation~

0000 1001 0001 1010 0010 1011 0011 1100

As you can see bit 0 is now bit 29! So the result in hex is now 0x091A2B3C. We take this current temporary rotated result, and logically AND it with our Mask (0xFFFFFFF8).

The final result (r0) is 0x091A2B38. For the rlwnm instruction (Rotate Left Word Then And w/ Mask) it's the same procedure except instead of using an immediate value for the initial rotation, that value resides in a second source register.
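
The rotate-then-AND procedure above can be reproduced with a small Python model. It is a sketch for checking values by hand; rotl32 and make_mask are helper names invented here, and make_mask also covers the wrap-around case where MB is greater than ME.

```python
MASK32 = 0xFFFFFFFF  # keeps Python integers inside 32 bits

def rotl32(value, sh):
    """Rotate a 32-bit value left by sh bits."""
    sh &= 31
    return ((value << sh) | (value >> (32 - sh))) & MASK32

def make_mask(mb, me):
    """Build the MB..ME mask in PowerPC bit numbering (bit 0 = leftmost)."""
    begin = MASK32 >> mb                    # 1's from bit mb thru bit 31
    end = (MASK32 << (31 - me)) & MASK32    # 1's from bit 0 thru bit me
    return begin & end if mb <= me else begin | end

def rlwinm(rs, sh, mb, me):
    """Rotate left, then AND with the mask."""
    return rotl32(rs, sh) & make_mask(mb, me)

print(hex(make_mask(0, 28)))               # 0xfffffff8
print(hex(rotl32(0x81234567, 3)))          # 0x91a2b3c
print(hex(rlwinm(0x81234567, 3, 0, 28)))   # 0x91a2b38
```

For rlwnm, the same model applies with sh read from a register (only its low 5 bits matter, which is what the "sh &= 31" line mirrors).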

Chapter 6: Disassembling rlw Type Standard Mnemonics... (Pt 2: rlwimi)

The 3rd 'rlw' type standard mnemonic is rlwimi. Rotate Left Word Immediate Then Mask Insert. This still involves rotating the bits but there is no logical AND'ing.

This instruction is going to be a bit difficult to explain as it even took me a while to figure this out and there is no really good information anywhere on the net explaining this instruction in a 'noob sense'. Let's look at an example instruction...

rlwimi r6, r4, 2, 0, 29

So you should already know that we will rotate the contents of r4 by 2 bits. Let's say BEFORE the rotation, r4 is this...

0x4455AA01

In binary that is....

0100 0100 0101 0101 1010 1010 0000 0001

After rotation, the rotated value (a temporary result; r4 itself is not modified) is...

0001 0001 0101 0110 1010 1000 0000 0101

Which in hex is 0x1156A805. We know that the mask (0, 29) is 0xFFFFFFFC. With the rlwimi instruction, there is NO logical ANDing. What we need to do is look at what is CURRENTLY in r6 (the Destination Register).

Let's say r6 is 0x0000FFFF. Binary form is...

0000 0000 0000 0000 1111 1111 1111 1111

With our 0xFFFFFFFC mask, if a bit in the mask is 1, then whatever bits are in our rotated r4 register will replace whatever bits are in r6. If a bit in the mask is 0, then those bits in r6 are preserved (not replaced by bits in r4!)

With all of that being said, that means r6's bits 30 & 31 do NOT change. And the bits 0 thru 29 of our rotated r4 data replaces bits 0 thru 29 of r6. r6 will thus equal (in binary)..

0001 0001 0101 0110 1010 1000 0000 0111

Which in hex is 0x1156A807. Bits 0 thru 29 are the bits that were replaced by the rotated r4 value. Bits 30 & 31 are the bits that were preserved (not replaced).

If you have a rlwimi instruction where the destination register is the same as the source register (let's say r4), then the preserved bits (where the mask is 0) come from the old r4 (BEFORE the initial rotation).
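
The insert behaviour can be modeled in Python like this. It is a hedged sketch: rotl32 and make_mask are helper names invented for the example, and the first argument of rlwimi stands for the value already in the destination register.

```python
MASK32 = 0xFFFFFFFF  # keeps Python integers inside 32 bits

def rotl32(value, sh):
    sh &= 31
    return ((value << sh) | (value >> (32 - sh))) & MASK32

def make_mask(mb, me):
    begin = MASK32 >> mb
    end = (MASK32 << (31 - me)) & MASK32
    return begin & end if mb <= me else begin | end

def rlwimi(ra_old, rs, sh, mb, me):
    """Mask bit 1: take the rotated rs bit. Mask bit 0: keep the old ra bit."""
    m = make_mask(mb, me)
    return (rotl32(rs, sh) & m) | (ra_old & ~m & MASK32)

# rlwimi r6, r4, 2, 0, 29 with r6 = 0x0000FFFF and r4 = 0x4455AA01
print(hex(rlwimi(0x0000FFFF, 0x4455AA01, 2, 0, 29)))  # 0x1156a807
```

When the destination and source are the same register, pass the same starting value for both ra_old and rs.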

Chapter 7: Simplified Mnemonic List

Name = Simplified Mnemonic = Standard Mnemonic

Extract & Left Justify Word Immediate = extlwi rD, rA, n, b (n > 0) = rlwinm rD, rA, b, 0, n - 1

Extract & Right Justify Word Immediate = extrwi rD, rA, n, b (n > 0) = rlwinm rD, rA, b + n, 32 - n, 31

Insert From Left Word Immediate = inslwi rD, rA, n, b (n > 0) = rlwimi rD, rA, 32 - b, b , (b + n) - 1

Insert From Right Word Immediate = insrwi rD, rA, n , b (n > 0) = rlwimi rD, rA, 32 - (b + n), b, (b + n) - 1

Rotate Left Word Immediate = rotlwi rD, rA, n = rlwinm rD, rA, n, 0, 31

Rotate Right Word Immediate = rotrwi rD, rA, n = rlwinm rD, rA, 32 - n, 0, 31

Rotate Left Word = rotlw rD, rA, rB = rlwnm rD, rA, rB, 0, 31

Shift Left Word Immediate = slwi rD, rA, n (n < 32) = rlwinm rD, rA, n, 0, 31 - n

Shift Right Word Immediate = srwi rD, rA, n (n < 32) = rlwinm rD, rA, 32 - n, n, 31

Clear Left Word Immediate = clrlwi rD, rA, n (n < 32) = rlwinm rD, rA, 0, n, 31

Clear Right Word Immediate = clrrwi rD, rA, n (n < 32) = rlwinm rD, rA, 0, 0, 31 - n

Clear Left And Shift Left Word Immediate = clrlslwi rD, rA, b, n (n ≤ b ≤ 31) = rlwinm rD, rA, n, b - n, 31 - n

• Extract — Select a field of n bits starting at bit position b in the source register; left or right justify this field in the target register; clear all other bits of the target register.

• Insert — Select a left- or right-justified field of n bits in the source register; insert this field starting at bit position b of the target register; leave other bits of the target register unchanged.

• Rotate — Rotate the contents of a register right or left n bits without masking.

• Shift — Shift the contents of a register right or left n bits, clearing vacated bits (logical shift).

• Clear — Clear the leftmost or rightmost n bits of a register.

• Clear left and shift left — Clear the leftmost b bits of a register, then shift the register left by n bits. This operation can be used to scale a (known non-negative) array index by the width of an element.
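
The conversion formulas in the list above can be turned into a small lookup for going from a simplified mnemonic to its SH/MB/ME values. This is a sketch only; the function name expand is made up, and reducing SH modulo 32 (for the edge cases where a formula yields exactly 32) is an assumption of this example.

```python
def expand(mnemonic, n, b=0):
    """Return (standard mnemonic, SH, MB, ME) for an immediate-form simplified mnemonic."""
    table = {
        "extlwi":   ("rlwinm", b, 0, n - 1),
        "extrwi":   ("rlwinm", b + n, 32 - n, 31),
        "inslwi":   ("rlwimi", 32 - b, b, b + n - 1),
        "insrwi":   ("rlwimi", 32 - (b + n), b, b + n - 1),
        "rotlwi":   ("rlwinm", n, 0, 31),
        "rotrwi":   ("rlwinm", 32 - n, 0, 31),
        "slwi":     ("rlwinm", n, 0, 31 - n),
        "srwi":     ("rlwinm", 32 - n, n, 31),
        "clrlwi":   ("rlwinm", 0, n, 31),
        "clrrwi":   ("rlwinm", 0, 0, 31 - n),
        "clrlslwi": ("rlwinm", n, b - n, 31 - n),
    }
    op, sh, mb, me = table[mnemonic]
    return op, sh % 32, mb, me  # assumption: SH is kept in the 0..31 range

print(expand("slwi", 3))            # ('rlwinm', 3, 0, 28)
print(expand("extrwi", n=8, b=8))   # ('rlwinm', 16, 24, 31)
```

The first line printed shows that the rlwinm r0, r5, 3, 0, 28 from Chapter 1 is the same thing as slwi r0, r5, 3.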

Chapter 8: Important Note regarding Dolphin

Dolphin displays a hexadecimal mask in parentheses for every rotate instruction in its Code View. However, it is not the typical hex mask value you would use for ANDing (rlwinm) or masking (rlwimi). Dolphin's mask shows which bits of the source register, before the rotation, would make it into the mask.
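
If the description above is read as "the normal mask, rotated back to the right by SH", the displayed value can be sketched in Python as below. This is an interpretation of the note above, not code taken from Dolphin, and the helper names (rotr32, make_mask, pre_rotation_mask) are invented for the example.

```python
MASK32 = 0xFFFFFFFF  # keeps Python integers inside 32 bits

def rotr32(value, sh):
    """Rotate a 32-bit value right by sh bits."""
    sh &= 31
    return ((value >> sh) | (value << (32 - sh))) & MASK32

def make_mask(mb, me):
    begin = MASK32 >> mb
    end = (MASK32 << (31 - me)) & MASK32
    return begin & end if mb <= me else begin | end

def pre_rotation_mask(sh, mb, me):
    """Which source bits (before rotating) end up inside the MB..ME mask."""
    return rotr32(make_mask(mb, me), sh)

# rlwinm r0, r5, 3, 0, 28: the usual mask is 0xFFFFFFF8, but the source
# bits that survive are the lower 29 bits of r5.
print(hex(pre_rotation_mask(3, 0, 28)))  # 0x1fffffff
```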

Chapter 9: More Examples

To clear a specific bit (i.e. clearing just bit 21 of r3)

rlwinm r3, r3, 0, 22, 20 #mask of 0xFFFFFBFF

To overwrite a single bit of the Destination Register using what is in the Source Register (r3's bit 16 value overwrites r12's bit 16 value)

rlwimi r12, r3, 0, 16, 16 #mask of 0x00008000

To flip bits 30 & 31 and re-insert them, regardless of what values they held beforehand (the register in question is r4, with r0 used as a scrap register)

not r0, r4

rlwimi r4, r0, 0, 30, 31

To swap the upper and lower 16 bits (of r3)

rlwinm r3, r3, 16, 0, 31 #mask of 0xFFFFFFFF

simplified mnemonic: rotlwi r3, r3, 16 or rotrwi r3, r3, 16

To take a byte and copy it to another place in the same register (r3 used for example)

0x12345678 --> 0x12347878

rlwimi r3, r3, 8, 0x0000FF00

simplified mnemonic: insrwi r3, r3, 8, 16

To extract a byte and cut-paste it to another place in the register and all other data in the register is wiped

0x12345678 --> 0x00000034

rlwinm r3, r3, 16, 24, 31 #mask of 0x000000FF

simplified mnemonic: extrwi r3, r3, 8, 8
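
The examples in this chapter so far can all be checked with one short Python model of rlwinm and rlwimi. This is a sketch for verifying values by hand; the helper names and the sample register values (other than 0x12345678) are made up for the example.

```python
MASK32 = 0xFFFFFFFF  # keeps Python integers inside 32 bits

def rotl32(value, sh):
    sh &= 31
    return ((value << sh) | (value >> (32 - sh))) & MASK32

def make_mask(mb, me):
    begin = MASK32 >> mb
    end = (MASK32 << (31 - me)) & MASK32
    return begin & end if mb <= me else begin | end

def rlwinm(rs, sh, mb, me):
    return rotl32(rs, sh) & make_mask(mb, me)

def rlwimi(ra_old, rs, sh, mb, me):
    m = make_mask(mb, me)
    return (rotl32(rs, sh) & m) | (ra_old & ~m & MASK32)

# Clear just bit 21: rlwinm r3, r3, 0, 22, 20
assert rlwinm(0xFFFFFFFF, 0, 22, 20) == 0xFFFFFBFF

# Flip bits 30 & 31 of r4: not r0, r4 then rlwimi r4, r0, 0, 30, 31
r4 = 0x12345679
r0 = ~r4 & MASK32
assert rlwimi(r4, r0, 0, 30, 31) == 0x1234567A

# Swap the upper and lower 16 bits: rlwinm r3, r3, 16, 0, 31
assert rlwinm(0x12345678, 16, 0, 31) == 0x56781234

# Copy a byte within the register: rlwimi r3, r3, 8, 16, 23 (mask 0x0000FF00)
assert rlwimi(0x12345678, 0x12345678, 8, 16, 23) == 0x12347878

# Extract a byte: rlwinm r3, r3, 16, 24, 31
assert rlwinm(0x12345678, 16, 24, 31) == 0x00000034
```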

NOTE: PowerPC does not come with clrlw/clrrw (clear left word / clear right word) instructions.

To achieve a 'clear left word' (ex: clrlw rD, rA, rB), execute these two instructions...

slw rD, rA, rB

srw rD, rD, rB #source is the rD from the slw

For a 'clear right word', you do the opposite...

srw rD, rA, rB

slw rD, rD, rB #source is the rD from the srw
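
The two-instruction trick above can be sketched in Python as follows. The names clear_left and clear_right are hypothetical (there are no such instructions, as noted). The model assumes slw/srw look at the low 6 bits of rB and give 0 for shift amounts of 32 or more.

```python
MASK32 = 0xFFFFFFFF  # keeps Python integers inside 32 bits

def slw(value, amount):
    amount &= 0x3F  # low 6 bits of rB
    return 0 if amount > 31 else (value << amount) & MASK32

def srw(value, amount):
    amount &= 0x3F  # low 6 bits of rB
    return 0 if amount > 31 else (value & MASK32) >> amount

def clear_left(value, n):
    """slw then srw: the upper n bits fall off and zeros come back in."""
    return srw(slw(value, n), n)

def clear_right(value, n):
    """srw then slw: the lower n bits fall off and zeros come back in."""
    return slw(srw(value, n), n)

print(hex(clear_left(0x12345678, 8)))   # 0x345678
print(hex(clear_right(0x12345678, 8)))  # 0x12345600
```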

And that's it! Still confused? Don't be afraid to ask questions!

Credits to NXP (AN2491 pdf file) for Chapter 7 info.