AArch64/ARM64 Tutorial

Chapter 2: Basic Vocabulary, Data/Number Types

As mentioned earlier, the 0's and 1's that a CPU can understand are called Binary Numbers. Binary is the simplest/lowest number form. A single Binary digit is known as a Bit. A Bit can only be 0 or 1, nothing else. Before you can understand Binary, you will need to learn Hexadecimal. First off, regular Decimal is something you already know. It's the number system that everyone uses on a regular basis, with numbers such as 3, 16, 2057, 5168430, etc. Decimal uses what is called Base 10, meaning it has ten digits. You start at 0 and go to 9. After 9, you have run out of digits, so you must start a new "base of 10". This new base starts at 10 and goes through 19. After 19, you have to start another new base of 10 at the number 20.

Hexadecimal (Hex for short) uses a Base 16 system. Similar to Decimal, you start with 0. However, once you get to 9, you proceed to A, the first letter of the English Alphabet. You keep proceeding through the Alphabet until you hit F. 0 through F are 16 total digits/values. This is the first "base of 16". After F, you go to 10. 10 through 1F is the next "base of 16". After that would be 20 through 2F, and so on.

Here's a basic Decimal to Hex conversion chart:

Decimal  Hex
 0        0
 1        1
 ..       ..
 9        9
 10       A
 11       B
 12       C
 ..       ..
 15       F
 16       10
 17       11
 ..       ..
 31       1F
 32       20
 33       21
 ..       ..
 159      9F
 160      A0
 161      A1
 ..       ..
 255      FF
 256      100
 257      101
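
For readers who want to see how a row of the chart is worked out, here is a minimal sketch in Python. It is not part of the original tutorial, and the function name `to_hex` is made up for illustration. It assumes a non-negative whole number and repeatedly divides by 16, collecting the remainders as Hex digits.

```python
HEX_DIGITS = "0123456789ABCDEF"

def to_hex(value):
    """Convert a non-negative integer to a Hex string by repeated division by 16."""
    if value == 0:
        return "0"
    digits = []
    while value > 0:
        # The remainder (0..15) is the next Hex digit, starting from the right.
        value, remainder = divmod(value, 16)
        digits.append(HEX_DIGITS[remainder])
    # Digits were collected right to left, so reverse them.
    return "".join(reversed(digits))

# Print a few Decimal/Hex pairs.
for n in (9, 10, 15, 16, 31, 32, 160, 255, 256):
    print(n, to_hex(n))
```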

Once numbers become pretty large, trying to manually convert Decimal to Hex is silly. Instead, use a Decimal-to-Hex converter.

You can also use it to convert in the opposite direction. Knowledge of Hex is required because many CPU values are shown in Hex form within a debugging tool. Also, you will need to learn Binary, and Hex is a prerequisite for that.
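
If you have Python at hand, it can act as a quick converter in both directions. This is only a sketch using Python's built-in `format` and `int`; the helper names `dec_to_hex` and `hex_to_dec` are made up for illustration.

```python
def dec_to_hex(number):
    """Return the Hex digits (uppercase, no prefix) of a non-negative integer."""
    return format(number, "X")

def hex_to_dec(hex_text):
    """Return the Decimal value of a string of Hex digits."""
    return int(hex_text, 16)

print(dec_to_hex(2057))
print(hex_to_dec("1F"))
```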

Speaking of Binary, we can now dive into it!

Every Hexadecimal digit can be represented by 4 consecutive binary values (bits). Like this...

Hex  Binary
 0    0000
 1    0001
 2    0010
 3    0011
 4    0100
 5    0101
 6    0110
 7    0111
 8    1000
 9    1001
 A    1010
 B    1011
 C    1100
 D    1101
 E    1110
 F    1111
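
The Hex-to-Binary chart can also be generated rather than memorized. The following is a small Python sketch (the function name `hex_digit_to_bits` is an assumption for illustration) that turns one Hex digit into its 4 bits.

```python
def hex_digit_to_bits(digit):
    """Return the 4-bit Binary string for a single Hex digit (0-9, A-F)."""
    if len(digit) != 1:
        raise ValueError("expected exactly one Hex digit")
    # int(..., 16) reads the digit as Hex; "04b" prints it as 4 Binary digits.
    return format(int(digit, 16), "04b")

# Rebuild the chart, one Hex digit per line.
for digit in "0123456789ABCDEF":
    print(digit, hex_digit_to_bits(digit))
```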

As you can see, the chart is pretty easy to remember. Trying to convert Decimal directly to Binary is more difficult, which is why Hex was covered first. Now what about going beyond the binary value 1111? Simple, like this:

Hex = Binary
10 = 0001 0000
11 = 0001 0001
12 = 0001 0010
.. ..
19 = 0001 1001
1A = 0001 1010
.. ..
1F = 0001 1111
20 = 0010 0000
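
The same digit-by-digit idea works for Hex values of any length, and in reverse. Here is a hedged Python sketch; `hex_to_binary` and `binary_to_hex` are made-up names, and the input is assumed to be valid Hex digits or bits in groups of four.

```python
def hex_to_binary(hex_text):
    """Convert each Hex digit to 4 bits and separate the groups with spaces."""
    return " ".join(format(int(digit, 16), "04b") for digit in hex_text)

def binary_to_hex(binary_text):
    """Convert space-separated groups of 4 bits back to Hex digits."""
    groups = binary_text.split()
    return "".join(format(int(group, 2), "X") for group in groups)

print(hex_to_binary("1A"))
print(binary_to_hex("0010 0000"))
```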

I've separated the binary values into groups of four bits, since every hex digit can be converted to its 4-digit Binary value. The lefthand group of four bits belongs to the first hex digit, and the righthand group belongs to the second hex digit. Binary is crucial in Assembly for what is known as Logical Operations (a family/type of CPU instructions). Logical Operations are performed on a bit-by-bit basis. But let's not get ahead of ourselves here, we still have a lot of other Basics to cover before going into something such as Logical Operations.

One Final Note about Hex:
Hex values present in an Assembly Source file are always designated via a "0x" prefix. For example, the Hex Value BC needs to be written as 0xBC.
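
Python happens to use the same "0x" prefix for Hex numbers, so it is a handy place to get used to the notation. This is only an illustration in Python, not Assembly syntax.

```python
# With the 0x prefix, the digits are read as Hex.
value = 0xBC
print(value)        # shown in Decimal

# Python's hex() writes the prefix for you (in lowercase).
print(hex(value))

# int() with base 16 accepts the digits with or without the prefix.
same_value = int("0xBC", 16)
print(same_value == int("BC", 16))
```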

We got Hex done, check. Binary done, check. Now let's move on into some key essential Vocabulary every Assembly Developer must know. For starters, the term "null" simply means zero. You will come across that term on a frequent basis in this tutorial. For Assembly Language, special terms are used to describe certain lengths of Binary/Bit values. Here's the list...

8 Bits = Byte (2 Hex digits) Example: 0x44
16 Bits = Halfword (4 Hex digits) Example: 0xB0C8
32 Bits = Word (8 Hex digits) Example: 0xDEDD0020
64 Bits = Double-Word (16 Hex digits)
128 Bits = Quadword (32 Hex digits)
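
The "Hex digits" column in the list above follows from the fact that one Hex digit covers 4 bits. A short Python sketch of that arithmetic follows; the names `SIZES`, `hex_digit_count`, and `largest_value` are made up for illustration.

```python
# Size names from the list above, mapped to their length in bits.
SIZES = {"Byte": 8, "Halfword": 16, "Word": 32, "Double-Word": 64, "Quadword": 128}

def hex_digit_count(bits):
    """One Hex digit covers 4 bits."""
    return bits // 4

def largest_value(bits):
    """Largest value that fits when every bit is 1."""
    return (1 << bits) - 1

for name, bits in SIZES.items():
    print(f"{name:12} {bits:3} bits  {hex_digit_count(bits):2} hex digits  "
          f"largest 0x{largest_value(bits):X}")
```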

We need to discuss some common terminology regarding portions of values within values. For example, let's say we have the following value...

0x8045CD0A

The "8045" portion (lefthand 16 bits) of the word value is known as the upper 16 bits. The "CD0A" portion (righthand 16 bits) is known as the lower 16 bits. This same concept can be applied to any value length. For example, we have the following double-word value...

0xFFFFFFFF80007774

The "FFFFFFFF" portion is known as the upper 32 bits while the "80007774" portion is known as the lower 32 bits. What's important is that you understand that lefthand = upper, and righthand = lower.
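
To make upper and lower concrete, here is a minimal Python sketch that splits a value into its two halves. The function name `split_halves` is made up for illustration, and it assumes you tell it the total length of the value in bits.

```python
def split_halves(value, total_bits):
    """Return (upper, lower) halves of a value that is total_bits long."""
    half = total_bits // 2
    mask = (1 << half) - 1           # e.g. 0xFFFF when half is 16
    upper = (value >> half) & mask   # lefthand half
    lower = value & mask             # righthand half
    return upper, lower

upper, lower = split_halves(0x8045CD0A, 32)
print(f"upper 16 bits: 0x{upper:04X}, lower 16 bits: 0x{lower:04X}")

upper, lower = split_halves(0xFFFFFFFF80007774, 64)
print(f"upper 32 bits: 0x{upper:08X}, lower 32 bits: 0x{lower:08X}")
```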


Final Chapter NOTE: ARMv8 AArch64 Instruction Length
ARMv8 AArch64 uses fixed-length instructions, meaning every instruction is the same size. All instructions are 32 bits (one word) in length.
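
Because every instruction is one word (4 bytes), simple arithmetic tells you how many instructions fit in a block of code and where each one sits. Here is a rough Python sketch; the function names and the starting address 0x1000 are hypothetical and chosen only for illustration.

```python
INSTRUCTION_BYTES = 4  # 32 bits = 1 word = 4 bytes

def instruction_count(code_size_bytes):
    """Number of instructions in a block of code of the given size."""
    if code_size_bytes % INSTRUCTION_BYTES != 0:
        raise ValueError("code size must be a multiple of 4 bytes")
    return code_size_bytes // INSTRUCTION_BYTES

def instruction_address(start_address, index):
    """Address of the instruction at the given position (0 = first)."""
    return start_address + index * INSTRUCTION_BYTES

print(instruction_count(16))
print(hex(instruction_address(0x1000, 3)))  # 0x1000 is a made-up start address
```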

