
### 3.7 Big and Little Endians

When talking about whole numbers (integers) we should distinguish between their value (such as 123) and their written form that we would use when writing the number on a piece of paper, such as `123`.

The written form of a number is composed of digits, arranged in a certain order. We all know that the ordering of the digits in the written form of a number is important: if we write `123` we are referring to a different value than if we write `321`. The mathematical reason for this is that, depending on the position it occupies in the written form, each digit contributes a different “weight” to the total value of the number. This is always the case, regardless of the numerical base used to denote the number.

For example, the value of the number 123 (whose written form is `123`) is calculated as `1*10^2+2*10^1+3*10^0`. If we swap the last two digits in the written form of the number, we have `1*10^2+3*10^1+2*10^0`, which results in a different value: `132`. When we consider other numerical bases, the bases in the polynomial change accordingly, but the correspondence between written form and value stands: for example, the value of 0x123 is calculated as `1*16^2+2*16^1+3*16^0`.
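The polynomial above can be sketched outside poke, here in Python. The function name `positional_value` is hypothetical and chosen only for illustration; it takes the digits of a written form, most significant first, and weights each one by a power of the base.

```python
def positional_value(digits, base):
    """Value of a written form whose digits are listed most significant first."""
    n = len(digits)
    # The digit at position i (counting from the left) has weight base**(n - 1 - i).
    return sum(d * base ** (n - 1 - i) for i, d in enumerate(digits))


v123 = positional_value([1, 2, 3], 10)   # 1*10^2 + 2*10^1 + 3*10^0
v132 = positional_value([1, 3, 2], 10)   # last two digits swapped
vhex = positional_value([1, 2, 3], 16)   # same digits, base 16

print(v123, v132, hex(vhex))
```

Swapping two digits changes the weights they receive, which is why the two base 10 results differ even though the same digits are used.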

The “higher” a digit is in the polynomial, the more significant it is, i.e. the more weight it has on the value of the number where it appears. In the written number `123`, for example, the digit 1 is the most significant digit of the number, and the digit 3 is the least significant digit.

This distinction between the written form of a number and its value is very important. Just as certain languages are written right-to-left (Arabic) or even top-to-bottom (traditional Japanese), we could certainly conceive of a language in which the digits of numbers were arranged from right-to-left instead of left-to-right. In such a language the written representation of 123 would be `321`, not `123`. In other words: the least significant digit would come first, not last, in the written form of the number.

Now, when it comes to storing numbers in computers rather than writing them on paper, the role of the paper is played by the computer’s memory, be it ephemeral (like RAM) or persistent (like a spinning hard disk or a Flash memory), which is organized as a sequence of bytes. Since we are composing numbers with bytes, it makes sense to have each byte play the role of a digit in the written form of the bigger number. Since bytes can have values from 0 to 255, the base is 256. But what is the “written form” for our byte-composed numbers?

In the last section we tried to compose bigger integers by concatenating bytes together and interpreting the result. In doing so, we assumed (quite naturally) that in the written form of the resulting integer the bytes are ordered in the same order as they appear in the file, i.e. we assumed that the written form of the number `b1*256^2+b2*256^1+b3*256^0` would be `b1b2b3`, where `b1`, `b2` and `b3` are bytes. In other words, given a written form `b1b2b3`, `b1` would be the most significant byte (digit) and `b3` would be the least significant byte (digit). In our world of IO spaces, the “written form” is the disposition of the bytes in the IO space (file, memory buffer, etc.) being edited.

That interpretation of the written form is exactly what the bit-concatenation operator implements:

```
(poke) dump :from 0#B :size 3#B
76543210  0011 2233 4455 6677 8899 aabb ccdd eeff  0123456789ABCDEF
00000000: 7f45 4c                                  .EL
(poke) var b1 = byte @ 0#B
(poke) var b2 = byte @ 1#B
(poke) var b3 = byte @ 2#B
(poke) b1:::b2:::b3
(uint<24>) 0x7f454c
```
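As a cross-check outside poke, the following Python sketch treats the three bytes from the dump as base 256 digits. It is only an illustration of the arithmetic, not a description of how poke implements `:::`; the variable names mirror the poke session above.

```python
# The three bytes shown in the dump, in file order.
b1, b2, b3 = 0x7F, 0x45, 0x4C

# Each byte is a digit in base 256, most significant first.
value = b1 * 256**2 + b2 * 256**1 + b3 * 256**0

# Multiplying by 256**k is the same as shifting left by 8*k bits,
# so concatenating the bits of the bytes yields the same number.
shifted = (b1 << 16) | (b2 << 8) | b3

print(hex(value), hex(shifted))
```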

However, much like certain human languages whose written form is read from right to left, some computers also read numbers from right to left in their “written form”. In fact, it turns out that most modern computers do it this way. This means that, in these computers, given the written form `b1b2b3` (i.e. given a file where `b1` comes first, followed by `b2` and then `b3`) the most significant byte is `b3` and the least significant byte is `b1`. Therefore, the value of the number would be `b3*256^2+b2*256^1+b1*256^0`.

So, given the written form of a bigger number `b1b2b3` (i.e. some ordering of bytes implied by the file they are stored in) there are at least two ways to interpret them to calculate the value of the number. When the written form is read from left to right, we talk about a big endian interpretation. When the written form is read from right to left, we talk about a little endian interpretation.

Given the first three bytes in `foo.o`, we can determine the value of the integer composed of these three bytes in both interpretations:

```
(poke) b1:::b2:::b3
(uint<24>) 0x7f454c
(poke) b3:::b2:::b1
(uint<24>) 0x4c457f
```
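The same two readings can be reproduced outside poke with Python's built-in `int.from_bytes`, which takes the byte order as an explicit argument. This is a minimal sketch that assumes the three bytes `7f 45 4c` shown earlier; it does not read `foo.o` itself.

```python
# The first three bytes of the file, in the order they are stored.
data = bytes([0x7F, 0x45, 0x4C])

# Big endian: the first byte is the most significant one.
big = int.from_bytes(data, byteorder="big")

# Little endian: the first byte is the least significant one.
little = int.from_bytes(data, byteorder="little")

# Reading the reversed bytes as big endian gives the little endian value.
reversed_big = int.from_bytes(data[::-1], byteorder="big")

print(hex(big), hex(little))
```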

Remember how the type specifier `byte` is just a synonym of `uint<8>`, and how we can use type specifiers like `uint<24>` and `uint<32>` to map bigger integers? When we do that, like in:

```
(poke) uint<24> @ 0#B
(uint<24>) 0x7f454c
```

poke must somehow decide what kind of interpretation to use, i.e. how to read the “written form” of the number. As you can see from the example, poke uses the left-to-right interpretation, or big endian, by default. But you can change it using a new dot-command: `.set endian`:

```
(poke) .set endian little
(poke) uint<24> @ 0#B
(uint<24>) 0x4c457f
```

The currently used interpretation (also called endianness) is shown if you invoke the dot-command without an argument (4):

```
(poke) .set endian
little
```

Different systems use different endianness. Within a given system, it is to be expected that most files will be encoded following the same conventions. Therefore poke provides a way to set the endianness to that of the system it is running on. You do it this way:

```
(poke) .set endian host
```
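If you are curious which endianness your own system uses, one way to check outside poke is Python's `sys.byteorder`, which is either `"little"` or `"big"`. The sketch below, which again assumes the bytes `7f 45 4c`, interprets them with whatever byte order the host reports, so the result depends on the machine it runs on.

```python
import sys

# The same three bytes used throughout this section.
data = bytes([0x7F, 0x45, 0x4C])

# sys.byteorder names the native byte order of the running machine.
host_order = sys.byteorder

# Interpret the bytes using the host convention.
host_value = int.from_bytes(data, byteorder=host_order)

print(host_order, hex(host_value))
```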

### (4)

This also applies to the other `.set` commands.
