Next: Poking Bytes, Previous: Files as IO Spaces, Up: Basic Editing [Contents][Index]
Data stored in modern computers, in both volatile memory and persistent files, is fundamentally a sequence of entities called bytes. Each byte can be addressed by its position in the sequence, starting with zero:
+--------+--------+--------+ ... +--------+
| byte 0 | byte 1 | byte 2 |     | byte N |
+--------+--------+--------+ ... +--------+
Each byte can store a small unsigned integer in the range 0..255.
Therefore, the IO spaces that we edit with poke (like the file foo.o)
can be seen as a sequence of small numbers, as depicted in the figure
above.
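The same idea can be sketched outside of poke. The following Python snippet is only an illustration (it is not poke code, and the four sample values are arbitrary): it treats a handful of bytes as a sequence of small numbers addressed by position.

```python
# Four sample byte values; any values in the range 0..255 would do.
data = bytes([0x7f, 0x45, 0x4c, 0x46])

# Each byte is addressed by its position in the sequence, starting at zero.
for position, value in enumerate(data):
    print(f"byte {position}: {value}")

# A value outside 0..255 does not fit in a byte, so Python rejects it.
try:
    bytes([256])
    fits = True
except ValueError:
    fits = False
print("256 fits in a byte:", fits)
```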
GNU poke provides a command whose purpose is to display the values of
these bytes: dump (1). It is called that because it dumps ranges of
bytes to the terminal, allowing the user to inspect them.
So let’s use our first poke command! Fire up poke, open the file
foo.o as explained above, and execute the dump command:
(poke) dump
76543210  0011 2233 4455 6677 8899 aabb ccdd eeff  0123456789ABCDEF
00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000  .ELF............
00000010: 0100 f700 0102 0000 0000 0000 0000 0000  ................
00000020: 0102 0000 0000 0000 9801 0000 0000 0000  ................
00000030: 0000 0000 4000 0000 0000 4000 0800 0700  ....@.....@.....
00000040: 2564 0a00 0000 0000 0000 0000 0000 0000  %d..............
00000050: b702 0000 0100 0000 1801 0000 0000 0000  ................
00000060: 0000 0000 0000 0000 8510 0000 ffff ffff  ................
00000070: b700 0000 0000 0000 9500 0000 0000 0000  ................
(poke)
What are we looking at?
The first line of the output, starting with 76543210, is a ruler. It
is there to help us visually determine the location (or offset) of
the data.
The rest of the lines show the values of the bytes that are stored in the file, 16 bytes per line. The first column in these data lines shows the offset, in hexadecimal and measured in bytes, at which the row of data starts. For example, the first byte shown in the third data line has offset 0x20 in the file, the second byte has offset 0x21, and so on.

Note how the data rows show the values of the individual bytes in hexadecimal. Generally speaking, when dealing with bytes (and binary data in general) it is useful to write magnitudes in hexadecimal or octal. This is because each digit in these bases corresponds to a small, fixed group of bits (four and three respectively) in the equivalent binary representation. In this case, each pair of hexadecimal digits denotes the value of a single byte (2). For example, the value of the first byte in the third data row is 0x01, the value of the second byte is 0x02, and so on.
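The correspondence between digits and bit groups can be checked with a few lines of Python. This is an illustrative sketch only, not poke code; the value 0x85 is one of the bytes that appear in the dump above.

```python
value = 0x85  # a byte value that appears in the dump above

binary = format(value, "08b")   # eight bits, zero padded
hexa = format(value, "02x")     # two hexadecimal digits
octal = format(value, "03o")    # three octal digits

# Each hexadecimal digit corresponds to exactly four bits.
nibbles = [binary[i:i + 4] for i in range(0, 8, 4)]
hex_from_bits = "".join(format(int(n, 2), "x") for n in nibbles)

# Each octal digit corresponds to three bits; pad to a multiple of three.
padded = binary.zfill(9)
triples = [padded[i:i + 3] for i in range(0, 9, 3)]
octal_from_bits = "".join(format(int(t, 2), "o") for t in triples)

print(binary, nibbles, hex_from_bits)
print(padded, triples, octal_from_bits)
```

Rebuilding the hexadecimal and octal spellings digit by digit from the bit groups gives back the same strings that `format` produces directly, which is why these bases are convenient for reading binary data.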
Using the ruler and the column of offsets, locating bytes in the data
is very easy. Let’s say for example we are interested in the byte at
offset 0x68: we use the first column to quickly find the row starting
at 0x60, and the ruler to find the column marked with 88. Cross
column and row and… voilà! The byte in question has the value 0x85.
The reverse process is just as easy. What is the offset of the first
0x40 in the file? Try it!
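The row-and-column lookup described above is just a division by 16. Here is a small Python sketch of it (an illustration, not poke code), using the data row at 0x60 copied from the dump and the offset 0x68 from the example:

```python
BYTES_PER_ROW = 16

# The data row starting at offset 0x60, copied from the dump above.
row_0x60 = bytes.fromhex("0000 0000 0000 0000 8510 0000 ffff ffff")

offset = 0x68
column = offset % BYTES_PER_ROW    # position of the byte within its row
row_start = offset - column        # offset shown in the first column
ruler_mark = f"{column:x}" * 2     # the ruler labels column 8 as "88"
value = row_0x60[column]

print(f"row {row_start:#x}, ruler column {ruler_mark}, value {value:#04x}")
```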
The section at the right of the output is the ASCII output. It shows
the row of bytes at the left interpreted as ASCII characters.
Non-printable characters are shown as . to avoid scrambling the
terminal, and yes, there is actually a way to customize what character
to use, so they are not confused with real ASCII dot characters
(0x2e) :P In this particular dump we can see that near the beginning
of the file there are three bytes whose values, if interpreted as
ASCII characters, form the string “ELF”. As we shall see, this is
part of the ELF magic number. Again, the ruler is very useful to
locate the byte corresponding to some character in the ASCII section,
or the other way around. What is the value of the byte corresponding
to the F in ELF? Try it!
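To illustrate how such an ASCII column can be produced, here is a minimal Python sketch. It is not how poke is implemented: the function name `ascii_column` is hypothetical, and treating 0x20..0x7e as the printable range is an assumption made for this example.

```python
def ascii_column(row: bytes, placeholder: str = ".") -> str:
    """Render bytes as ASCII text, replacing non-printable values.

    Assumption: bytes in 0x20..0x7e count as printable; poke's own
    rule may differ.
    """
    return "".join(chr(b) if 0x20 <= b <= 0x7e else placeholder for b in row)

# First data row of the dump above.
first_row = bytes.fromhex("7f45 4c46 0201 0100 0000 0000 0000 0000")
print(ascii_column(first_row))

# A different placeholder keeps real dots (0x2e) distinguishable.
print(ascii_column(first_row, placeholder="?"))
```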
Something to notice in the dump output above is that these are not,
by any means, the complete contents of the file foo.o. The .info ios
dot-command informed us in the last section that foo.o contains 920
bytes, of which the dump command only showed us… 0x80 bytes, or 128
bytes in decimal.
dump is certainly capable of showing more (and less) than 128 bytes.
We can ask dump to display some given amount of data by specifying
its size using a command argument. For example:
(poke) dump :size 64#B
76543210  0011 2233 4455 6677 8899 aabb ccdd eeff  0123456789ABCDEF
00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000  .ELF............
00000010: 0100 f700 0102 0000 0000 0000 0000 0000  ................
00000020: 0102 0000 0000 0000 9801 0000 0000 0000  ................
00000030: 0000 0000 4000 0000 0000 4000 0800 0700  ....@.....@.....
The command above asks poke to “dump 64 bytes”. In this example
:size is the name of the argument, and 64#B is the argument’s value.
Again, the suffix #B tells poke we want to dump 64 bytes, not 64
kilobits nor 64 potatoes.
Another interesting aspect of our first dump (ahem) is that the dumped
bytes start from the beginning of the file, i.e. the offset of the
first byte is 0x0. Certainly there should be other areas of the file
with interesting contents for us to inspect. For that purpose, we can
use yet another option, :from:
(poke) dump :size 64#B :from 128#B
76543210  0011 2233 4455 6677 8899 aabb ccdd eeff  0123456789ABCDEF
00000080: 1400 0000 0000 0000 0000 0000 0000 0000  ................
00000090: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000a0: 0000 0000 0300 0100 0000 0000 0000 0000  ................
000000b0: 0000 0000 0000 0000 0000 0000 0300 0300  ................
The command above asks poke to “dump 64 bytes starting at 128 bytes
from the beginning of the file”. Note how the first row of bytes
starts at offset 0x80, i.e. 128 in decimal.
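The effect of a starting offset and a size can be imitated with a short Python function. This is a rough sketch under stated assumptions, not poke’s implementation: the function `dump` below and its parameters `start` and `size` are hypothetical stand-ins for the command and its :from and :size options, the sample data is synthetic rather than the contents of foo.o, and the printable range 0x20..0x7e is an assumption.

```python
def dump(data: bytes, start: int = 0, size: int = 128) -> list:
    """Format `size` bytes of `data` from offset `start`, 16 bytes per row."""
    lines = []
    end = min(start + size, len(data))
    for offset in range(start, end, 16):
        row = data[offset:min(offset + 16, end)]
        # Group the hexadecimal digits in fours, i.e. two bytes per group.
        groups = [row[i:i + 2].hex() for i in range(0, len(row), 2)]
        # Assumed printable range; everything else is shown as a dot.
        text = "".join(chr(b) if 0x20 <= b <= 0x7e else "." for b in row)
        lines.append(f"{offset:08x}: {' '.join(groups):<39}  {text}")
    return lines

sample = bytes(range(256))  # synthetic data, not the contents of foo.o

for line in dump(sample, start=128, size=32):
    print(line)
```

With no arguments, `dump(sample)` falls back to 128 bytes from offset 0, mirroring the default behavior described in the text.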
Passing options to commands is easy and natural, but we may find
ourselves passing the same values again and again to certain command
options. For example, if dump’s default size of 128 bytes is not what
you prefer, because you have a particularly tall monitor, or you are
one of those people using sub-atomic sized fonts, it can be tiresome
and error-prone to pass :size to dump every time you use it.
Fortunately, the default size can be customized by setting a global
variable:
(poke) pk_dump_size = 160#B
This tells poke to set 160 bytes as the new value for the
pk_dump_size variable. This is a global variable that the dump
command uses to determine how much data to show if the user doesn’t
specify an explicit value with the :size option. Many other commands
use the same strategy in order to alter their default behavior, not
just dump.
And now that we are talking about that, it is also cumbersome to have
to set the default size used by dump every time we run poke. But no
problem, just set the variable in a file called .pokerc in your home
directory, like this:
pk_dump_size = 160#B
Every time poke starts, it reads ~/.pokerc and executes the commands contained in it. See pokerc.
The dump command is very flexible, and accepts a lot of options and
customization variables that we won’t be covering in this chapter.
For a complete description of the command, see dump.
(1) Note that this is not a dot-command like .file, .ios or .close:
dump does not start with a dot! We will see later how dot-commands
differ from “normal commands” like dump, but for now, let’s ignore
the distinction.
(2) Do not be fooled by the fact that dump shows the hexadecimal
digits in groups of four: this is just a visual aid and, as we shall
see, it is possible to change the grouping by passing arguments to
dump.