    _____
---'   __\_______
            ______)   Array boundaries and closures in Poke
            __)
           __)
---._______)
Jose E. Marchesi
October 03, 2019
Poke arrays are rather peculiar. One of their seemingly bizarre
characteristics is the fact that the expressions calculating their
boundaries (when they are bounded) evaluate in their own lexical
environment, which is captured. In other words: the expressions
denoting the boundaries of Poke arrays form closures. Also, the
way they evaluate may be surprising. This is not capricious.
There are three different kinds of array types in Poke.
"Unbounded" arrays have no explicit boundaries. Examples are 'int[]'
or 'Elf64_Shdr[]'. Arrays can be bounded by _number of elements_,
specifying a Poke expression that evaluates to an integer value. For
example, 'int[2]'. Finally, arrays can be bounded by _size_,
specifying a Poke expression that evaluates to an offset value. For
example, 'int[8#B]'.
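To make the two kinds of bounds concrete, here is a rough analogy in
Python (this is not Poke code). It assumes that 'int' is 32 bits wide,
as it is in Poke. The helper names are made up for this sketch, and
treating a size that is not a whole number of elements as an error is
an assumption of the sketch, not a statement about Poke.

```python
INT_BYTES = 4  # Poke's 'int' is a 32-bit integer


def elements_bounded_by_count(count):
    """Model of int[count]: the bound is the number of elements itself."""
    return count


def elements_bounded_by_size(size_bytes, elem_bytes=INT_BYTES):
    """Model of int[size#B]: the bound is a total size in bytes."""
    count, rest = divmod(size_bytes, elem_bytes)
    if rest != 0:
        # Assumption of this sketch only: reject sizes that do not
        # hold a whole number of elements.
        raise ValueError("size is not a multiple of the element size")
    return count


print(elements_bounded_by_count(2))  # models int[2]
print(elements_bounded_by_size(8))   # models int[8#B]
```

Under these assumptions both bounds describe an array of two integers:
one says so directly, the other says so through the space it occupies.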
When an array type is bounded, be it by number of elements or by size,
the expression indicating the boundary doesn't need to be constant and
it can involve variables. For example, consider the following type
definition:
var N = 2
type List = int[N*2]
Let's map a 'List' at some offset:
(poke) List @ 0#B
[0x746f6f72,0x303a783a,0x723a303a,0x3a746f6f]
As expected, we get an array of four integers. Very good, obviously
the boundary expression 'N*2' got evaluated when defining the type
'List', and the result of the evaluation was '4', right? Typical
semantics like in my garden variety programming language... right?
Right?!?
Well, not really. Let's modify the value of 'N' and map a 'List'
again...
(poke) N = 1
(poke) List @ 0#B
[0x746f6f72,0x303a783a]
Yes, the boundary of the array type changed... come on, this is Poke,
were you _really_ expecting something typical? :)
What happens is that at type definition time the lexical environment
is captured and a closure is created. The body of the closure is the
expression. Every time the type is referred to, the closure is
re-evaluated and a new value is computed.
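For readers more at home in a garden variety language, the mechanism
can be imitated in Python. This is only an analogy, not how Poke is
implemented. The 'ArrayType' class and the 'env' dictionary are
invented for the sketch. The sample bytes are also an assumption: the
article does not say what is mapped at offset 0, but the values shown
are consistent with the ASCII text "root:x:0:0:root:" read as
little-endian 32-bit integers.

```python
import struct

# Assumed sample data, inferred from the values printed in the article.
DATA = b"root:x:0:0:root:"


class ArrayType:
    """Array of little-endian 32-bit ints whose bound is a closure."""

    def __init__(self, bound):
        # 'bound' is a callable. It is stored, not called, so nothing
        # is evaluated at type definition time.
        self.bound = bound

    def map_at(self, data, offset):
        # The bound is re-evaluated on every use of the type.
        count = self.bound()
        return list(struct.unpack_from(f"<{count}i", data, offset))


def demo():
    env = {"N": 2}                           # like: var N = 2
    List = ArrayType(lambda: env["N"] * 2)   # like: type List = int[N*2]
    first = List.map_at(DATA, 0)             # like: List @ 0#B
    env["N"] = 1                             # like: N = 1
    second = List.map_at(DATA, 0)            # like: List @ 0#B, again
    return first, second


first, second = demo()
print([hex(v) for v in first])
print([hex(v) for v in second])
```

The lambda captures 'env' rather than the number 4, so the second
mapping sees the new value of N and yields two elements instead of four.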
Consequently, if the value of a variable referred to in the expression
changes, like in our example, the type itself gets updated
automagically. Very nice but, why is Poke designed like this? Just
to impress the cat? Nope.
In binary formats, and also in protocols, the size of some given data
is often defined in terms of some other data that should be decoded
first. Consider for example the following definition of a 'Packet':
type Packet =
  struct
  {
    byte size;
    byte[size] payload;
  };
Each packet contains an 8-bit integer specifying the size of the
payload transported in the packet. The payload, a sequence of 'size'
bytes, follows.
In struct types like the above, the boundaries of arrays depend on
fields that have been decoded before and that exist, like variables,
in the lexical scope captured by the struct type definition (yes,
these are also closures, but that's for another article.) This
absolutely depends on having the array types evaluate their bounding
expressions when the type is used, and not at type definition time.
To show this property in action, let's play a bit:
(poke) var data = byte[4] @ 0#B
(poke) data[0] = 2
(poke) data[1] = 3
(poke) data[2] = 4
(poke) data[3] = 5
(poke) dump
76543210 0011 2233 4455 6677 8899 aabb ccdd eeff
00000000: 0203 0405 0000 0000 0000 0000 0000 0000
00000010: 0000 0000 0000 0000 0000 0000 0000 0000
(poke) var p1 = Packet @ 0#B
(poke) var p2 = Packet @ 1#B
(poke) p1
Packet {size=0x2UB,payload=[0x3UB,0x4UB]}
(poke) p2
Packet {size=0x3UB,payload=[0x4UB,0x5UB,0x0UB]}
Now, let's change the data and see how the sizes of the payloads are
adjusted accordingly:
(poke) data[0] = 1
(poke) data[1] = 0
(poke) p1
Packet {size=0x1UB,payload=[0x0UB]}
(poke) p2
Packet {size=0x0UB,payload=[]}
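The session above can be mimicked with a small Python sketch. It is an
analogy, not Poke: 'map_packet' is a made-up helper, and where Poke
keeps 'p1' and 'p2' mapped to the underlying data, the sketch simply
decodes again after each change. Raising an error when the payload
runs past the end of the buffer is an assumption of the sketch.

```python
def map_packet(data, offset):
    """Decode a Packet: one size byte followed by 'size' payload bytes."""
    size = data[offset]                   # byte size;
    start = offset + 1
    end = start + size                    # the bound depends on decoded data
    if end > len(data):
        # Assumption of this sketch: refuse to read past the buffer.
        raise ValueError("payload extends past the end of the data")
    payload = list(data[start:end])       # byte[size] payload;
    return {"size": size, "payload": payload}


data = bytearray(32)                      # 32 zero bytes, as in the dump
data[0:4] = bytes([2, 3, 4, 5])

print(map_packet(data, 0))                # like: Packet @ 0#B
print(map_packet(data, 1))                # like: Packet @ 1#B

data[0] = 1
data[1] = 0

print(map_packet(data, 0))
print(map_packet(data, 1))
```

The length of the payload is never stored in the type. It is computed
from the 'size' byte each time a packet is decoded.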
So, as we have seen, Poke's way of handling boundaries in array types
allows data structures to adjust to the particular data they contain,
as is so usual in binary formats. This is an important feature that
gives Poke part of its feel and magic.
Happy poking! :)