Applied Pokology

A blog about GNU poke

Understanding Poke methods
[04-05-2020]

by Jose E. Marchesi

Poke struct types can be a bit daunting at first sight. You can find all sorts of things inside them: from fields, variables and functions to constraint expressions, initialization expressions, labels, other type definitions, and methods.

Struct methods can be particularly confusing for the novice poker. In particular, it is important to understand the difference between methods and regular functions defined inside struct types. This article will hopefully clear up the confusion, and will also provide the reader with a better understanding of how poke works internally.

[the Packet]

First we need to define some structure to use as an example. Let's say we are interested in poking Packets, as defined by the Packet Specification 1.2 published by the Packet Foundation (no less).

In a nutshell, each Packet starts with a byte whose value is always 0xab, followed by a byte that defines the size of the payload. The bytes that make up the payload follow, themselves followed by another stream of the same number of bytes with "control" values.

We could translate this description into the following Poke struct type definition:

deftype Packet =
  struct
  {
    byte magic = 0xab;
    byte size;
    byte[size] payload;
    byte[size] control;
  };

See the Poke manual for details on types, initialization values, constraint expressions etc.

There are some details described in the Packet Specification 1.2 that are not covered in this simple definition, but we will attend to them later in this article.

[the process of building structs]

Given the definition of a struct type like Packet, there are only two ways to build a struct value in Poke.

One is to map it from some IO space. This is achieved using the map operator:

(poke) Packet @ 12#B
Packet {
  magic = 0xab,
  size = 2,
  payload = [0x12UB,0x30UB],
  control = [0x1UB,0x1UB]
}

The expression above maps a Packet starting at offset 12 bytes, in the current IO space. See the Poke manual for more details on using the map operator.

The second way to build a struct value is to _construct_ one, specifying values for some, all or none of its fields. It looks like this:

(poke) Packet {size = 2, payload = [1UB,2UB]}
Packet {
  magic = 0xab,
  size = 2,
  payload = [0x1UB,0x2UB],
  control = [0x0UB,0x0UB]
}

In either case, building a struct involves determining the value of all the fields of the struct, one by one. The order in which the struct fields are built is determined by the order of appearance of the fields in the type description.

In our example, the value of magic is determined first, then size, payload and finally control. This is the reason why we can refer to the values of previous fields when defining fields, such as in the size of the payload array above, but not the other way around: by the time payload is mapped or constructed, the value of size has already been mapped or constructed.

What happens behind the curtains is that when poke finds the definition of a struct type, like Packet, it compiles two functions from it: a mapper function, and a constructor function. The mapper function gets as arguments the IO space and the offset from which to map the struct value, whereas the constructor function gets the template specifying the initial values for some, or all of the fields; reasonable default values (like zeroes) are used for fields for which no initial values have been specified.

These functions, mapper and constructor, are invoked to create fresh values when a map operator @ or a struct constructor is used in a Poke program, or at the poke prompt.
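To make the idea more tangible, here is a rough model of the two derived functions written in Python rather than Poke. It is only a sketch: the names packet_mapper and packet_constructor, the use of a bytes object as the IO space and the error types are assumptions made for illustration, not what poke actually generates.

```python
from dataclasses import dataclass

MAGIC = 0xAB  # value the first byte of every Packet must have


@dataclass
class Packet:
    magic: int
    size: int
    payload: bytes
    control: bytes


def packet_mapper(ios: bytes, offset: int) -> Packet:
    """Build a Packet from the bytes of `ios`, starting at `offset`."""
    magic = ios[offset]
    if magic != MAGIC:
        raise ValueError("constraint violated: magic must be 0xab")
    size = ios[offset + 1]
    # size is already known here, so the later fields can depend on it.
    start = offset + 2
    payload = ios[start : start + size]
    control = ios[start + size : start + 2 * size]
    if len(payload) != size or len(control) != size:
        raise EOFError("IO space ends before the Packet does")
    return Packet(magic, size, payload, control)


def packet_constructor(**template) -> Packet:
    """Build a Packet from a template; missing fields get defaults."""
    magic = template.get("magic", MAGIC)
    size = template.get("size", 0)
    # Fields that are not in the template default to zeroes.
    payload = bytes(template.get("payload", bytes(size)))
    control = bytes(template.get("control", bytes(size)))
    return Packet(magic, size, payload, control)
```

In both functions the fields are computed in declaration order, which is why payload and control can use size but not the other way around.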

[variables in struct types]

Fields are not the only entity that can appear in the definition of a struct type.

Suppose that after reading more carefully the Packet Specification 1.2 (which spans several thousand pages) we realize that the field size doesn't really store the number of bytes of the payload and control arrays, like we thought initially. Or not exactly: the Packet Foundation says that if size has the special value 0xff, then the size is zero.

We could of course do something like this:

deftype Packet =
  struct
  {
    byte magic = 0xab;
    byte size;

    byte[size == 0xff ? 0 : size] payload;
    byte[size == 0xff ? 0 : size] control;
  };

However, we can avoid replicating code by using a variable instead:

deftype Packet =
  struct
  {
    byte magic = 0xab;
    byte size;

    defvar real_size = (size == 0xff ? 0 : size);

    byte[real_size] payload;
    byte[real_size] control;
  };

Note how the variable can be used after it gets defined. In the underlying process of mapping or constructing the struct, the variable is incorporated into the lexical environment. Once defined, it can be used in constraint expressions, array sizes, etc. We will see more about this later.

Incidentally, it is of course possible to use global variables as well. For example:

defvar packet_special = 0xff;
deftype Packet =
  struct
  {
    byte magic = 0xab;
    byte size;

    defvar real_size = (size == packet_special ? 0 : size);

    byte[real_size] payload;
    byte[real_size] control;
  };

In this case, the global packet_special gets captured in the lexical environment of the struct type (in reality in the lexical environment of the implicitly created mapper and constructor functions) in a way that if you later modify packet_special the new value will be used when mapping/constructing _new_ values of type Packet. Which is really cool, but let's not get distracted from the main topic... :)
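The same capture behaviour can be modelled with an ordinary Python function that reads a global. This is an analogy only, assuming a hypothetical helper real_size that plays the role of the variable of the same name in the struct type:

```python
packet_special = 0xFF  # stands in for the Poke global of the same name


def real_size(size: int) -> int:
    # The global is looked up every time the function runs,
    # not when the function was defined.
    return 0 if size == packet_special else size


before = real_size(0xFF)  # 0xff is the special value, so this is 0
packet_special = 0xFE     # change the global afterwards
after = real_size(0xFF)   # 0xff is no longer special, so this is 0xff
```

Values computed before the change are not affected; only later calls see the new value of the global.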

[functions in struct types]

Further reading of the Packet Specification 1.2 reveals that each Packet has an additional crc field. The content of this field is derived from both the payload bytes and the control bytes.

But this is no vulgar CRC we are talking about. On the contrary, it is a special function developed by the CRC Foundation in partnership with the Packet Foundation, called superCRC (patented, TM).

Fortunately, the CRC Foundation distributes a pickle supercrc.pk, that provides a calculate_crc function with the following spec:

defun calculate_crc = (byte[] data, byte[] control) int:

So let's use the function like this in our type, after loading the supercrc pickle:

load supercrc;

deftype Packet =
  struct
  {
    byte magic = 0xab;
    byte size;

    defvar real_size = (size == 0xff ? 0 : size);
 
    byte[real_size] payload;
    byte[real_size] control;

    int crc = calculate_crc (payload, control);
  };

However, there is a caveat: the calculation of the CRC may involve division, so the CRC Foundation warns us that the calculate_crc function may raise E_div_by_zero. The Packet Specification 1.2 tells us that in these situations the crc field of the packet should contain zero. If we used the type above, any exception raised by calculate_crc would be propagated by the mapper/constructor:

(poke) Packet @ 12#B
unhandled division by zero exception

A solution is to use a function that takes care of the extra needed logic, wrapping calculate_crc:

load supercrc;

deftype Packet =
  struct
  {
    byte magic = 0xab;
    byte size;

    defvar real_size = (size == 0xff ? 0 : size);

    byte[real_size] payload;
    byte[real_size] control;

    defun corrected_crc = int:
    {
      try return calculate_crc (payload, control);
      catch if E_div_by_zero { return 0; }
    }
         
    int crc = corrected_crc;
  };

Again, note how the function is accessible after its definition. Note as well how fields, variables and other functions can all be used in the function body. Defining variables and functions in struct types is no different from defining them inside other functions or in the top-level environment: the same lexical rules apply.
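The wrapping pattern itself is not specific to Poke. Below is a small Python sketch of it, where calculate_crc is a made-up stand-in (the real superCRC is of course patented and unknown to us) and build_crc_field is a hypothetical name for the part of the mapper that computes the field:

```python
def calculate_crc(data: bytes, control: bytes) -> int:
    """Hypothetical stand-in for superCRC; NOT the real algorithm."""
    # Raises ZeroDivisionError when all the control bytes are zero.
    return sum(data) // sum(control)


def build_crc_field(payload: bytes, control: bytes) -> int:
    # The nested function sees payload and control the same way a
    # function in a struct type sees the fields defined before it.
    def corrected_crc() -> int:
        try:
            return calculate_crc(payload, control)
        except ZeroDivisionError:
            return 0

    return corrected_crc()
```

Only the division error is caught, so any other failure still propagates to whoever is building the value.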

[methods]

At this point you may be thinking something along the lines of "hey, since variables and functions are also members of the struct, I should be able to access them the same way as fields, right?".

So you will want to do:

(poke) defvar p = Packet @ 12#B
(poke) p.real_size
(poke) p.corrected_crc

But sorry, this won't work.

To understand why, think about the struct building process we sketched above. The mapper and constructor functions are derived/compiled from the struct type. You can imagine them to have prototypes like:

Packet_mapper (IOspace, offset) -> Packet value
Packet_constructor (template) -> Packet value

You can also picture the fields, variables and functions in the struct type specification as being defined inside the bodies of Packet_mapper and Packet_constructor, as their contents get mapped/constructed. For example, let's see what the mapper does:

Packet_mapper:

  . Map a byte, put it in a local `magic'.
  . Map a byte, put it in a local `size'.
  . Calculate the real size, put it in a local `real_size'.
  . Map an array of real_size bytes, put it in a local `payload'.
  . Map an array of real_size bytes, put it in a local `control'.
  . Compile a function, put it in a local `corrected_crc'.
  . Map an int, call the function in the local `corrected_crc',
    complain if the values are not the same, otherwise put the
    mapped int in a local `crc'.
  . Build a struct value with the values from the locals `magic',
    `size', `payload', `control' and `crc', and return it.

The pseudo-code for the constructor would be almost identical. Just replace "map" with "construct".

So you see, both the values for the mapped fields and the values for the variables and functions defined inside the struct type end up as locals of the mapping process, but only the values of the fields are actually put in the struct value that is returned in the last step.

This is where methods come into the picture. A method looks very similar to a function, but it is not quite the same thing. Let me show you an example:
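The steps above can be mimicked in Python to see why real_size and corrected_crc are not reachable from the resulting value. This is a sketch under several assumptions: map_packet is a hypothetical name, the IO space is a bytes object, the checksum is a made-up stand-in for superCRC, the crc is read as a 32-bit big-endian signed integer, and the check on magic is left out for brevity.

```python
def map_packet(ios: bytes, offset: int) -> dict:
    magic = ios[offset]
    size = ios[offset + 1]
    real_size = 0 if size == 0xFF else size  # variable: a local only
    start = offset + 2
    payload = ios[start : start + real_size]
    control = ios[start + real_size : start + 2 * real_size]

    def corrected_crc() -> int:              # function: a local only
        # Hypothetical checksum, zero when the division is impossible.
        total = sum(control)
        return sum(payload) // total if total else 0

    crc_at = start + 2 * real_size
    crc = int.from_bytes(ios[crc_at : crc_at + 4], "big", signed=True)
    if crc != corrected_crc():
        raise ValueError("constraint violated: crc does not match")

    # Only the fields make it into the value that is returned.
    return {"magic": magic, "size": size, "payload": payload,
            "control": control, "crc": crc}
```

Once map_packet returns, its locals are gone: the caller holds the five fields and nothing else.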

load supercrc;

deftype Packet =
  struct
  {
    byte magic = 0xab;
    byte size;

    defvar real_size = (size == 0xff ? 0 : size);

    byte[real_size] payload;
    byte[real_size] control;

    defun corrected_crc = int:
    {
      try return calculate_crc (payload, control);
      catch if E_div_by_zero { return 0; }
    }
         
    int crc = corrected_crc;

    method c_crc = int:
    {
      return corrected_crc;
    }
  };

We have added a method c_crc to our Packet struct type, that just returns the corrected superCRC (patented, TM) of a packet. This can be invoked using dot-notation, once a Packet value is mapped/constructed:

(poke) defvar p = Packet @ 12#B
(poke) p.c_crc
0xdeadbeef

Now, the important bit here is that the method returns the corrected crc of a Packet. That is, it actually operates on a Packet value. This Packet value gets implicitly passed as an argument whenever a method is invoked.

We can visualize this with the following "pseudo Poke":

method c_crc = (Packet SELF) int:
{
   return SELF.corrected_crc;
}

Fortunately, poke takes care to recognize when you are referring to fields of this implicit struct value, and does The Right Thing(TM) for you. This includes calling other methods:

method foo = void: { ... }
method bar = void:
{
 [...]
 foo;
}

The corresponding "pseudo-poke" being:

method bar = (Packet SELF) void:
{
 [...]
 SELF.foo ();
}
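Readers who know Python will recognize the idea, with the difference that Python makes the struct argument explicit in the method definition. The following sketch assumes a simplified Packet class with a made-up checksum; it only illustrates how the value travels as an argument, both for field accesses and for calls to other methods.

```python
class Packet:
    def __init__(self, payload: bytes, control: bytes):
        self.payload = payload
        self.control = control

    def c_crc(self) -> int:
        # Hypothetical checksum; the fields are reached through self.
        total = sum(self.control)
        return sum(self.payload) // total if total else 0

    def has_valid_crc(self, expected: int) -> bool:
        # Calling another method passes the same value along.
        return self.c_crc() == expected


p = Packet(bytes([6, 4]), bytes([1, 1]))
implicit = p.c_crc()         # the value is passed behind the scenes
explicit = Packet.c_crc(p)   # the same call with the argument spelled out
```

Both calls yield the same result, since the first is just shorthand for the second.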

It is also possible to define methods that modify the contents of struct fields, no problemo:

defvar packet_special = 0xff;

deftype Packet =
  struct
  {
    byte magic = 0xab;
    byte size;
    [...]

    method set_size = (byte s) void:
    {
      if (s == 0)
        size = packet_special;
      else         
        size = s;
    }
  };

This is what is commonly known as a "setter". Note, incidentally, how a method can also use regular variables. The Poke compiler knows when to generate an access to a normal variable such as packet_special, and when to generate a set to a SELF field.
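For comparison, here is the same setter modelled in Python, where the distinction that poke works out for you has to be written by hand: the global is named directly, while the field is reached through self. The class below is a simplified, hypothetical Packet with only two fields.

```python
PACKET_SPECIAL = 0xFF  # stands in for the Poke global packet_special


class Packet:
    def __init__(self, size: int = 0):
        self.magic = 0xAB
        self.size = size

    def set_size(self, s: int) -> None:
        # Read of a regular variable on the right, store to a field
        # of the implicit struct value on the left.
        self.size = PACKET_SPECIAL if s == 0 else s


p = Packet()
p.set_size(0)
size_after_zero = p.size   # 0xff, the special encoding for zero
p.set_size(3)
size_after_three = p.size  # 3
```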

[a few restrictions]

Given the different nature of the variables, functions and methods, there are a couple of restrictions:

Happy poking! :)

Follow up in the mailing list...

Jose E. Marchesi - http://jemarch.net/