        _____
    ---'   __\_______
              ______)
              __)       Release notes for poke 2.0
              __)
    ---._______)
I am happy to announce a new major release of GNU poke, version 2.0.
This release is the result of a year of development. A lot of things
have changed and improved with respect to the 1.x series; we have
fixed many bugs and added quite a lot of new exciting and useful
features. See below for a description of many of them.
We have had lots of fun and learned quite a lot in the process; we
really hope you will have at least half of that fun using this tool!
The tarball poke-2.0.tar.gz is now available at
https://ftp.gnu.org/gnu/poke/poke-2.0.tar.gz.
GNU poke (http://www.jemarch.net/poke) is an interactive, extensible
editor for binary data. Not limited to editing basic entities such
as bits and bytes, it provides a full-fledged procedural,
interactive programming language designed to describe data
structures and to operate on them.
Thanks to the people who contributed with code and/or documentation to
this release. In certain but no significant order they are:
Mohammad-Reza Nabipoor
Luca Saiu
Bruno Haible
Egeyar Bagcioglu
David Faust
Guillermo E. Martinez
Konstantinos Chasialis
Matt Ihlenfield
Thomas Weißschuh
Sergei Trofimovich
Fangrui Song
Indu Bhagat
Jordan Yelloz
Morten Linderud
Sergio Durigan Junior
As always, thank you all!
User interface updates
~~~~~~~~~~~~~~~~~~~~~~
- Types can now be redefined at the prompt. This allowed us to
change `load' to always load the requested file/module.
- The poke command now accepts two additional command-line options in
order to select which default style to use: --style-dark and
--style-bright. The first, which is the default, works well with
dark backgrounds. The second works well with bright backgrounds.
- Invoking the .set command with no arguments now lists all the global
settings along with their values. It also emits buttons (terminal
hyperlinks) to easily toggle boolean settings.
- The .file dot-command now supports a /c flag, that tells poke it
must create an empty file with the specified name if it doesn't
already exist.
- The :size argument of the dump command is now rounded up to the next
byte, not truncated down to the previous byte.
- A new "sub" IO space has been implemented in poke. It allows
creating IO spaces that are like narrowed versions of some other IO
space. These "sub" IO spaces are created using the new dot-command
.sub:
,----
| (poke) .mem scratch
| (poke) .info ios
| Id Type Mode Bias Size Name
| * #0 MEMORY rw 0x00000000#B 0x00001000#B *scratch*
| (poke) .sub #0, 2, 16, lala
| (poke) .info ios
| Id Type Mode Bias Size Name
| * #1 SUB rw 0x00000002#B 0x00000010#B sub://0/0x0/0x10/lala
| #0 MEMORY rw 0x00000000#B 0x00001000#B *scratch*
`----
At this point, offset 0#B in the IO space #1 refers to the data at
offset 2#B in the *scratch* IO space.
- GNU poke can now poke at the memory of a running process using the
new process IOS. This can be done using the new dot-command .proc:
,----
| (poke) .proc 30244
| (poke) .info ios
| Id Type Mode Bias Size Name
| * #0 PROC rw 0x00000000#B 0xffffffffffffffff#B pid://30244
`----
The command above has opened a new IO space (with id #0) to poke at
the memory of the running process with PID 30244.
A flag /m can be passed to .proc to indicate we want poke to create
additional sub-spaces providing access to the mapped VM ranges of
the process:
,----
| (poke) .proc/m 30244
| (poke) .info ios
| Id Type Mode Bias Size Name
| #9 SUB r 0x00000000#B 0x00001000#B sub://0/0xffffffffff600000/0x1000/[vsyscall]
| #8 SUB r 0x00000000#B 0x00002000#B sub://0/0x7ffe82db2000/0x2000/[vdso]
| #7 SUB r 0x00000000#B 0x00002000#B sub://0/0x7ffe82db0000/0x2000/[vvar]
| #6 SUB rw 0x00000000#B 0x00021000#B sub://0/0x7ffe82c2f000/0x21000/[stack]
| [...]
| * #0 PROC rw 0x00000000#B 0xffffffffffffffff#B pid://30244
`----
See below for the handler syntax that open now accepts in order to
create process IO spaces from Poke programs. Note that support for
process IO spaces is only implemented on GNU/Linux systems.
- The dump command now shows ?? marks to denote bytes that are not
readable, for whatever reason. This happens for example when we try
to access some non-mapped area of the VM space of a process:
,----
| (poke) .proc/m 30244
| (poke) dump :from 0#B :size 16#B
| 76543210 0011 2233 4455 6677 8899 aabb ccdd eeff 0123456789ABCDEF
| 00000000: ???? ???? ???? ???? ???? ???? ???? ???? ................
`----
- The dump command now emits hyperlinks for the shown bytes, if
requested. Clicking on them will insert the offset where the byte
resides in the poke prompt.
- The dump command now remembers the last used offset per IO space.
In poke 1.x it just remembered the last globally used offset, which
was very confusing.
- The .info ios dot-command now emits [close] buttons if hyperlinks
are enabled. Clicking on them closes the referred IO space.
- The .info ios dot-command now shows the value of the "bias" for each
open IO space. See below for more information on this bias.
- E_constraint exceptions raised when mapping now include some useful
location information:
,----
| (poke) Elf64_Ehdr @ 1#B
| unhandled constraint violation exception
| constraint expression failed for field Elf_Ident.ei_mag
`----
- The .info types, .info variables and .info functions dot-commands
now accept an optional regular expression. Example:
,----
| (poke) load elf
| (poke) .info types Elf64_
| Name Declared at
| Elf64_Ehdr elf-64.pk:184
| Elf64_Off elf-64.pk:26
| Elf64_SectionFlags elf-64.pk:130
| Elf64_Shdr elf-64.pk:152
| Elf64_RelInfo elf-64.pk:36
| [...]
`----
When listing types, the names in the first column are emitted with a
nice terminal hyperlink. Clicking on them will result in executing
.info type NAME.
- New dot-command .info type NAME, that prints out a nice and
informative description of the type with the given name. Example:
,----
| (poke) .info type Elf64_File
| Class: struct
| Name: "Elf64_File"
| Complete: no
| Fields:
| ehdr
| shdr
| phdr
| Methods:
| get_section_name
| get_symbol_name
| get_sections_by_name
| get_sections_by_type
| section_name_p
| get_string
| get_group_signature
| get_group_signatures
| get_section_group
`----
Poke Language updates
~~~~~~~~~~~~~~~~~~~~~
- Integral structs are struct values that are stored like integers.
This is a simple real-life example:
,----
| type Elf_Sym_Info =
| struct uint<8>
| {
| uint<4> st_bind;
| uint<4> st_type;
| };
`----
These structs can be used pretty much like regular structs,
accessing their fields normally. In poke 1.x it was already
possible to "integrate" them, i.e. operating with their "integer"
value by using either an explicit cast:
,----
| (poke) Elf_Sym_Info { st_bind = 1 } as uint<8>
| 0x10UB
`----
or automatically, using them in a context where an integral value is
expected:
,----
| (poke) Elf_Sym_Info { st_bind = 1} + 1
| 0x11UB
`----
We are now introducing support for the inverse operation: to
"deintegrate" an integer into an integral struct value. This is
performed by a cast:
,----
| (poke) 0x10 as Elf_Sym_Info
| Elf_Sym_Info { st_bind=0x1UB, st_type=0x0UB }
`----
- Arrays whose elements are integral (integers, other integral arrays,
or integral structs) can now be also "integrated" and "deintegrated"
using casts:
,----
| (poke) [1UB, 0UB] as int
| 0x100
| (poke) 0x100 as uint<8>[2]
| [1UB,0UB]
| (poke) [[1UB,2UB],[3UB,4UB]] as int
| 0x1020304
| (poke) 0x1020304 as uint<8>[2][2]
| [[1UB,2UB],[3UB,4UB]]
`----
This nice feature has been contributed by Mohammad-Reza Nabipoor.
- Poke programs can now sleep for a number of seconds and nanoseconds
using the new built-in function sleep. It has the following
signature:
,----
| fun sleep = (int<64> sec, int<64> nsec = 0) void:
`----
The sleep is not a busy wait, i.e. poke will not consume CPU while
it is sleeping.
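As a minimal sketch based on the signature above (the intervals are
arbitrary illustrative values):
,----
| /* Pause for half a second: 0 seconds plus 500000000 nanoseconds.  */
| sleep (0, 500000000);
| /* nsec defaults to 0, so this pauses for two whole seconds.  */
| sleep (2);
`----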
- The only way in poke 1.x to print styled text to the terminal was to
use the very ugly and clumsy %<..> tags in printf statements.
This was annoying to use, especially because you were required to
end every styling class you open in the same statement. So we added
a couple of new built-in functions term_begin_class and
term_end_class. They are used like this:
,----
| term_begin_class ("error");
| print "error: the quux is foobared";
| term_end_class ("error");
`----
The functions will raise an exception in case you nest your classes
the wrong way.
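Since the class no longer has to be closed in the same statement, the
styling can be wrapped in a helper. This is only a sketch;
print_error is a hypothetical function, not something poke provides:
,----
| /* Hypothetical helper: print MSG styled with the "error" class.  */
| fun print_error = (string msg) void:
| {
|   term_begin_class ("error");
|   print "error: " + msg + "\n";
|   term_end_class ("error");
| }
`----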
- It is now possible to emit terminal hyperlinks from Poke programs.
See "Terminal hyperlinks updates" below for more information on
this.
- The equality `==' and inequality `!=' operators now work on function
values. Two given function values are equal if they are the same
function value. For example, given these definitions:
,----
| fun foo = void: {}
| fun bar = void: {}
`----
In this example foo is equal to foo, but foo is not equal to bar,
even though both functions happen to have identical type signatures
and bodies.
- It is now a compile-time error to cast values to function types.
We plan to support this at some point, since it is needed for things
like having fields of type any in structs, but at the moment closure
values are not tagged with their type at run-time and it is better
for the user to get compile-time errors.
- In this release we have changed both the semantics and syntax of
struct type field initializers. This change has been motivated by
our own practical usage of poke. In poke 1.x a struct field
initializer had the form:
,----
| type Foo =
| struct
| {
| int i = 10;
| long l;
| };
`----
The initialization of i had two effects: newly constructed Foo
struct values would have i initialized to 10, and a constraint that
i must equal 10 was also implied when mapping structs of type Foo.
This worked well, but we realized it is good to decouple the
implicit constraint from the initialization value: sometimes you
need one of these, but not both. It was also not possible to add
additional constraints. So we changed the semantics of the
construction above to denote "initialize to 10, but no implicit
constraint", with maybe an additional constraint like in:
,----
| type Foo =
| struct
| {
| int i = 10 : i < 100;
| long l;
| };
`----
And then we added the new syntax:
,----
| type Foo =
| struct
| {
| int i == 10;
| long l;
| };
`----
to denote "initialize to 10, plus implicit constraint i == 10."
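A minimal sketch of how the new == form behaves when constructing
values. The variable names are only illustrative, and we assume here
that the implicit constraint is also checked at construction time:
,----
| type Foo =
|   struct
|   {
|     int i == 10;
|     long l;
|   };
|
| /* No value given for i: it is initialized to 10.  */
| var a = Foo { l = 5 };
|
| /* Assumed behavior: 11 violates the implicit constraint i == 10.  */
| try { var b = Foo { i = 11 }; }
| catch if E_constraint { print "i must be 10\n"; }
`----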
- The Poke language is quite exception-oriented. A good example of
this is the way to detect whether a given alternative in a union is
the currently selected one: refer to it and see if an E_elem
exception gets raised. This in practice leads to code like this:
,----
| try { length.indefinite; return 1; }
| catch if E_elem { return 0; }
`----
This is so common that we have introduced a new "exception
conditional" operator ?!. The above code can now be rewritten as:
,----
| return length.indefinite ?! E_elem;
`----
The new operator also has a { ... } ?! EXCEPTION form where the code
to execute is a compound statement.
- Constraints in struct type fields are very often related to the
values of other fields in the same struct. The relationship is
often in the form "if field X has value N, then I should have a
value M". We have added a new "logical implication" operator =>
(inspired from the recutils operator with the same name) that
implements this logic:
,----
| A => B :=: !A || (A && B)
`----
For example:
,----
| uint<1> encoding;
| uint<5> tag_number : tag_number == BER_TAG_REAL => encoding == 1;
`----
Meaning that if tag_number denotes a "real" then encoding must be 1.
If tag_number does not denote a "real" then the value of encoding is
irrelevant.
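As a further sketch, using a hypothetical Chunk type that is not part
of any pickle:
,----
| /* Hypothetical type: a zero kind implies a zero length.  */
| type Chunk =
|   struct
|   {
|     uint<8> kind;
|     uint<8> len : kind == 0 => len == 0;
|   };
`----
With this constraint a Chunk with kind 0 and len 0 is valid, a Chunk
with kind 0 and len 3 is not, and any len is accepted when kind is
not 0.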
- Mohammad-Reza Nabipoor has contributed support for a new built-in
function format, that is able to format a string out of a format
string and a list of values. For example:
,----
| var s = format ("%s - %s (%i32d)", name, sex, age);
`----
Will format a string in s like:
,----
| "Francisco Maganto - male (63)"
`----
The format string is identical to the one used by printf, and
therefore it also accepts %v tags to format nested complex values
like arrays and structs.
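A short sketch using a %v tag with an array (the variable names are
only illustrative):
,----
| var values = [1, 2, 3];
| /* %v formats the whole array as a nested value.  */
| var s = format ("values: %v", values);
| print s + "\n";
`----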
- Exception structs have been expanded to include two additional
fields: location and msg. Both new fields are strings, and are used
to convey location information (like the field whose constraint
expression failed resulting in an E_constraint) and an explanation
of why the exception was raised, respectively. There is no more
need to abuse the field name for these purposes.
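A handler can then report both fields. This is a minimal sketch that
assumes the elf pickle is loaded and that some IO space is currently
selected:
,----
| try
|   {
|     var ehdr = Elf64_Ehdr @ 1#B;
|   }
| catch (Exception e)
|   {
|     /* Both fields are strings, so %s can print them.  */
|     printf ("%s: %s\n", e.location, e.msg);
|   }
`----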
- Much like in C, the Poke printf format strings use % to introduce
formatting tags, like %i32d. In poke 1.x it was embarrassingly not
possible to denote the character % itself. Now %% can be used in
printf format strings in order to denote a single % character.
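For example, a sketch printing a percentage (the value 50 is
arbitrary):
,----
| printf ("%i32d%% done\n", 50);
`----
Here %i32d formats the number and %% emits the single % character
that follows it.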
- If a struct type has labels and all of them are constant, it is now
considered a complete type, i.e. a type whose size is known at
compile-time. This makes it possible to use it in sizeof or as the
unit of an offset.
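A minimal sketch, using a hypothetical Slot type whose labels are all
constant offsets:
,----
| /* Hypothetical type: every label is a constant offset.  */
| type Slot =
|   struct
|   {
|     uint<32> id @ 0#B;
|     uint<32> flags @ 4#B;
|   };
|
| var slot_size = sizeof (Slot);   /* Size known at compile-time.  */
| var third = 2#Slot;              /* Slot used as an offset unit.  */
`----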
- The Poke Virtual Machine settings that configure output are now
accessible programmatically from Poke using a set of new built-ins:
,----
| vm_{obase,opprint,oacutoff,odepth,oindent,omaps,omode}
| vm_set_{obase,opprint,oacutoff,odepth,oindent,omaps,omode}
`----
- We added introspection capabilities: typeof.
Standard Poke Library updates
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- The atoi standard function has been improved in order to parse
integers with signs.
- The ltos standard function now works for any numeration base from 1
to 16, inclusive.
- New standard functions opensub and openproc have been introduced to
help with the creation of "sub" and process IO spaces, respectively.
See below under "IO subsystem updates" for more information.
- A new standard function exit has been added to the standard library,
that provides a familiar way to exit a Poke program with an optional
exit code:
,----
| fun exit = (int<32> exit_code = 0)
`----
This is just a wrapper around the more "pokeish" and less conventional
way of exiting, which is to raise an exception with code EC_exit:
,----
| raise Exception { code = EC_exit, exit_status = exit_code };
`----
- In poke 1.x we forgot to emit an error at compile-time when the user
defined a union type as pinned. This obviously doesn't make any
sense (all the alternatives of a union are "pinned") so now a
compile-time error is raised.
- We have realized that it doesn't make much sense for pinned struct
types (whose fields all have an implicit offset of zero bytes) to
have field labels. To avoid confusing the user, the compiler now
emits a compile-time error if such a type is defined.
- Unions can no longer have alternatives with labels. This worked in
poke 1.x, but had complicated semantics with little real practical
benefit. So now a compile-time error is raised if such a type is
defined.
IO subsystem updates
~~~~~~~~~~~~~~~~~~~~
- Each open IO space now maintains a "bias", which is a bit-offset.
This bias is added to the offset specified in every read or write
access to the IO space. This bias is programmable, and can be read
and set by Poke programs using two new built-in calls:
,----
| fun iobias = (int<32> ios = get_ios)
| fun iosetbias = (offset<uint<64>,b> bias = 0#b, int<32> ios = get_ios)
`----
Note how the positioning and default values of the arguments of these
functions make it comfortable to use them at the (poke) prompt.
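As a sketch, a Poke program could use the bias to skip a hypothetical
16-byte preamble in the current IO space. We assume here that the
byte offset is converted to the bit offset the function expects:
,----
| /* Every access below is shifted by 16 bytes.  */
| iosetbias (16#B);
| /* Reads byte 16 of the underlying data.  */
| var first = uint<8> @ 0#B;
| /* Restore the default bias.  */
| iosetbias (0#b);
`----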
- Poke programs can now poke at the standard input, standard output
and standard error output of the running poke process using the new
stream IO spaces.
The new available handlers (to be used with open) are <stdin>,
<stdout> and <stderr>. A good application of these IO
spaces is to write filter utilities in Poke.
This support has been contributed by Egeyar Bagcioglu.
- The IOS_F_* flags used in the open builtin have been rationalized.
As a consequence the IOS_F_TRUNCATE flag has been removed.
- There is a new built-in ioflags that returns the flags used to open
some particular IO space.
- open now raises an E_perm exception if the user doesn't have enough
credentials when opening an IO space. This can happen for example
with insufficient permissions when opening a file in some specified
mode.
- Opening an IO space using a handler <zero> now provides an IO
space of size 2^64 bytes whose contents are all zero bytes, and that
ignores any writes to it. Example:
,----
| var zeroes = open ("<zero>");
`----
- New IO device sub for sub-spaces. Sub IO spaces that act like a
(maybe) narrower version of some other given IO space can be opened
using a sub:// handler. Example:
,----
| var sub = open ("sub://2/0x10/0x1000/somename", IOS_M_RDONLY);
`----
This will create a sub-space that provides read-only access to the
0x1000 bytes starting at offset 0x10#B of the IO space with id #2,
i.e. the range [0x10#B,0x1010#B).
Since it can be annoying to format the handler string, a convenient
utility function called opensub has been added to the standard
library:
,----
| var sub = opensub (ios, 0x10#B, 0x1000#B, "somename", IOS_M_RDONLY);
`----
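A slightly longer sketch, where "firmware.bin" is a hypothetical file
name and the variable names are only illustrative:
,----
| var file = open ("firmware.bin", IOS_M_RDONLY);
| var payload = opensub (file, 0x10#B, 0x1000#B, "payload", IOS_M_RDONLY);
| /* Offset 0#B in the sub-space is byte 0x10 of the file.  */
| var first = uint<8> @ payload : 0#B;
| close (payload);
| close (file);
`----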
- New IO device proc for poking at the memory of live processes. The
handler to pass to open in order to poke at the memory of a process
with a given PID has the form pid://PID. Example:
,----
| var fd = open ("pid://1234");
`----
Since it can be annoying to format the handler string, a convenient
utility function called openproc has been added to the standard
library:
,----
| var fd = openproc (1234);
`----
Terminal hyperlinks updates
~~~~~~~~~~~~~~~~~~~~~~~~~~~
- The payload used in the terminal hyperlinks URL is now reduced to a
token number. This makes it more opaque and compact than the
previous version where we would encode full Poke expressions in the
URL.
- The "hyperserver", the poke utility subsystem that provides and handles
the support for "terminal hyperlinks", has been partially rewritten.
The new implementation is mostly written in Poke. This makes it
possible for Poke programs (like pickles) to emit and handle
terminal hyperlinks. For example:
,----
| var url = hserver_make_hyperlink ('e', "2 + 2");
|
| term_begin_hyperlink (url, "");
| print ("[clickme]");
| term_end_hyperlink;
`----
Will print a clickable button "[clickme]" in the terminal, which
once clicked on will execute the Poke code "2 + 2".
- poke 1.x supported two variants of terminal hyperlinks: "execute"
and "insert". Clicking on an "execute" hyperlink triggers the
execution of some given textual Poke expression or statement.
Clicking on an "insert" hyperlink triggers the inclusion of some
given string at the current input position at the prompt. We have
now introduced a third kind of hyperlink: the "closure" hyperlink.
Once clicked, some given Poke closure/function gets executed. This
is particularly useful when generating hyperlinks from a Poke
program. For example:
,----
| fun toggle_setting = void: { ... };
|
| term_begin_hyperlink (hserver_make_hyperlink ('c', "", (toggle_setting)), "");
| print "[toggle]";
| term_end_hyperlink;
`----
will print a clickable button "[toggle]" in the terminal, which once
clicked on will execute the toggle_setting function.
libpoke updates
~~~~~~~~~~~~~~~
- poke now installs a pkg-config file poke.pc to ease using libpoke.
- libpoke now provides an interface to register foreign IO spaces.
This allows adding pokeish capabilities to third-party programs. As
an example, poke integration in GDB has been written (not upstreamed
yet) that makes it possible to poke the memory of an inferior
process being debugged. This is achieved by registering callbacks
in the foreign IO interface.
- libpoke now provides an interface to register "alien tokens". This
is also useful when integrating libpoke in third-party applications.
For example, in the GDB integration this interface has been used to
allow users to refer to the value of GDB symbols in Poke programs
using this syntax:
,----
| $main
`----
and to the address of a given symbol using this syntax:
,----
| $addr::main
`----
- It is now possible to create an incremental compiler without
standard types. This is useful in cases where the program
integrating with libpoke provides its own concept of types like int
or long. This is the case of GDB.
- We added more services in libpoke to operate on PK values. Many
more still need to be added.
- libpoke.h is now C++ compatible.
Pickles updates
~~~~~~~~~~~~~~~
- New pickle asn1-ber.pk provides definitions to poke ASN.1 data
encoded in BER (Basic Encoding Rules).
- New pickle ustar.pk provides definitions to poke USTAR file
systems, standardized by POSIX.1-1988 and POSIX.1-2001.
- New pickle ctf-dump.pk provides functions that dump the data of CTF
sections in a human-readable form. Contributed by Indu Bhagat on
behalf of Oracle Inc.
- New pickle jffs2.pk provides definitions to poke JFFS2 file
systems.
- The elf.pk pickle has been split into 32-bit and 64-bit variants,
and ELF-32 definitions have been added.
- The output messages emitted by argp.pk have been improved.
- New tests for the btf.pk pickle. Contributed by David Faust on
behalf of Oracle Inc.
Utilities updates
~~~~~~~~~~~~~~~~~
- New filter pk-strings.pk.
- New filter pk-bin2poke.pk.
Development tools updates
~~~~~~~~~~~~~~~~~~~~~~~~~
- RAS (the retarded poke assembler) now allows splitting long logical
lines into several physical lines by finishing them with the
backslash character. This is an example:
,----
| .macro struct_field_extractor @struct_type @field @struct_itype \
| @field_type #ivalw #fieldw
`----
- RAS now understands the endianness specifiers IOS_ENDIAN_LSB and
IOS_ENDIAN_MSB. Better than hardcoded magic numbers.
- RAS now emits an error if it finds a .function in which there is not
exactly one prolog instruction and at least one return instruction.
This problem has bitten me very often, resulting in very puzzling
and subtle bugs, glglgl.
Notable bug fixes
~~~~~~~~~~~~~~~~~
- The infamous, annoying, hated and despised "PVM_VAL_CLS_ENV
(closure) != NULL" bug has (most probably... crossing fingers, toes
and whatnot) been finally fixed in this release. I have tried no
less than eight times to fix this bug, but not until very recently
did I finally understand what was going on; I thought I did and the
test case would work, but no I didn't and the bug would soon return
with renewed hatred. In my defense, the involved area is the hairiest
part of the code generator and every time I look at it I basically
have to re-learn how it works. Good news is: if the same problem
manifests again, now we know how to fix it. But hopefully it is
fixed now and I will stop receiving hate email about this.
- --disable-hserver now actually disables the hyperserver. Yes,
really.
Documentation updates
~~~~~~~~~~~~~~~~~~~~~
- We have added a new online help system to the poke utility, which is
accessible using the .help dot-command. There are help entries for
all the commands, dot-commands and global settings. This new online
help system is written in Poke, making it possible for other Poke
programs (like pickles) to add their own help topics to the system.
- We have expanded the user manual to cover the new functionality and
clarify obscure concepts that weren't explained with enough clarity.
Despite this work, the user manual is not yet complete,
unfortunately, but getting there.
- The user manual now contains a section that explains how to set up
your system and terminal emulators to support poke terminal
hyperlinks.
- There is a new website called Pokology, https://pokology.net,
which is maintained by the poke developers and users. It is a live
repository of knowledge relative to GNU poke, including practical
articles, multimedia stuff, hints and tricks, frequently asked
questions, etc.
Editor support updates
~~~~~~~~~~~~~~~~~~~~~~
- The Emacs modes poke-mode, poke-ras-mode and poke-map-mode have been
updated with more features and bug fixes.
- We now distribute a module for Poke syntax highlighting for vim.
Contributed by Matthew T. Ihlenfield.
Other updates
~~~~~~~~~~~~~
- We have rewritten the way global settings (like endianness,
numeration base used for output, etc) are handled in both libpoke
and the poke application. The new implementation introduces a
"settings registry" which is written itself in Poke. As such, other
Poke code (like pickles and the like) has full access to the
settings registry, and can even add its own settings to the
application.
- Many improvements in compiler error messages. This includes
emitting more meaningful messages, for example "got int<32>
expected string" instead of "invalid operands in expression". And
also better locations, like pointing at the particular operand whose
type is wrong instead of the entire expression.
Happy poking!
--
Jose E. Marchesi
Frankfurt am Main
28 February 2022