API Reference#
Overview#
This document describes the API provided by binary-data. It is organized into sections by use case: using binary data in a tool, extending the library with a custom binary format, and the internal API.
Class hierarchy#
The class hierarchy is rooted in <frame>. Several direct subclasses exist; some are open and may be subclassed. Only those combinations of direct subclasses that have been needed so far are defined (other combinations might be needed in the future).
- <frame> Abstract Class#
- Discussion:
The abstract superclass of all frames. Several generic functions are defined on this class.
- Superclasses:
- Operations:
- <leaf-frame> Abstract Class#
- Discussion:
The abstract superclass of all frames without any further structure.
- Superclasses:
- Operations:
- <fixed-size-frame> Abstract Class#
- Discussion:
The abstract superclass of all frames with a static length. The specialization of frame-size calls field-size on the object class of the given instance.
- Superclasses:
- <variable-size-frame> Abstract Class#
- Discussion:
The abstract superclass of all frames with a variable length.
- Superclasses:
- <translated-frame> Abstract Class#
- Discussion:
The abstract superclass of all frames with a conversion into a native Dylan type.
- Superclasses:
- <untranslated-frame> Abstract Class#
- Discussion:
Abstract superclass of all frames with a custom class instance.
- Superclasses:
- <fixed-size-untranslated-frame> Abstract Class#
- Discussion:
Abstract superclass for fixed size frames without a translation.
- Superclasses:
- <variable-size-untranslated-frame> Abstract Class#
- Discussion:
Abstract superclass for variable size frames without a translation. This is the direct superclass of <container-frame>.
- Superclasses:
- <fixed-size-translated-leaf-frame> Open Abstract Class#
- Discussion:
Superclass of all fixed size leaf frames with a translation, mainly used for bit vectors represented as Dylan <integer>.
- Superclasses:
- <variable-size-translated-leaf-frame> Open Abstract Class#
- Discussion:
Superclass of all variable size leaf frames with a translation (currently unused)
- Superclasses:
- <fixed-size-untranslated-leaf-frame> Open Abstract Class#
- Discussion:
Superclass of all fixed size leaf frames without a translation, mainly used for byte vectors (IP addresses, MAC addresses, …); see its subclass <fixed-size-byte-vector-frame>.
- Superclasses:
- <variable-size-untranslated-leaf-frame> Open Abstract Class#
- Discussion:
Superclass of all variable size leaf frames without a translation (for example class <raw-frame> and class <externally-delimited-string>).
- Superclasses:
- <null-frame> Class#
- Discussion:
A concrete zero size leaf frame without a translation. This frame type can be used as one of the types of a variably-typed field to make the field optional. A field with a type <null-frame> is considered to be missing from the container frame. Conversion of a <null-frame> to string or vice versa is not supported (because it wouldn’t make much sense).
- Superclasses:
- <container-frame> Open Abstract Class#
Superclass of all binary data definitions using the define binary-data macro.
- Superclasses:
- Operations:
field-count
- <header-frame> Open Abstract Class#
Superclass of all binary data definitions which support layering, thus have a header and payload.
- <variably-typed-container-frame> Open Abstract Class#
Superclass of all binary data definitions which have an abstract header followed by more fields. In the header a specific <layering-field> determines which subclass to instantiate.
- Superclasses:
Tool API#
Parsing Frames#
- parse-frame Open Generic function#
Parses the given binary packet as frame-type, resulting in an instance of the frame-type and the number of consumed bits.
- Signature:
parse-frame frame-type packet #rest rest #key #all-keys => result consumed-bits
- Parameters:
frame-type – Any subclass of <frame>.
packet – The byte vector as <sequence>.
rest (#rest) – An instance of <object>.
- Values:
result – An instance of the given frame-type.
consumed-bits – The number of bits consumed as <integer>.
- read-frame Open Generic function#
Converts a given string to an instance of the given leaf frame type.
- Signature:
read-frame frame-type string => frame
- Parameters:
frame-type – An instance of subclass(<leaf-frame>).
string – An instance of <string>.
- Values:
frame – An instance of <object>.
Assembling Frames#
Information about Frames#
- frame-size Open Generic function#
Returns the length in bits for the given frame.
- Signature:
frame-size frame => length
- Parameters:
frame – An instance of <frame>.
- Values:
length – The size in bits, an instance of <integer>.
- summary Open Generic function#
Returns a human-readable string that summarizes the frame. The string can be customized in binary-data-definer.
- packet Open Generic function#
Underlying byte vector of the given <container-frame>.
- Signature:
packet frame => byte-vector
- Parameters:
frame – An instance of <container-frame>.
- Values:
byte-vector – An instance of <byte-sequence>.
- parent Sealed Generic function#
If the frame is a payload of another layer, returns the frame of the upper layer, false otherwise.
- Signature:
parent frame => parent-frame
- Parameters:
frame – An instance of <container-frame> or <variable-size-byte-vector-frame>.
- Values:
parent-frame – Either the <container-frame> of the upper layer or #f.
Information about Frame Types#
- fields Open Generic function#
Returns a vector of <field> for the given <container-frame>.
- Signature:
fields frame-type => fields
- Parameters:
frame-type – Any subclass of <container-frame>.
- Values:
fields – A <simple-vector> containing all fields.
Note
The current API also allows instances of <container-frame>; this should be revised.
- frame-name Open Generic function#
Returns the name of the frame type.
- Signature:
frame-name frame-type => name
- Parameters:
frame-type – Any subclass of <container-frame>.
- Values:
name – A <string> with the human-readable frame name.
Note
The current API also allows instances of <container-frame>; this should be revised.
Fields#
Syntactic sugar in the define binary-data domain-specific language instantiates these fields.
- <field> Abstract Class#
The abstract superclass of all fields.
- Superclasses:
- Init-Keywords:
name – The name of this field.
fixup – A unary Dylan function computing the value of this field, used if no default is supplied and the client provides none; defaults to #f.
init-value – The default value if the client did not provide any; defaults to $unsupplied.
static-end – A Dylan expression determining the end, defaults to $unknown-at-compile-time.
static-length – A Dylan expression determining the length, defaults to $unknown-at-compile-time.
static-start – A Dylan expression determining the start, defaults to $unknown-at-compile-time.
dynamic-end – A unary Dylan function computing the end, defaults to #f.
dynamic-length – A unary Dylan function computing the length, defaults to #f.
dynamic-start – A unary Dylan function computing the start, defaults to #f.
getter – The getter method to extract this field's value out of a concrete frame.
setter – The setter method to set this field to a concrete value in a concrete frame.
index – An <integer> which is the index of this field in its <container-frame>.
- Discussion:
All keyword arguments correspond to a slot, which can be accessed.
- Operations:
field-name(<field>)
fixup-function(<field>)
init-value(<field>)
static-start(<field>)
static-length(<field>)
static-end(<field>)
getter(<field>)
setter(<field>)
See also
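The interplay of a client-supplied value, init-value, and fixup described in the init-keywords can be modelled in a few lines. The following Python sketch is an illustration only, not the library's Dylan implementation: `UNSUPPLIED`, `Field`, and `resolve` are hypothetical names, frames are plain dictionaries, and the precedence order is read off the keyword descriptions above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

UNSUPPLIED = object()  # hypothetical stand-in for $unsupplied


@dataclass
class Field:
    name: str
    init_value: Any = UNSUPPLIED
    fixup: Optional[Callable[[dict], Any]] = None  # unary function of the frame


def resolve(field: Field, frame: dict) -> Any:
    """Pick the field value: client value, else default, else fixup."""
    if field.name in frame:
        return frame[field.name]
    if field.init_value is not UNSUPPLIED:
        return field.init_value
    if field.fixup is not None:
        return field.fixup(frame)
    raise ValueError(f"no value for field {field.name!r}")


# A length field computed from the payload, and a field with a default.
length = Field("length", fixup=lambda frame: len(frame["payload"]))
version = Field("version", init_value=4)

resolved_length = resolve(length, {"payload": b"\x01\x02\x03"})
resolved_version = resolve(version, {"payload": b""})
resolved_override = resolve(length, {"payload": b"", "length": 99})
```

The fixup only runs when neither the client nor a default supplies a value, which is why it suits length fields.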
- <variably-typed-field> Class#
The class for fields of dynamic type.
- Superclasses:
- Init-Keywords:
type-function – A unary Dylan function computing the type of the field, defaults to payload-type.
See also
- <statically-typed-field> Abstract Class#
The abstract superclass of all statically typed fields.
Note
restrict type in source code!
- <single-field> Class#
The common field. Nothing interesting going on here.
- Superclasses:
- <enum-field> Class#
An enumeration field to map <integer> to <symbol>.
- Superclasses:
- Init-Keywords:
mapping – A mapping from keys to values as <collection>.
- <layering-field> Class#
The layering field is used in <header-frame> and <variably-typed-container-frame> to determine the concrete type of the payload or which subclass to use.
- Superclasses:
- Discussion:
The fixup-function slot is bound to use the available layering information. No need to specify a fixup.
- <repeated-field> Abstract Class#
Abstract superclass of repeated fields. The init-value slot is bound to #().
- Superclasses:
- <count-repeated-field> Class#
A repeated field whose number of repetitions is determined externally.
- Superclasses:
- Init-Keywords:
count – A unary function returning the number of occurrences.
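As a rough picture of an externally counted repetition, the following Python sketch (a model, not the library's Dylan code) reads as many fixed-size items as a unary count function reports for the frame. The function `parse_repeated`, the dictionary frame, and the "item-count" field are hypothetical.

```python
from typing import Callable


def parse_repeated(frame: dict, body: bytes,
                   count: Callable[[dict], int], item_bytes: int) -> list[int]:
    """Read count(frame) big-endian items of item_bytes each from body."""
    n = count(frame)  # unary function of the frame, like the count keyword
    if len(body) < n * item_bytes:
        raise ValueError("body too short for the announced number of items")
    return [int.from_bytes(body[i * item_bytes:(i + 1) * item_bytes], "big")
            for i in range(n)]


# A frame whose hypothetical "item-count" field was parsed earlier.
frame = {"item-count": 3}
items = parse_repeated(frame, bytes([0, 1, 0, 2, 0, 3, 0xFF]),
                       count=lambda f: f["item-count"], item_bytes=2)
```

The trailing byte is left untouched because the count, not the length of the body, decides where the repetition ends.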
Layering of frames#
- payload-type Function#
The type of the payload. It is a thin wrapper around lookup-layer that returns <raw-frame> if lookup-layer returns false.
- Signature:
payload-type frame => payload-type
- Parameters:
frame – An instance of <container-frame>.
- Values:
payload-type – An instance of <type>.
- lookup-layer Open Generic function#
Given a frame-type and a key, returns the type of the payload.
- reverse-lookup-layer Open Generic function#
Given a frame type and a payload, returns the value for the layering field.
Note
Check whether it can work with other types than integers
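The relationship between the three layering functions can be sketched with a lookup table. This Python model is an assumption-laden illustration, not the Dylan implementation: frame types are plain strings, `LAYERS` is a hypothetical table, and `payload_type` takes a frame type and key rather than a frame instance. The two keys are the well-known EtherType values for IPv4 and ARP.

```python
RAW_FRAME = "raw-frame"  # stand-in for <raw-frame>

# Hypothetical layering table: (frame type, layering value) -> payload type.
LAYERS = {
    ("ethernet", 0x0800): "ipv4",
    ("ethernet", 0x0806): "arp",
}


def lookup_layer(frame_type, key):
    """Return the payload type registered for key, or None (false)."""
    return LAYERS.get((frame_type, key))


def reverse_lookup_layer(frame_type, payload_type_name):
    """Return the layering value that selects the payload type, or None."""
    for (ftype, key), ptype in LAYERS.items():
        if ftype == frame_type and ptype == payload_type_name:
            return key
    return None


def payload_type(frame_type, key):
    """Wrapper around lookup_layer that falls back to the raw frame."""
    return lookup_layer(frame_type, key) or RAW_FRAME
```

The reverse lookup is what lets a layering field fill itself in during assembly: given the payload, it finds the value to write into the header.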
Database of Binary Data Formats#
Note
Rename to $binary-data-registry or similar. Also, narrow types for the functions in this section.
- $protocols Constant#
A hash table with all defined binary formats. Insertion is done by a call of define binary-data.
- Type:
- Value:
Mapping of <symbol> to subclasses of <container-frame>.
- find-protocol Function#
Looks for the given name in the hash table $protocols. Signals an error if no protocol with the given name can be found.
- find-protocol-field Function#
Queries a field by name in a given binary data format. Signals an error if no such field is known in the binary data format.
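The registry behaviour described in this section can be modelled as a dictionary with error-signalling lookups. The Python sketch below is not the library's code: `PROTOCOLS`, `define_binary_data`, and the choice to return a field's position are assumptions made for illustration, since the return types are not documented above.

```python
PROTOCOLS = {}  # stand-in for $protocols: format name -> list of field names


def define_binary_data(name, field_names):
    """Hypothetical registration step, mimicking insertion on definition."""
    PROTOCOLS[name] = list(field_names)


def find_protocol(name):
    """Return the registered format or signal an error."""
    if name not in PROTOCOLS:
        raise LookupError(f"no binary data format named {name!r}")
    return PROTOCOLS[name]


def find_protocol_field(name, field_name):
    """Return the position of field_name in the format or signal an error."""
    fields = find_protocol(name)
    if field_name not in fields:
        raise LookupError(f"format {name!r} has no field {field_name!r}")
    return fields.index(field_name)


define_binary_data("ethernet",
                   ["destination-address", "source-address", "type-code"])
```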
Utilities#
- hexdump Generic function#
Prints the given data in hexadecimal on the given stream.
- Signature:
hexdump stream data => ()
- Parameters:
stream – An instance of <stream>.
data – An instance of <sequence>.
- Discussion:
Prints 8 bytes in hexadecimal, each separated by a space, followed by two spaces and another 8 bytes.
If the given data has more than 16 elements, it prints multiple lines and prefixes each with a line number (as 16 bit hexadecimal).
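The layout described above can be approximated as follows. This Python sketch returns a string instead of writing to a stream, and several details are assumptions rather than documented behaviour: lower-case digits, a four-digit prefix holding the offset of the line's first byte, and two spaces between prefix and data.

```python
def hexdump(data: bytes) -> str:
    """Format data as lines of 16 bytes, split into two groups of 8."""
    lines = []
    multi_line = len(data) > 16  # prefixes only appear for longer input
    for offset in range(0, len(data), 16):
        chunk = data[offset:offset + 16]
        left = " ".join(f"{byte:02x}" for byte in chunk[:8])
        right = " ".join(f"{byte:02x}" for byte in chunk[8:])
        line = left + ("  " + right if right else "")
        if multi_line:
            line = f"{offset:04x}  {line}"  # assumed prefix format
        lines.append(line)
    return "\n".join(lines)


short_dump = hexdump(bytes(range(16)))
long_dump = hexdump(bytes(range(17)))
```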
- byte-offset Function#
Computes the number of bytes for a given number of bits. A synonym for rcurry(ash, -3).
- bit-offset Function#
Computes the number of bits which do not fit into a byte for a given number of bits. A synonym for curry(logand, 7).
- byte-aligned Function#
Checks that the given number of bits can be represented in full bytes, otherwise signals an <alignment-error>.
- Signature:
byte-aligned bits
- Parameters:
bits – An instance of <integer>.
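The arithmetic behind byte-offset, bit-offset, and byte-aligned is small enough to show directly. The following Python sketch follows the prose descriptions (whole bytes, leftover bits, alignment check); `AlignmentError` is a hypothetical stand-in for <alignment-error>, and since no return value is documented for byte-aligned, the model returns nothing.

```python
class AlignmentError(ValueError):
    """Hypothetical stand-in for <alignment-error>."""


def byte_offset(bits: int) -> int:
    """Number of whole bytes in a bit count (shift right by three)."""
    return bits >> 3


def bit_offset(bits: int) -> int:
    """Number of bits left over after the whole bytes (mask with 7)."""
    return bits & 7


def byte_aligned(bits: int) -> None:
    """Raise unless the bit count is a whole number of bytes."""
    if bit_offset(bits) != 0:
        raise AlignmentError(f"{bits} bits is not a whole number of bytes")


whole_bytes = byte_offset(19)  # 19 bits = 2 bytes ...
leftover = bit_offset(19)      # ... plus 3 bits
```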
- data Generic function#
Returns the underlying byte vector of a wrapper object, used for several untranslated leaf frames.
- Signature:
data (object) => (#rest results)
- Parameters:
object – An instance of <object>.
- Values:
#rest results – An instance of <object>.
Note
should be removed from the API, or become internal
Errors#
Extension API#
Extending Binary Data Formats#
This domain-specific language defines a subclass of <container-frame>, along with a lot of boilerplate.
- define binary-data Defining Macro#
- Macro Call:
define [abstract] binary-data *binary-format-name* ([*super-binary-format*])
  [summary *summary*] [;]
  [over *over-spec* *] [;]
  [length *length-expression*] [;]
  [*field-spec*] [;]
end
- Parameters:
binary-format-name – A standard Dylan class name.
super-binary-format – A standard Dylan name, used as the superclass.
summary – A Dylan expression consisting of a format-string and a list of arguments.
over-spec – A pair of binary format and value.
length-expression – A Dylan expression computing the length of a frame instance.
field-spec – A list of fields for this binary format.
- Discussion:
Defines the binary data class binary-format-name, which is a subclass of super-binary-format. The body offers syntactic sugar for specializing the pretty printer (summary specializes summary), for providing a custom length implementation (length specializes container-frame-size), and for providing binary format layering information via over-spec (<layering-field>). The remaining body is a list of field-spec. Each field-spec line corresponds to a slot in the defined class. Additionally, each field-spec instantiates an object of <field> to store the static metadata. The vector of fields is available via the method fields.
summary: *format-string* *format-arguments*
This generates a method implementation for summary. Each of the format-arguments is applied to the frame instance.
over-spec: *over-binary-format* *layering-value*
The over-binary-format should be a subclass of <header-frame> or <variably-typed-container-frame>. The layering-value will be registered for the specified over-binary-format.
field-spec: [*field-attribute*] field *field-name* [:: *field-type*] [= *default-value*], [*keyword-arguments* *] [;]
field-attribute: variably-typed | layering | repeated | enum
mapping: { *key* <=> *value* }
field-name: Each field has a unique field-name, which is used as the name for the getter and setter methods.
field-type: The field-type can be any subclass of <frame>; it is required unless the variably-typed attribute is provided.
default-value: The default-value should be an instance of the given field-type.
field-attribute: Syntactic sugar for some common patterns is available via attributes.
variably-typed instantiates a <variably-typed-field>.
layering instantiates a <layering-field>.
repeated instantiates a <repeated-field>.
enum instantiates an <enum-field>.
keyword-arguments: Depending on the field type, various keywords are supported. Many values are standard Dylan expressions in which the current frame object is implicitly bound to frame; these are indicated by frame-expression.
fixup: A frame-expression computing the field value if no default was supplied and the client didn’t provide one (handy for length fields).
start: A frame-expression computing the start bit of the field in the frame.
end: A frame-expression computing the end bit of the field in the frame.
length: A frame-expression computing the length of the field.
static-start: A Dylan expression stating the start of the field in the frame.
static-end: A Dylan expression stating the end of the field in the frame.
static-length: A Dylan expression stating the length of the field.
type-function: A frame-expression computing the type of this <variably-typed-field>.
count: A frame-expression computing the number of repetitions of this <count-repeated-field>.
reached-end?: A frame-expression returning a <boolean> indicating whether this <self-delimited-repeated-field> has reached its end.
mappings: A mapping for <enum-field> between values and <symbol>.
The list of fields is instantiated once for each binary data definition. If a static start offset, length, and end offset can be trivially computed (using constant folding), this is done during macro processing.
Several generic functions can be specialized on the binary-format-name for custom behaviour:
Note
rename start, end, length to dynamic-start, dynamic-end, dynamic-length
Note
Check whether those field attributes compose in some way
- fixup! Open Generic function#
Fixes data in an assembled container frame.
- Signature:
fixup! frame => ()
- Parameters:
frame – A union of <container-frame> and <raw-frame>. Usually specialized on a subclass of <unparsed-container-frame>.
- Discussion:
Used for post-assembly of certain fields, such as checksum calculations in IPv4, ICMP, and TCP frames, or compression of domain names in DNS fragments.
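To make the checksum use case concrete, here is a Python sketch of a post-assembly step for an IPv4 header. It is not the library's fixup! method: `internet_checksum` implements the standard Internet checksum (RFC 1071), `fixup_ipv4_checksum` is a hypothetical helper operating on a plain bytearray, and the sample header is a commonly used textbook example.

```python
def internet_checksum(data: bytes) -> int:
    """One's complement of the one's complement sum of 16 bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(int.from_bytes(data[i:i + 2], "big")
                for i in range(0, len(data), 2))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def fixup_ipv4_checksum(header: bytearray) -> None:
    """Zero the checksum field (bytes 10 and 11), then store the result."""
    header[10:12] = b"\x00\x00"
    header[10:12] = internet_checksum(bytes(header)).to_bytes(2, "big")


# 20 byte IPv4 header assembled with an empty checksum field.
header = bytearray.fromhex("4500 0073 0000 4000 4011 0000 c0a8 0001 c0a8 00c7")
fixup_ipv4_checksum(header)
```

The step has to run after assembly because the checksum depends on every other byte of the header, including fields that were themselves filled in late.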
Defining a Custom Leaf Frame#
A common structure in binary data formats is a sequence of consecutive ranges of bits or bytes, each with a different meaning. There are some macros available to define frame types for common patterns.
- field-size Open Generic function#
Returns the static size of a given frame type. Should be specialized for custom fixed sized frames.
- high-level-type Open Generic function#
For translated frames, returns the native Dylan type; otherwise returns the frame type itself.
- Signature:
high-level-type frame-type => type
- Parameters:
frame-type – An instance of subclass(<frame>).
- Values:
type – An instance of <type>.
- assemble-frame-into Open Generic function#
Shuffle the bits in the given packet so that the frame is encoded correctly.
- Signature:
assemble-frame-into frame packet => length
- Parameters:
frame – An instance of <frame>.
packet – An instance of <stretchy-vector-subsequence>.
- Values:
length – An instance of <integer>.
- assemble-frame-into-as Open Generic function#
Shuffle the bits in the given packet so that the frame is encoded correctly as the given frame-type.
- Signature:
assemble-frame-into-as frame-type frame packet => length
- Parameters:
frame-type – A subclass of <translated-frame>.
frame – An instance of <object>.
packet – An instance of <stretchy-vector-subsequence>.
- Values:
length – An instance of <integer>.
- define n-bit-unsigned-integer Defining Macro#
Describes an <integer> represented by a bit vector of arbitrary size.
- Macro Call:
define n-bit-unsigned-integer (*class-name* ; *bits* ) end
- Parameters:
class-name – A Dylan class name which is defined by this macro.
bits – The number of bits represented by this frame.
- Discussion:
Defines the class class-name with <unsigned-integer-bit-frame> as its superclass.
There are several predefined classes of the form <Kbit-unsigned-integer> with K between 1 and 15, and 20.
- Operations:
high-level-type returns limited(<integer>, min: 0, max: 2 ^ bits - 1).
field-size returns bits.
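The value range and the extraction of an n-bit field can be illustrated with plain integer arithmetic. The Python sketch below is a model only: `n_bit_range` and `read_bits` are hypothetical helpers, and reading bits most significant first is an assumption about bit order. The sample byte 0x45 is the usual first byte of an IPv4 header (version 4, header length 5).

```python
def n_bit_range(bits: int) -> tuple[int, int]:
    """Smallest and largest value of an unsigned integer of the given width."""
    return 0, 2 ** bits - 1


def read_bits(packet: bytes, start_bit: int, bits: int) -> int:
    """Read `bits` bits starting at start_bit, most significant bit first."""
    shift = len(packet) * 8 - start_bit - bits
    if start_bit < 0 or bits < 1 or shift < 0:
        raise ValueError("bit range lies outside the packet")
    return (int.from_bytes(packet, "big") >> shift) & ((1 << bits) - 1)


version = read_bits(b"\x45", 0, 4)        # upper four bits
header_length = read_bits(b"\x45", 4, 4)  # lower four bits
```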
- define n-byte-unsigned-integer Defining Macro#
Describes an <integer> represented by a byte vector of arbitrary size and encoding (little or big endian).
- Macro Call:
define n-byte-unsigned-integer (*class-name-prefix* ; *bytes*) end
- Parameters:
class-name-prefix – A prefix for the class name which is defined by this macro.
bytes – The number of bytes represented by this frame.
- Discussion:
Defines the classes <class-name-prefix-big-endian-unsigned-integer> (superclass <big-endian-unsigned-integer-byte-frame>) and <class-name-prefix-little-endian-unsigned-integer> (superclass <little-endian-unsigned-integer-byte-frame>).
The following classes are predefined: <2byte-big-endian-unsigned-integer>, <2byte-little-endian-unsigned-integer>, <3byte-big-endian-unsigned-integer>, and <3byte-little-endian-unsigned-integer>.
- Operations:
high-level-type returns limited(<integer>, min: 0, max: 2 ^ (8 * *bytes*) - 1).
field-size returns bytes * 8.
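The difference between the two encodings, and the value range stated for high-level-type, can be seen in a short Python model. This is not the library's Dylan code; `decode_unsigned` and `encode_unsigned` are hypothetical helpers built on Python's standard integer conversions.

```python
def decode_unsigned(data: bytes, byteorder: str) -> int:
    """Interpret data as an unsigned integer ("big" or "little" endian)."""
    return int.from_bytes(data, byteorder)


def encode_unsigned(value: int, size: int, byteorder: str) -> bytes:
    """Encode value into `size` bytes, rejecting out-of-range values."""
    if not 0 <= value <= 2 ** (8 * size) - 1:
        raise ValueError(f"{value} does not fit into {size} bytes")
    return value.to_bytes(size, byteorder)


big = decode_unsigned(b"\x01\x02", "big")        # 0x0102
little = decode_unsigned(b"\x01\x02", "little")  # 0x0201
```

The same two bytes yield different integers, which is why the macro defines one class per byte order.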
- define n-byte-vector Defining Macro#
Defines a class with an underlying fixed size byte vector.
- Macro Call:
define n-byte-vector (*class-name* , *bytes*) end
- Parameters:
class-name – A standard Dylan class name.
bytes – The number of bytes represented by this frame.
- Discussion:
Defines the class class-name, as a subclass of <fixed-size-byte-vector-frame>. Calls define leaf-frame-constructor with the given class-name (without surrounding angle brackets).
- Operations:
field-size returns bytes * 8.
- define leaf-frame-constructor Defining Macro#
Defines constructors for a given name.
- Macro Call:
define leaf-frame-constructor (*constructor-name*) end
- Parameters:
constructor-name – name of the constructor.
- Discussion:
Defines the generic function constructor-name and three specializations:
- Operations:
constructor-name <byte-vector>, which calls parse-frame.
constructor-name <collection>, which converts the <collection> into a <byte-vector> and calls constructor-name.
constructor-name <string>, which calls read-frame.
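The three specializations amount to dispatch on the argument type. A rough Python analogue, using `functools.singledispatch` in place of a Dylan generic function, might look as follows. Everything here is illustrative: `mac_address`, `parse_mac`, and `read_mac` are hypothetical names, a list stands in for <collection>, and the colon-separated string syntax is an assumption.

```python
from functools import singledispatch


def parse_mac(data: bytes) -> tuple:
    """Stand-in for parse-frame: six raw bytes become a frame value."""
    if len(data) != 6:
        raise ValueError("a MAC address has exactly six bytes")
    return tuple(data)


def read_mac(text: str) -> tuple:
    """Stand-in for read-frame: parse the textual form."""
    return parse_mac(bytes(int(part, 16) for part in text.split(":")))


@singledispatch
def mac_address(value):
    raise TypeError(f"cannot construct a MAC address from {type(value).__name__}")


@mac_address.register
def _(value: bytes):
    return parse_mac(value)  # byte vector: parse directly


@mac_address.register
def _(value: list):
    return mac_address(bytes(value))  # collection: convert, then dispatch again


@mac_address.register
def _(value: str):
    return read_mac(value)  # string: use the reader


from_bytes = mac_address(b"\x00\xde\xad\xbe\xef\x00")
from_list = mac_address([0x00, 0xDE, 0xAD, 0xBE, 0xEF, 0x00])
from_text = mac_address("00:de:ad:be:ef:00")
```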
Predefined Leaf Frames#
- <unsigned-integer-bit-frame> Abstract Class#
The superclass of all bit frames; concrete classes are defined with the define n-bit-unsigned-integer macro.
- Superclasses:
- Operations:
See also
- <boolean-bit> Class#
A single bit, at the Dylan level a <boolean>.
The high-level-type returns <boolean>. The field-size returns 1.
- Superclasses:
- <variable-size-byte-vector> Abstract Class#
A byte vector of arbitrary size, provided externally.
- Superclasses:
- <externally-delimited-string> Class#
A <string> of a certain length, externally delimited. The conversion method as is specialised on <string> and <externally-delimited-string>.
- Superclasses:
Note
should be a variable-size translated leaf frame, if that is possible.
- <raw-frame> Class#
The bottom of the type hierarchy: if nothing is known, a <raw-frame> is all you can have.
hexdump can be used to inspect the frame contents.
- Superclasses:
- <fixed-size-byte-vector-frame> Open Abstract Class#
A vector of any number of bytes with a custom representation, used among others for IP addresses and MAC addresses.
- Superclasses:
- Init-Keywords:
data – The underlying byte vector.
- Operations:
See also
- <big-endian-unsigned-integer-byte-frame> Abstract Class#
A frame representing an <integer> of a certain size, depending on the size of the underlying byte vector.
The macro define n-byte-unsigned-integer defines subclasses with a certain size.
- Superclasses:
- Operations:
See also
32 Bit Frames#
The <integer> type in Dylan is represented by only 30 bits, thus 32 bit frames which should be represented as a <number> require a workaround. The workaround consists of using <fixed-size-byte-vector-frame> and converting to <double-float> values.
Note
This hack is awful and should be replaced by native 32 bit integers, or machine words.
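Why the workaround is lossless can be checked numerically: a double-precision float carries a 53 bit significand, so every unsigned 32 bit value has an exact floating point representation. The Python sketch below models the idea only; `be_bytes_to_float` and `float_to_be_bytes` are hypothetical helpers, not the library's byte-vector-to-float-be and float-to-byte-vector-be, and treating the 30 bit <integer> as signed is an assumption.

```python
MAX_U32 = 2 ** 32 - 1
MAX_30_BIT_SIGNED = 2 ** 29 - 1  # assumed upper bound of a 30 bit signed integer


def be_bytes_to_float(data: bytes) -> float:
    """Interpret four big-endian bytes as an unsigned value, held in a float."""
    if len(data) != 4:
        raise ValueError("expected exactly four bytes")
    return float(int.from_bytes(data, "big"))


def float_to_be_bytes(value: float) -> bytes:
    """Convert a whole-number float in the 32 bit range back to four bytes."""
    as_int = int(value)
    if as_int != value or not 0 <= as_int <= MAX_U32:
        raise ValueError("not an unsigned 32 bit whole number")
    return as_int.to_bytes(4, "big")


largest = be_bytes_to_float(bytes.fromhex("ffffffff"))
round_trip = float_to_be_bytes(largest)
```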
- <big-endian-unsigned-integer-4byte> Class#
- Superclasses:
- <little-endian-unsigned-integer-4byte> Class#
- Superclasses:
- big-endian-unsigned-integer-4byte Generic function#
- Signature:
big-endian-unsigned-integer-4byte (data) => (#rest results)
- Parameters:
data – An instance of <object>.
- Values:
#rest results – An instance of <object>.
- little-endian-unsigned-integer-4byte Generic function#
- Signature:
little-endian-unsigned-integer-4byte (data) => (#rest results)
- Parameters:
data – An instance of <object>.
- Values:
#rest results – An instance of <object>.
- byte-vector-to-float-be Function#
- Signature:
byte-vector-to-float-be (bv) => (res)
- Parameters:
bv – An instance of <stretchy-byte-vector-subsequence>.
- Values:
res – An instance of <float>.
- byte-vector-to-float-le Function#
- Signature:
byte-vector-to-float-le (bv) => (res)
- Parameters:
bv – An instance of <stretchy-byte-vector-subsequence>.
- Values:
res – An instance of <float>.
- float-to-byte-vector-be Function#
- Signature:
float-to-byte-vector-be (float) => (res)
- Parameters:
float – An instance of <float>.
- Values:
res – An instance of <byte-vector>.
- float-to-byte-vector-le Function#
- Signature:
float-to-byte-vector-le (float) => (res)
- Parameters:
float – An instance of <float>.
- Values:
res – An instance of <byte-vector>.
Stretchy Vector Subsequences#
The underlying byte vector which is used in binary data is a <stretchy-byte-vector>. To allow zero-copy parsing, while providing each frame parser with only a byte vector of the size required for its type, there is a <stretchy-vector-subsequence>, which tracks the byte vector together with a start and end index.
Note
Should live in a separate module and types can be narrowed a bit further.
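The zero-copy idea, a window defined by a start and end index over a shared buffer, has a close analogue in Python's `memoryview`. The sketch below illustrates the concept only and does not mirror the Dylan classes; note also that, unlike a stretchy vector, a bytearray cannot be resized while a view on it exists.

```python
# Shared buffer standing in for the underlying byte vector.
packet = bytearray(bytes(range(8)))

whole = memoryview(packet)
payload = whole[2:6]  # window from start index 2 to end index 6, no copy

payload[0] = 0xFF     # writing through the window changes the buffer

copied = bytes(payload)  # an explicit copy, made only when asked for
```

A parser handed `payload` sees exactly four bytes and cannot read past its window, yet no data was duplicated to achieve that.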
- <stretchy-vector-subsequence> Abstract Class#
- Superclasses:
<vector>
- Init-Keywords:
data –
end –
start –
- subsequence Generic function#
- Signature:
subsequence (seq) => (#rest results)
- Parameters:
seq – An instance of <object>.
- Values:
#rest results – An instance of <object>.
- <stretchy-byte-vector-subsequence> Class#
- Superclasses:
- decode-integer Generic function#
- Signature:
decode-integer (seq count) => (#rest results)
- Parameters:
seq – An instance of <object>.
count – An instance of <object>.
- Values:
#rest results – An instance of <object>.
- encode-integer Generic function#
- Signature:
encode-integer (value seq count) => (#rest results)
- Parameters:
value – An instance of <object>.
seq – An instance of <object>.
count – An instance of <object>.
- Values:
#rest results – An instance of <object>.