Книга: Practical Common Lisp

Binary Format Basics

Binary Format Basics

The starting point for reading and writing binary files is to open the file for reading or writing individual bytes. As I discussed in Chapter 14, both OPEN and WITH-OPEN-FILE accept a keyword argument, :element-type, that controls the basic unit of transfer for the stream. When you're dealing with binary files, you'll specify (unsigned-byte 8). An input stream opened with such an :element-type will return an integer between 0 and 255 each time it's passed to READ-BYTE. Conversely, you can write bytes to an (unsigned-byte 8) output stream by passing numbers between 0 and 255 to WRITE-BYTE.

Above the level of individual bytes, most binary formats use a smallish number of primitive data types—numbers encoded in various ways, textual strings, bit fields, and so on—which are then composed into more complex structures. So your first task is to define a framework for writing code to read and write the primitive data types used by a given binary format.

To take a simple example, suppose you're dealing with a binary format that uses an unsigned 16-bit integer as a primitive data type. To read such an integer, you need to read the two bytes and then combine them into a single number by multiplying one byte by 256, a.k.a. 2^8, and adding it to the other byte. For instance, assuming the binary format specifies that such 16-bit quantities are stored in big-endian[262] form, with the most significant byte first, you can read such a number with this function:

(defun read-u2 (in)
(+ (* (read-byte in) 256) (read-byte in)))

However, Common Lisp provides a more convenient way to perform this kind of bit twiddling. The function LDB, whose name stands for load byte, can be used to extract and set (with SETF) any number of contiguous bits from an integer.[263] The number of bits and their position within the integer is specified with a byte specifier created with the BYTE function. BYTE takes two arguments, the number of bits to extract (or set) and the position of the rightmost bit where the least significant bit is at position zero. LDB takes a byte specifier and the integer from which to extract the bits and returns the positive integer represented by the extracted bits. Thus, you can extract the least significant octet of an integer like this:

(ldb (byte 8 0) #xabcd) ==> 205 ; 205 is #xcd

To get the next octet, you'd use a byte specifier of (byte 8 8) like this:

(ldb (byte 8 8) #xabcd) ==> 171 ; 171 is #xab

You can use LDB with SETF to set the specified bits of an integer stored in a SETFable place.

CL-USER> (defvar *num* 0)
*NUM*
CL-USER> (setf (ldb (byte 8 0) *num*) 128)
128
CL-USER> *num*
128
CL-USER> (setf (ldb (byte 8 8) *num*) 255)
255
CL-USER> *num*
65408

Thus, you can also write read-u2 like this:[264]

(defun read-u2 (in)
(let ((u2 0))
(setf (ldb (byte 8 8) u2) (read-byte in))
(setf (ldb (byte 8 0) u2) (read-byte in))
u2))

To write a number out as a 16-bit integer, you need to extract the individual 8-bit bytes and write them one at a time. To extract the individual bytes, you just need to use LDB with the same byte specifiers.

(defun write-u2 (out value)
(write-byte (ldb (byte 8 8) value) out)
(write-byte (ldb (byte 8 0) value) out))

Of course, you can also encode integers in many other ways—with different numbers of bytes, with different endianness, and in signed and unsigned format.

Оглавление книги


Генерация: 8.199. Запросов К БД/Cache: 3 / 0
поделиться
Вверх Вниз