Python. Module struct. Packing / unpacking data. Basic methods

1. Using the struct module. Packed binary data

The Python struct module is used to create packed binary data and to extract values from it. The module works with packed binary data held in objects of type bytes or bytearray.

The module contains conversion tools between Python values and C structures, which are represented as Python byte objects. Such conversions are used in processing binary data that is stored in files or obtained from network connections, etc.
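As a minimal sketch of this use case, the snippet below parses a small binary header from an in-memory stream that stands in for a file or a network connection. The header layout (a 4-byte tag, a version number and a payload length) and all names are hypothetical and chosen only for illustration.

```python
import io
import struct

# Hypothetical header layout (an assumption for this sketch):
# 4-byte tag, unsigned short version, unsigned int payload length, big-endian
HEADER_FORMAT = '>4sHI'
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 4 + 2 + 4 bytes

# An in-memory stream stands in for a file or a network connection
payload = b'hello'
stream = io.BytesIO(struct.pack(HEADER_FORMAT, b'DEMO', 1, len(payload)) + payload)

# Read exactly the header bytes, then convert them to Python values
tag, version, length = struct.unpack(HEADER_FORMAT, stream.read(HEADER_SIZE))

# The length field tells how many payload bytes follow the header
body = stream.read(length)

print(tag, version, length, body)
```

Reading a fixed-size header first and then the number of bytes it announces is a common pattern for binary formats, but the exact layout always depends on the format being parsed.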

Format strings give a compact description of the layout of a C structure and control the conversion to and from Python values.

 

2. The basic methods of the struct module

The struct module contains several basic functions that you can use to pack and unpack data.

2.1. The pack() and unpack() methods. Packing and unpacking data

The pack() and unpack() functions are used for packing and unpacking data. Both work according to a format string.

According to the documentation, the general form of a pack() call is as follows:

obj = struct.pack(format, v1, v2, ...)

where

  • format – format string. This string is built according to the rules given in the tables of section 3;
  • v1, v2, … – values (objects) to be packed;
  • obj – packed binary object.

The unpack() function performs the inverse of pack(). It restores the original values from a packed object. The general form of the call is as follows:

obj = struct.unpack(format, buffer)

where

  • buffer – a bytes or bytearray object that holds data previously packed by the pack() function. The size of this object in bytes must match the size required by format;
  • format – a format string that describes how the packed data in buffer is interpreted;
  • obj – the result, which is always a tuple, even if it contains exactly one item.

When calling the unpack() function, the format string must match the one that was passed to the pack() function.
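Two details of unpack() are easy to overlook: the result is a tuple even for a single value, and a buffer whose size does not match the format raises struct.error. The following short sketch illustrates both; the variable names are illustrative only.

```python
import struct

# unpack() returns a tuple even when the format describes one value
packed = struct.pack('<h', 258)
result = struct.unpack('<h', packed)
print(result)        # a 1-element tuple

# Tuple unpacking extracts the number itself
(value,) = result
print(value)

# The buffer size must equal the size required by the format:
# '<i' needs 4 bytes, but packed holds only 2
try:
    struct.unpack('<i', packed)
    size_error = None
except struct.error as exc:
    size_error = exc

print('error:', size_error)
```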

Example. For demonstration purposes, a list of numbers is packed and then unpacked.

# Module struct. Methods pack(), unpack()
# Pack/unpack list of numbers

# 1. Specified list of numbers
LS = [ 1, 3, 9, 12 ]

# 2. Include module struct
import struct

# 3. Pack list of numbers. Method pack()
pack_obj = struct.pack('>4i', LS[0], LS[1], LS[2], LS[3])

# 4. Display the object pack_obj
print('pack_obj = ', pack_obj)

# 5. Unpack list of numbers. Method unpack().
#   The result is a tuple T2
T2 = struct.unpack('>4i', pack_obj) # T2 = (1, 3, 9, 12)

# 6. Print the unpacked object T2
print('T2 = ', T2)

# 7. Convert tuple T2 to list LS2
LS2 = list(T2) # LS2 =   [1, 3, 9, 12]

# 8. Display the list LS2
print('LS2 = ', LS2)

The result of the program:

pack_obj =  b'\x00\x00\x00\x01\x00\x00\x00\x03\x00\x00\x00\t\x00\x00\x00\x0c'
T2 =  (1, 3, 9, 12)
LS2 =  [1, 3, 9, 12]

 

2.2. Method calcsize(). The size of packed object

The calcsize() function takes a format string and returns the size in bytes of the packed object that pack() creates for that format.

Example.

# Module struct.
# Method calcsize(). Determine the size of the packed object

# 1. Include module struct
import struct

# 2. Determine the size of a packed list of numbers
# 2.1. Specified list of floating point numbers
LS = [ 2.88, 3.9, -10.5 ]

# 2.2. Pack list LS. Method pack()
pack_obj = struct.pack('>3f', LS[0], LS[1], LS[2])

# 2.3. Display the packed object pack_obj
print('pack_obj = ', pack_obj)

# 2.4. Display the size of pack_obj
size = struct.calcsize('>3f') # size = 12
print('size = ', size)

# 3. Determine the size of a packed tuple of strings
# 3.1. The specified tuple of two strings
TS = ( 'Hello', 'abcd')

# 3.2. Pack the tuple TS
pack_obj = struct.pack('<5s4s', TS[0].encode(), TS[1].encode())

# 3.3. Display the packed object
print('pack_obj = ', pack_obj)

# 3.4. Display the size of packed tuple
size = struct.calcsize('<5s4s') # size = 9
print('size = ', size)

The result of the program:

pack_obj =  b'@8Q\xec@y\x99\x9a\xc1(\x00\x00'
size =  12
pack_obj =  b'Helloabcd'
size =  9
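Since calcsize() depends only on the format string, it can also be used to split a buffer that holds several packed records of the same layout. The following is a minimal sketch under the assumption of a hypothetical record made of a short and a float; the record layout and names are not part of the struct module.

```python
import struct

# Hypothetical record layout (an assumption): short id + float value
RECORD_FORMAT = '<hf'
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)  # 2 + 4 bytes, '<' adds no padding

# Build a buffer that holds three records back to back
source = [(1, 0.5), (2, 1.5), (3, -2.0)]
buffer = b''.join(struct.pack(RECORD_FORMAT, n, x) for n, x in source)

# Walk the buffer in steps of RECORD_SIZE and unpack each slice
records = [
    struct.unpack(RECORD_FORMAT, buffer[pos:pos + RECORD_SIZE])
    for pos in range(0, len(buffer), RECORD_SIZE)
]

print('record size =', RECORD_SIZE)
print('records =', records)
```

The float values in this sketch were chosen so that they are represented exactly in 4 bytes; other values may come back slightly rounded.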

 

3. Format strings
3.1. Set byte order, size and alignment based on format character

In Python, the first character of the format string determines how the data is packed. This character defines:

  • the byte order, selected with one of the characters @, =, <, >, !. If the format string does not start with one of these characters, @ is assumed;
  • the size in bytes of each packed value, which is either native or standard;
  • the alignment, which is either native (padding is set by the system) or absent.

According to the Python documentation, the possible first characters of the format string are shown in the following table.

Character  Byte order               Size      Alignment
@          native (host dependent)  native    native
=          native                   standard  none
<          little-endian            standard  none
>          big-endian               standard  none
!          network (= big-endian)   standard  none

A byte order value can be one of four:

  • native order. This order is either little-endian or big-endian, depending on the host system;
  • little-endian order. In this order, the low (least significant) byte is stored first, followed by the higher bytes;
  • big-endian order. In this order, the high (most significant) byte is stored first, followed by the lower bytes;
  • network order, which is always big-endian.
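The difference between the byte orders can be seen by packing the same number with different first characters. This is a small illustrative sketch; the variable names are arbitrary.

```python
import struct
import sys

value = 1

little = struct.pack('<i', value)    # low byte first
big = struct.pack('>i', value)       # high byte first
network = struct.pack('!i', value)   # network order, same as big-endian
native = struct.pack('=i', value)    # byte order of the host system

print(little)    # b'\x01\x00\x00\x00'
print(big)       # b'\x00\x00\x00\x01'
print(network == big)

# sys.byteorder reports the native order as 'little' or 'big'
expected_native = little if sys.byteorder == 'little' else big
print(sys.byteorder, native == expected_native)
```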

The size of the packed data can be one of two kinds:

  • native – determined by the sizeof expression of the C compiler on the host platform;
  • standard – determined only by the format character, in accordance with the table below.
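The effect of native size and alignment can be checked with calcsize(). In the sketch below the standard size of the format bi is always 5 bytes, while the native size depends on the platform, so no fixed value is assumed for it.

```python
import struct

# Standard size, no alignment: 1 byte (b) + 4 bytes (i)
standard = struct.calcsize('=bi')

# Native size and alignment: the system may insert padding
# between the signed char and the int
native = struct.calcsize('@bi')

# A format string without a first character behaves like '@'
default = struct.calcsize('bi')

print('standard =', standard)
print('native   =', native)
print('default  =', default)
```

On many common platforms the native result is 8 because the int is aligned to a 4-byte boundary, but this is a property of the host system and not a guarantee.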

Table. Definition of the standard size of packed data depending on the format character

Format  C type              Python type        Standard size
x       pad byte            no value           -
c       char                bytes of length 1  1
b       signed char         integer            1
B       unsigned char       integer            1
?       _Bool               bool               1
h       short               integer            2
H       unsigned short      integer            2
i       int                 integer            4
I       unsigned int        integer            4
l       long                integer            4
L       unsigned long       integer            4
q       long long           integer            8
Q       unsigned long long  integer            8
n       ssize_t             integer            -
N       size_t              integer            -
e       IEEE 754 binary16   float              2
f       float               float              4
d       double              float              8
s       char[]              bytes              -
p       char[]              bytes              -
P       void*               integer            -
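The standard sizes from the table can be verified with calcsize() by using a first character that selects standard sizes, for example <. This is a small check written for illustration; the dictionary below simply repeats the table values.

```python
import struct

# Standard sizes as listed in the table (format character -> bytes)
expected = {
    'c': 1, 'b': 1, 'B': 1, '?': 1,
    'h': 2, 'H': 2, 'e': 2,
    'i': 4, 'I': 4, 'l': 4, 'L': 4, 'f': 4,
    'q': 8, 'Q': 8, 'd': 8,
}

# '<' selects standard sizes, so the results do not depend on the platform
actual = {ch: struct.calcsize('<' + ch) for ch in expected}

for ch in expected:
    print(ch, actual[ch])

# For 's' the size is given by the count in front of the character
string_size = struct.calcsize('<10s')
print('10s', string_size)
```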

 

3.2. Examples of format strings for different data types

 

ii   - two values of type int
2i   - two values of type int (repeat count, same as ii)
10f  - 10 values of type float
>i8s - big-endian byte order, one value of type int, a byte string of 8 bytes
8dif - 8 values of type double, 1 value of type int, 1 value of type float
=bi  - native byte order with standard sizes and no alignment, one signed char, one value of type int
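Some of these format strings can be tried out directly. The sketch below, with arbitrary sample values, shows that a repeat count is equivalent to repeating the character and that the s format pads a shorter byte string with zero bytes up to the given length.

```python
import struct

# 'ii' and '2i' describe the same layout
same_layout = struct.pack('<ii', 1, 2) == struct.pack('<2i', 1, 2)
print(same_layout)

# '>i8s': a big-endian int followed by exactly 8 bytes;
# a shorter byte string is padded with zero bytes
packed = struct.pack('>i8s', 7, b'abc')
print(packed)   # b'\x00\x00\x00\x07abc\x00\x00\x00\x00\x00'

# After unpacking, the padding bytes are still part of the byte string
number, raw = struct.unpack('>i8s', packed)
text = raw.rstrip(b'\x00').decode()
print(number, raw, text)
```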

 

