Alternative University

Computer Science

Raster Image Storage

Higher Precision Extended Range

XRH File Format 1.0

Zebra Compression

If the Compression Type integer of a Channel Block is 0x5A4201000000 (Zebra), a Zebra Compression Stream immediately follows the Compression Type integer, and the end-of-channel-block marker immediately follows the Zebra Compression Stream.

Figure 1:  Channel Block (image plane) of an XRH 1.0 file, with Compression Type set to 0x5A4201000000, stores a Zebra Compression Stream as the Channel Block Data (yellow in this diagram).

Zebra is a simple lossless compression scheme we pieced together for floating point numbers. It splits each floating point sample into bytes and stores the corresponding bytes of the different samples together: the most significant bytes of all the samples are stored together, the next most significant bytes of all the samples are stored together, etc.

Each such group of bytes is called a Byte Channel. If the samples are FLOAT32 numbers (4 bytes per sample), then there are 4 Byte Channels. If the samples are FLOAT64 numbers (8 bytes per sample), then there are 8 Byte Channels.

Byte Channels are stored in Big Endian order: the most significant Byte Channel first, the next most significant Byte Channel next, etc. The least significant bytes of all the image channel samples form the last Byte Channel, the second least significant bytes form the second-to-last Byte Channel, etc.
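The byte splitting described above can be sketched as follows. This is a minimal illustration, not the reference implementation: the function name `split_byte_channels` is hypothetical, and the samples are assumed to be held already as unsigned integer bit patterns.

```python
def split_byte_channels(samples, bytes_per_sample):
    """Split unsigned-integer samples into Byte Channels, most significant first."""
    channels = []
    for k in range(bytes_per_sample):
        # Channel 0 takes the top byte of every sample, channel 1 the next, etc.
        shift = 8 * (bytes_per_sample - 1 - k)
        channels.append(bytes((s >> shift) & 0xFF for s in samples))
    return channels


# Three 32-bit samples give 4 Byte Channels of 3 bytes each.
samples = [0x11223344, 0xAABBCCDD, 0x01020304]
channels = split_byte_channels(samples, 4)
```

Here `channels[0]` holds the most significant byte of every sample and `channels[3]` holds the least significant byte of every sample.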


Zebra Compression Stream

The Zebra Compression Stream begins with a start-of-zebra marker, followed by a 64-bit Size integer that stores the size of the Zebra Compression Stream (including the start-of-zebra marker and this Size integer). Three 32-bit integers follow, in this order: the Sample Stride (must equal the Channel Block Sample Stride), the Image Width (must equal the File Header Image Width), and the Image Height (must equal the File Header Image Height). The Byte Channels come next (4 Byte Channels for FLOAT32 or 8 Byte Channels for FLOAT64), followed by the end-of-zebra marker that marks the end of the Zebra stream.

Figure 2:  Zebra Compression Stream as the Data of a Channel block.

Note: The Zebra Compression Stream is self-contained and may be used in any file format (not just in an XRH file).

The first four bytes of a Zebra Compression Stream store the value 0x535A4200 (ASCII string SZB\0) which marks the start-of-zebra.

Start of Zebra Stream Marker:  SZB\0

The last four bytes of a Zebra Compression Stream store the value 0x455A4200 (ASCII string EZB\0) which marks the end-of-zebra.

End of Zebra Stream Marker:  EZB\0
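As a rough sketch of the stream layout, the fixed-size fields could be read like this. Several things here are assumptions not stated on this page: the byte order of the integers (exposed as the `byteorder` parameter, little-endian by default), treating them as unsigned, and the Size field counting the whole stream through the end-of-zebra marker. The name `parse_zebra_header` and the example values are hypothetical.

```python
import struct

START_OF_ZEBRA = b"SZB\x00"  # 0x535A4200
END_OF_ZEBRA = b"EZB\x00"    # 0x455A4200
HEADER_SIZE = 4 + 8 + 4 + 4 + 4  # marker, Size, Sample Stride, Width, Height


def parse_zebra_header(stream, byteorder="<"):
    """Read the fixed fields of a Zebra Compression Stream.

    byteorder is an ASSUMPTION ("<" little-endian or ">" big-endian).
    """
    if stream[:4] != START_OF_ZEBRA:
        raise ValueError("missing start-of-zebra marker")
    size, stride, width, height = struct.unpack_from(byteorder + "QIII", stream, 4)
    # ASSUMPTION: Size covers everything from SZB\0 through EZB\0.
    if size < HEADER_SIZE + 4 or size > len(stream):
        raise ValueError("Size field is inconsistent with the buffer")
    if stream[size - 4:size] != END_OF_ZEBRA:
        raise ValueError("missing end-of-zebra marker")
    return {
        "size": size,
        "sample_stride": stride,
        "width": width,
        "height": height,
        "byte_channels": stream[HEADER_SIZE:size - 4],  # still framed and compressed
    }


# Hypothetical stream with no Byte Channels, only to exercise the parser.
example = START_OF_ZEBRA + struct.pack("<QIII", 28, 4, 640, 480) + END_OF_ZEBRA
header = parse_zebra_header(example)
```

A real reader would also compare the Sample Stride, Image Width, and Image Height against the Channel Block and File Header values, as the text requires.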


Byte Channel

The data bytes of a Byte Channel are compressed using ZST (a lossless compressor, also called “Zstandard”).

Each Byte Channel consists of a start-of-byte-channel marker, followed with a 64-bit integer that specifies the size of the ZST stream that stores the data bytes of the Byte Channel, followed with that ZST stream, followed with an end-of-byte-channel marker.

Figure 3:  Byte Channel.

The first four bytes of a Byte Channel store the value 0x53424300 (ASCII string SBC\0) which marks the start-of-byte-channel.

Start of Byte Channel Marker:  SBC\0

The last 4 bytes of a Byte Channel store the value 0x45424300 (ASCII string EBC\0) which marks the end-of-byte-channel.

End of Byte Channel Marker:  EBC\0

The ZST stream of each Byte Channel is a compressed byte stream that, when uncompressed, provides the sample bytes of the Byte Channel in raster order.
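The Byte Channel framing can be sketched as below. The ZST stream is treated as an opaque byte string, so no Zstandard library is called here. The little-endian default for the 64-bit size integer is an assumption (this page does not state the byte order), and the function names are hypothetical.

```python
import struct

START_OF_BYTE_CHANNEL = b"SBC\x00"  # 0x53424300
END_OF_BYTE_CHANNEL = b"EBC\x00"    # 0x45424300


def write_byte_channel(zst_stream, byteorder="<"):
    """Wrap an already-compressed ZST stream in Byte Channel framing."""
    return (START_OF_BYTE_CHANNEL
            + struct.pack(byteorder + "Q", len(zst_stream))
            + zst_stream
            + END_OF_BYTE_CHANNEL)


def read_byte_channel(buf, offset=0, byteorder="<"):
    """Return (zst_stream, offset just past the end-of-byte-channel marker)."""
    if buf[offset:offset + 4] != START_OF_BYTE_CHANNEL:
        raise ValueError("missing start-of-byte-channel marker")
    (size,) = struct.unpack_from(byteorder + "Q", buf, offset + 4)
    begin = offset + 12          # 4-byte marker + 8-byte size
    end = begin + size
    if buf[end:end + 4] != END_OF_BYTE_CHANNEL:
        raise ValueError("missing end-of-byte-channel marker")
    return buf[begin:end], end + 4


# Two Byte Channels back to back, with placeholder payloads standing in
# for real ZST streams.
packed = write_byte_channel(b"first") + write_byte_channel(b"second!")
first, next_offset = read_byte_channel(packed)
second, end_offset = read_byte_channel(packed, next_offset)
```

Because each Byte Channel carries its own size, a reader can walk from one Byte Channel to the next without decompressing anything.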

For example, to reconstruct the 5th sample of a Channel Block: the 5th data byte of the first Byte Channel becomes the high order byte of the 5th sample of the Channel Block, the 5th data byte of the second Byte Channel becomes the next most significant byte of the 5th sample of the Channel Block, etc.
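A minimal sketch of this merge step, assuming the Byte Channels have already been uncompressed into equal-length byte strings ordered most significant first (the name `merge_byte_channels` is hypothetical):

```python
def merge_byte_channels(channels):
    """Rebuild unsigned-integer samples from Byte Channels (most significant first)."""
    count = len(channels[0])
    if any(len(c) != count for c in channels):
        raise ValueError("all Byte Channels must have the same length")
    samples = []
    for i in range(count):
        value = 0
        for channel in channels:
            # Each later channel supplies the next less significant byte.
            value = (value << 8) | channel[i]
        samples.append(value)
    return samples


# Four Byte Channels of five bytes each (FLOAT32-sized samples).
channels = [
    bytes([0x00, 0x00, 0x00, 0x00, 0x3F]),  # most significant bytes
    bytes([0x00, 0x00, 0x00, 0x00, 0x80]),
    bytes(5),
    bytes(5),                               # least significant bytes
]
samples = merge_byte_channels(channels)
fifth = samples[4]  # built from the 5th byte of every Byte Channel
```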

Before compression, each sample must be converted (mapped) from floating point to unsigned integer. To perform this mapping, first copy the floating point bit sequence of each sample directly into integer memory (a bit-for-bit copy, not a numeric conversion). Then, if the sample is positive (sign bit clear), toggle only the leading sign bit; if the sample is negative (sign bit set), toggle all bits. The mapped samples (unsigned integers) are then split up into Byte Channels, and each Byte Channel is sent to the ZST encoder.
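The mapping can be sketched for FLOAT32 as follows. The function name `float32_to_mapped` is hypothetical, and `struct` is used only to obtain the raw bit pattern of the sample.

```python
import struct


def float32_to_mapped(value):
    """Map a FLOAT32 sample to an unsigned 32-bit integer."""
    # Bit-for-bit copy of the float into an unsigned integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    if bits & 0x80000000:
        return bits ^ 0xFFFFFFFF  # negative: toggle all bits
    return bits ^ 0x80000000      # positive: toggle only the sign bit


mapped = [float32_to_mapped(v) for v in (-1.0, -0.0, 0.0, 1.0)]
```

One consequence of this mapping is that, for values other than NaN, the mapped integers sort in the same order as the original floating point numbers.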

For data retrieval, after uncompressing, the byte channels are merged (shuffled) back into unsigned integers which are then inverse mapped back to the original floating point numbers.
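The retrieval path can be sketched for FLOAT32 as below, under the assumption that the Byte Channels have already been uncompressed. The names `mapped_to_float32` and `decode_float32_samples` are hypothetical.

```python
import struct


def mapped_to_float32(mapped):
    """Invert the float-to-unsigned mapping for one FLOAT32 sample."""
    if mapped & 0x80000000:
        bits = mapped ^ 0x80000000  # was positive: restore the sign bit
    else:
        bits = mapped ^ 0xFFFFFFFF  # was negative: toggle all bits back
    (value,) = struct.unpack(">f", struct.pack(">I", bits))
    return value


def decode_float32_samples(channels):
    """Merge 4 uncompressed Byte Channels (most significant first) into floats."""
    values = []
    for sample_bytes in zip(*channels):
        # The i-th byte of every channel forms the i-th mapped sample.
        mapped = int.from_bytes(bytes(sample_bytes), "big")
        values.append(mapped_to_float32(mapped))
    return values


# Mapped samples 0xBF800000 and 0x407FFFFF, split into 4 Byte Channels.
channels = [b"\xBF\x40", b"\x80\x7F", b"\x00\xFF", b"\x00\xFF"]
values = decode_float32_samples(channels)
```

The two mapped samples in this example decode to 1.0 and -1.0.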



Copyright © 2020 Arc Math Software, All rights reserved
Arc Math Software, P.O. Box 221190, Sacramento CA 95822 USA   Contact
2020–Oct–26  05:08  UTC