An XRH file stores data as a rectangular array of entries. Each entry is a set of samples, and every entry in the array has the same number of samples.
For example, if the data represent an RGB (color) image, each entry is a pixel (picture element) that has three samples per entry: one each for red, green, and blue.
The set of corresponding samples of all entries is called a channel. For example, the red samples of all pixels in an RGB image form the red channel.
A channel is also called an image plane.
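The relationship between entries, samples and channels can be illustrated with a small sketch. The 2x2 image and the helper name `extract_channel` are made up for illustration and are not part of the XRH format.

```python
# A 2x2 RGB image as a rectangular array of entries (pixels).
# Each entry is a tuple of three samples: (red, green, blue).
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]


def extract_channel(entries, index):
    """Return the plane formed by sample number `index` of every entry."""
    return [[entry[index] for entry in row] for row in entries]


red = extract_channel(image, 0)    # the red channel (image plane)
green = extract_channel(image, 1)  # the green channel
blue = extract_channel(image, 2)   # the blue channel
```

Each extracted channel has the same rectangular shape as the array of entries, with one sample per entry.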
We refer to data that will be stored in an XRH file as Original Data (abbreviated ORIG).
The following types of Original Data are supported in this XRH version:
ORIG_IEEE is the standard IEEE floating point number type, 4 or 8 bytes in size (sample width of 32 or 64 bits respectively). These are the two floating point formats supported natively by the hardware of virtually all personal computers.
Sample width is the size of a sample in bits (there are 8 bits in each byte).
ORIG_UINT is the standard unsigned integer number type. Its size must be a whole number of bytes, so the sample width must be a multiple of 8.
Technical note: ORIG_IEEE is only for standard built-in IEEE floating point numbers. If you need to store custom floating point numbers, either convert the samples to ORIG_IEEE format, or specify ORIG_UINT to treat the custom floating point numbers as unsigned integers.
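As a rough illustration of the second option, the sketch below reinterprets the bits of a floating point number as an unsigned integer of the same size. The 16-bit half-precision format stands in for a "custom" float here only because it is neither 4 nor 8 bytes; that choice, and the function names, are assumptions made for the example.

```python
import struct


def half_float_as_uint(value):
    """Reinterpret the 2 bytes of a 16-bit float as a 16-bit unsigned integer."""
    raw = struct.pack(">e", value)      # 16-bit half-precision bytes
    return struct.unpack(">H", raw)[0]  # same bits, read as an unsigned int


def uint_as_half_float(bits):
    """Inverse of half_float_as_uint: recover the float from its bit pattern."""
    raw = struct.pack(">H", bits)
    return struct.unpack(">e", raw)[0]
```

No numeric conversion takes place: the bit pattern is unchanged, so the original values can be recovered exactly by reversing the reinterpretation after the file is read.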
Storage Data may be the same numeric type as the Original Data, or it may be a different type.
All samples in an XRH file are compressed. The Storage Data type depends on what type of compression is used.
Each channel specifies its own type of compression. Different channels may use different compression types. Within a channel, all samples are compressed using the same compression.
If the compression type supports the Original Data type, then the Storage Data type can be the same as the Original Data type.
If the compression type does not support the Original Data type, then the data must be converted to one of the data types that the compression supports.
Technical note: Sometimes it is useful to use a different data type even if the compression supports the Original Data type. That will be discussed later.
The following Storage Data types are supported in this XRH version:
Additional Storage Data types may be added in the future.
Not all compression types support all Storage Data types. Each compression type will specify which Storage Data types it supports.
The Storage Data types are defined as follows.
STOR_IEEE is the standard 32 or 64 bit IEEE floating point number type (same as ORIG_IEEE). This is only supported for SZMod compression.
An IEEE floating point number has a leading sign bit, followed by an exponent, followed by the significand (mantissa):
The bit positions within a number are zero-based, beginning with the least significant bit (on the right in this diagram).
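The following sketch splits a 32-bit IEEE floating point number into its three fields, using the standard IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 significand bits) and the zero-based bit numbering described above. The function name is chosen for this example only.

```python
import struct


def float32_fields(value):
    """Return (sign, exponent, significand) of a 32-bit IEEE float."""
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    sign = (bits >> 31) & 0x1       # bit 31
    exponent = (bits >> 23) & 0xFF  # bits 30..23 (biased by 127)
    significand = bits & 0x7FFFFF   # bits 22..0
    return sign, exponent, significand
```

For instance, -6.5 equals -1.625 x 2^2, so its sign is 1, its biased exponent is 129, and its significand field is 0x500000.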
For a 32-bit floating point number,
the sign bit is stored in
STOR_UINT is the standard unsigned integer type (same as ORIG_UINT). This is supported for Zebra and JP2K compression.
Zebra compression supports STOR_UINT samples of one or more bytes.
JP2K compression supports STOR_UINT with a Sample Width of 24 or 16 bits, and Padding of 8 or 16 bits respectively.
The IEEE Bit Filter (STOR_IEEE_FILT) is a lossless encoding of IEEE floating point numbers that filters the data for better ZST compressibility. This is only supported for Zebra compression.
For this encoding, each sample is converted from IEEE floating point to unsigned integer, using a variation of a mapping that was developed by Lindstrom and Isenburg.
To perform this mapping, first copy the floating point bit sequence of each sample directly into integer memory of the same byte length (bitwise bijection), then toggle the leading sign bit if the number is positive, or toggle all bits if it is negative.
And then move the sign bit from bit position 31 to bit position 23, so that the exponent moves from
For data retrieval, after decompressing the Byte Channels, the Byte Channels are merged (shuffled) back into unsigned integers. The unsigned integers are then inverse mapped (reverse filtered) back to the original IEEE floating point numbers.
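A minimal sketch of the mapping and its inverse for 32-bit samples is given below. It is not the reference implementation. The toggle step follows the description above; the sign-bit relocation assumes that the eight exponent bits shift up by one position (to bits 31..24) when the sign bit moves to bit 23, which is an interpretation. All function names are hypothetical, and the split into Byte Channels is not shown.

```python
import struct

SIGN_BIT = 0x80000000
ALL_BITS = 0xFFFFFFFF


def float_to_bits(x):
    """Copy the 32-bit float bit sequence into an unsigned integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def bits_to_float(u):
    return struct.unpack(">f", struct.pack(">I", u))[0]


def toggle(u):
    # Negative (sign bit set): toggle all bits. Positive: toggle the sign bit.
    return u ^ ALL_BITS if u & SIGN_BIT else u ^ SIGN_BIT


def untoggle(m):
    # After toggling, originally positive numbers have the top bit set.
    return m ^ SIGN_BIT if m & SIGN_BIT else m ^ ALL_BITS


def move_sign_down(m):
    # Assumption: sign goes from bit 31 to bit 23, exponent to bits 31..24.
    sign, exponent, fraction = m >> 31, (m >> 23) & 0xFF, m & 0x7FFFFF
    return (exponent << 24) | (sign << 23) | fraction


def move_sign_up(v):
    exponent, sign, fraction = v >> 24, (v >> 23) & 0x1, v & 0x7FFFFF
    return (sign << 31) | (exponent << 23) | fraction


def filter_encode(x):
    return move_sign_down(toggle(float_to_bits(x)))


def filter_decode(v):
    return bits_to_float(untoggle(move_sign_up(v)))
```

Every step is a bijection on 32-bit patterns, so `filter_decode(filter_encode(x))` returns the original float32 value bit for bit, which is what makes the encoding lossless.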
The Unsigned 24-bit Float (STOR_U_FLOAT24) is a lossy encoding
A bias only needs to be added if any of the numbers in the channel are negative. In that case, the same bias is used for all numbers of the channel.
Adding a bias to make the number non-negative ensures the sign bit is clear, at the expense of not allowing numbers at the high end of the IEEE exponent range.
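The biasing step might be sketched as follows. How the bias value is chosen is not stated here, so using the negated channel minimum is purely an assumption for this example, as are the function names.

```python
def channel_bias(samples):
    """Return one bias for the whole channel; 0.0 when nothing is negative."""
    lowest = min(samples)
    # Assumption: the smallest bias that makes every sample non-negative.
    return -lowest if lowest < 0.0 else 0.0


def apply_bias(samples):
    """Add the same bias to every sample of the channel."""
    bias = channel_bias(samples)
    return bias, [s + bias for s in samples]
```

The bias has to be kept with the channel so that it can be subtracted again when the data is retrieved.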
With the sign bit cleared, the number is upshifted,
The resulting 3-byte number has no sign bit,
The Unsigned Scaled Integer (STOR_USI_FLOAT24)
is a usually lossy encoding of
Then the non-negative
Technical note: The scalar specifies how many unique sample values there can be from one integer to the next. For example, if the scalar is 2500.0, there are 2500 possible unique sample values from 0.0 to 1.0, there are 2500 possible values from 1.0 to 2.0, etc.
After the floating point number is scaled (multiplied by the scalar), it is cast to unsigned integer, which truncates the fraction part of the scaled number.
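The scale-and-truncate step can be sketched as shown below, assuming the input has already been made non-negative. The function names are hypothetical, and the inverse shown (division by the scalar) is the natural reading rather than something spelled out here.

```python
def scale_to_uint(value, scalar):
    """Multiply by the scalar and truncate the fraction part."""
    if value < 0.0:
        raise ValueError("value must be non-negative before scaling")
    return int(value * scalar)  # int() truncates toward zero


def unscale(stored, scalar):
    """Approximate inverse: recover the value to within 1/scalar."""
    return stored / scalar
```

With a scalar of 2500.0, the sample 1.2345 scales to 3086.25 and is stored as 3086, which decodes to 1.2344. The difference is the truncation loss, and it is always smaller than 1/2500.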
The lower 3 bytes
of the resulting integer
For JP2K compression, each
Technical note: Other types of data encoding for JP2K compression may be possible. Let us know if you get a different method working that you would like to recommend.
All samples of a channel have the same type, size and padding. Different channels can have different types of samples.
The Sample Definition (abbr. Sample Def) of a channel is a sequence of four bytes that specifies the type, size and padding of all the samples in the channel.
The first byte (8 bits) of the Sample Definition specifies the type of sample, and must be one of the following values:
The second byte (8 bits) of the Sample Definition specifies how many pad bits are inserted between samples (how many blank bits are inserted between the end of a sample and the beginning of the next sample, usually zero). The third and fourth bytes (16 bits) specify the sample width (number of bits in each sample).
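As an illustration of this layout, the sketch below packs and unpacks a four-byte Sample Definition, with the 16-bit sample width in big-endian order as XRH uses for multi-byte numbers. The type code `0x01` is a placeholder and not a value taken from the XRH specification; the function names are likewise hypothetical.

```python
import struct

HYPOTHETICAL_TYPE_CODE = 0x01  # placeholder only; not an actual XRH type value


def pack_sample_def(type_code, pad_bits, sample_width):
    """Byte 1: type, byte 2: pad bits, bytes 3-4: sample width (big endian)."""
    return struct.pack(">BBH", type_code, pad_bits, sample_width)


def unpack_sample_def(raw):
    """Return (type_code, pad_bits, sample_width) from four bytes."""
    type_code, pad_bits, sample_width = struct.unpack(">BBH", raw)
    return type_code, pad_bits, sample_width


# Example: 24-bit samples followed by 8 pad bits.
sample_def = pack_sample_def(HYPOTHETICAL_TYPE_CODE, 8, 24)
```

Here `sample_def` is the byte sequence `01 08 00 18`, and `unpack_sample_def(sample_def)` returns the three fields again.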
XRH files store multi-byte numbers in network byte order (big endian),
so that these four bytes can be stored as a