Draft of the QCMP v2 format.

Commit 058f4fbd authored 4 years ago by Vojtech Moravec
--- a/FileFormat.md
+++ b/FileFormat.md
@@ -2,7 +2,8 @@
 QCMP (Quantization compressed) file formats:
- [Version 1](#first-version-of-qcmp-file-header)
+- [Version 1](#qcmp-file-format-v1)
+- [Version 2](#qcmp-file-format-v2)
 QVC (Quantization Value Cache) file formats:
 - [Version 1](#qvc-file-format-v1)
@@ -10,8 +11,9 @@ QVC (Quantization Value Cache) file formats:
 Enumerations shared by all formats:
 - [Quantization type enumeration](#quantization-type-enumeration)
+- [Codebook type enumeration](#codebook-type-enumeration)
-## First version of QCMP file header
+## QCMP File Format V1
 ### Pros:
 - Very compact.
@@ -19,68 +21,100 @@ Enumerations shared by all formats:
 ### Cons:
 - No support for hyperstack images. (Images with multiple channels or timepoints)
 - No support for metadata.
- No reserver bytes left for future improvements.
+- No reserved bytes left for future improvements.
-| Offset          | Size   | Type        | Possible values                        | Note                                                |
+| Offset              | Size         | Type         | Note                                        |
-|-----------------|--------|-------------|----------------------------------------|-----------------------------------------------------|
+| ------------------- | ------------ | ------------ | ------------------------------------------- |
-|0                |8       |ASCII String |QCMPFILE                                |Magic value                                          |
+| 0                   | 8            | ASCII String | `QCMPFILE` Magic value                      |
-|8                |1       |u8           |0 = SQ, 1 = VQ 1D, 2 = VQ 2D, 3 = VQ 3D |Quantization type                                    |
+| 8                   | 1            | u8           | Quantization type                           |
-|9                |1       |u8           |1 – 255                                 |Bits Per Pixel (BPP), 2^BPP = Codebook size          |
+| 9                   | 1            | u8           | Bits Per Pixel (BPP), 2^BPP = Codebook size |
-|10               |1       |u8           |0, 1                                    |0 = One codebook for all, 1 = One Codebook per plane |
+| 10                  | 1            | u8           | Codebook type                               |
-|11               |2       |u16          |0 - 65535                               |Image size X                                         |
+| 11                  | 2            | u16          | Image size X                                |
-|13               |2       |u16          |0 - 65535                               |Image size Y                                         |
+| 13                  | 2            | u16          | Image size Y                                |
-|15               |2       |u16          |0 - 65535                               |Image size Z (ZSize)                                 |
+| 15                  | 2            | u16          | Image size Z                                |
-|17               |2       |u16          |0 - 65535                               |Vector size X                                        |
+| 17                  | 2            | u16          | Vector size X                               |
-|19               |2       |u16          |0 - 65535                               |Vector size Y                                        |
+| 19                  | 2            | u16          | Vector size Y                               |
-|21               |2       |u16          |0 - 65535                               |Vector size Z                                        |
+| 21                  | 2            | u16          | Vector size Z                               |
-|23               |4*ZSise |u32          |                                        |Plane data sizes                                     |
+| 23                  | 4*ChunkCount | u32[]        | Size of compressed image chunks             |
-|23 + (4*ZSise)   |        |u8[]         |                                        |Data                                                 |
+| 23 + (4*ChunkCount) |              | u8[]         | Data                                        |
+### ChunkCount
+Chunk count is the number of compressed image chunks. For every quantization type except `Vector3D`, the chunk count equals the plane count (Image size Z), because each plane is compressed separately.
+When `Vector3D` quantization is used, voxel layers are compressed instead of individual planes. A voxel layer is a stack of exactly `Vector size Z` planes (the voxel depth).
+```java
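+// Vector3D: one chunk per voxel layer; every other type: one chunk per plane.
+// Note: integer division truncates when imageSizeZ is not a multiple of vectorSizeZ.
+final int chunkCount = (quantizationType != QuantizationType.Vector3D)
+                        ? imageSizeZ
+                        : imageSizeZ / vectorSizeZ;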
+```
 ### Note on data sector:
 The data sector consists of codebook data and indices data. If the file uses a single codebook, it is located at the start of the data sector. After the codebook is read, only image data remains to be decoded.
 Otherwise (one codebook per plane), each plane's codebook data is followed by that plane's indices, then by the next plane's codebook, and so on.
+## QCMP File Format V2
+The second iteration of the QCMP format addresses the shortcomings of the first: hyperstack dimensions (channels and timepoints) are supported, and metadata can be stored in the compressed file. The header also reserves 19 bytes for future extensions.
+| Offset                   | Size         | Type         | Note                                        |
+| ------------------------ | ------------ | ------------ | ------------------------------------------- |
+| 0                        | 8            | ASCII String | `QCMPFLV2` Magic value                      |
+| 8                        | 1            | u8           | Quantization type                           |
+| 9                        | 1            | u8           | Bits Per Pixel (BPP), 2^BPP = Codebook size |
+| 10                       | 1            | u8           | Codebook type                               |
+| 11                       | 2            | u16          | Image size X                                |
+| 13                       | 2            | u16          | Image size Y                                |
+| 15                       | 2            | u16          | Image size Z                                |
+| 17                       | 2            | u16          | Channel count                               |
+| 19                       | 2            | u16          | Number of timepoints                        |
+| 21                       | 2            | u16          | Vector size X                               |
+| 23                       | 2            | u16          | Vector size Y                               |
+| 25                       | 2            | u16          | Vector size Z                               |
+| 27                       | 4            | u32          | MS=Metadata size                            |
+| 31                       | 19           | u8[]         | Reserved bytes                              |
+| 50                       | MS           | u8[]         | Metadata                                    |
+| 50 + MS                  | 4*ChunkCount | u32[]        | Size of compressed image chunks             |
+| 50 + MS + (4*ChunkCount) |              | u8[]         | Data                                        |
 ## QVC File Format V1
 The QCMP cache file (QVC) stores a trained codebook for an image file. The coder can load the cache file and encode the source image directly, without having to train the codebook first.
 In the first version, the Huffman tree is reconstructed from the absolute frequencies of codebook indices, which is not space-optimal.
-| Sector    | Offset  | Size   | Type        | Note                                                |
+| Sector     | Offset  | Size | Type         | Note                             |
-| --------- |---------|--------|-------------|-----------------------------------------------------|
+| ---------- | ------- | ---- | ------------ | -------------------------------- |
-| **Header**|         |        |             |                                                     |
+| **Header** |         |      |              |                                  |
-|           |0        |9       |ASCII String |`QCMPCACHE` Magic value                              |
+|            | 0       | 9    | ASCII String | `QCMPCACHE` Magic value          |
-|           |9        |1       |u8           |Quantization type                                    |
+|            | 9       | 1    | u8           | Quantization type                |
-|           |10       |2       |u16          |Codebook size                                        |
+|            | 10      | 2    | u16          | Codebook size                    |
-|           |12       |2       |u16          |STFN=Size of the train file name                     |
+|            | 12      | 2    | u16          | STFN=Size of the train file name |
-|           |14       |STFN    |ASCII String |Train file name                                      |
+|            | 14      | STFN | ASCII String | Train file name                  |
-|           |14+STFN  |2       |u16          |Vector size X                                        |
+|            | 14+STFN | 2    | u16          | Vector size X                    |
-|           |16+STFN  |2       |u16          |Vector size Y                                        |
+|            | 16+STFN | 2    | u16          | Vector size Y                    |
-|           |18+STFN  |2       |u16          |Vector size Z                                        |
+|            | 18+STFN | 2    | u16          | Vector size Z                    |
-| **Data**  |         |                      |                                                     |
+| **Data**   |         |      |              |                                  |
-|           |         |        |u16[]        |Quantization values                                  |
+|            |         |      | u16[]        | Quantization values              |
-|           |         |        |u64[]        |Huffman symbol frequencies                           |
+|            |         |      | u64[]        | Huffman symbol frequencies       |
 ## QVC File Format V2
 The second version of the QVC format is based on the first, and its Header sector is almost the same; only the size of the Huffman binary data was added.
 The differences are in the data sector and in the binary representation of the Huffman tree.
-| Sector    | Offset  | Size   | Type        | Note                                                |
+| Sector     | Offset  | Size | Type         | Note                                     |
-| --------- |---------|--------|-------------|-----------------------------------------------------|
+| ---------- | ------- | ---- | ------------ | ---------------------------------------- |
-| **Header**|         |        |             |                                                     |
+| **Header** |         |      |              |                                          |
-|           |0        |9       |ASCII String |`QVCFILEV2` Magic value                              |
+|            | 0       | 9    | ASCII String | `QVCFILEV2` Magic value                  |
-|           |9        |1       |u8           |Quantization type                                    |
+|            | 9       | 1    | u8           | Quantization type                        |
-|           |10       |2       |u16          |Codebook size                                        |
+|            | 10      | 2    | u16          | Codebook size                            |
-|           |12       |2       |u16          |STFN=Size of the train file name                     |
+|            | 12      | 2    | u16          | STFN=Size of the train file name         |
-|           |14       |STFN    |ASCII String |Train file name                                      |
+|            | 14      | STFN | ASCII String | Train file name                          |
-|           |14+STFN  |2       |u16          |Vector size X                                        |
+|            | 14+STFN | 2    | u16          | Vector size X                            |
-|           |16+STFN  |2       |u16          |Vector size Y                                        |
+|            | 16+STFN | 2    | u16          | Vector size Y                            |
-|           |18+STFN  |2       |u16          |Vector size Z                                        |
+|            | 18+STFN | 2    | u16          | Vector size Z                            |
-|           |20+STFN  |2       |u16          |Huffman binary data size                             |
+|            | 20+STFN | 2    | u16          | Huffman binary data size                 |
-|           |22+STFN  |10      |u8           |Reserved bytes                                       |
+|            | 22+STFN | 10   | u8[]         | Reserved bytes                           |
-| **Data**  |         |                      |                                                     |
+| **Data**   |         |      |              |                                          |
-|           |         |        |u16[]        |Quantization values                                  |
+|            |         |      | u16[]        | Quantization values                      |
-|           |         |        |u8[]         |Binary encoded Huffman tree with symbols             |
+|            |         |      | u8[]         | Binary encoded Huffman tree with symbols |
 ### Quantization type enumeration
 Type is encoded using a single byte.
@@ -88,3 +122,7 @@ Type is encoded using a single byte.
 - `1` - Vector Quantization 1D (Row vector)
 - `2` - Vector Quantization 2D (Matrix vector)
 - `3` - Vector Quantization 3D (Tensor vector)
+### Codebook type enumeration
+- `0` - Global/Universal codebook (one codebook for all planes)
+- `1` - Individual codebook (one codebook per plane)
\ No newline at end of file
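
Review note on the V2 header table added in this commit: the offsets can be checked with a small reader. The following is only a minimal sketch, not the library's reader. The class name `QcmpV2HeaderSketch`, its field names, and the big-endian byte order are assumptions; the format description does not state endianness.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Hypothetical reader for the fixed 50-byte QCMP v2 header; not the library's API. */
final class QcmpV2HeaderSketch {
    static final String MAGIC = "QCMPFLV2";
    static final int FIXED_HEADER_SIZE = 50; // metadata starts at offset 50

    final int quantizationType, bitsPerPixel, codebookType;
    final int sizeX, sizeY, sizeZ, channelCount, timepointCount;
    final int vectorX, vectorY, vectorZ;
    final long metadataSize;

    private QcmpV2HeaderSketch(ByteBuffer in) {
        quantizationType = in.get() & 0xFF;        // offset 8,  u8
        bitsPerPixel     = in.get() & 0xFF;        // offset 9,  u8
        codebookType     = in.get() & 0xFF;        // offset 10, u8
        sizeX            = in.getShort() & 0xFFFF; // offset 11, u16
        sizeY            = in.getShort() & 0xFFFF; // offset 13, u16
        sizeZ            = in.getShort() & 0xFFFF; // offset 15, u16
        channelCount     = in.getShort() & 0xFFFF; // offset 17, u16
        timepointCount   = in.getShort() & 0xFFFF; // offset 19, u16
        vectorX          = in.getShort() & 0xFFFF; // offset 21, u16
        vectorY          = in.getShort() & 0xFFFF; // offset 23, u16
        vectorZ          = in.getShort() & 0xFFFF; // offset 25, u16
        metadataSize     = in.getInt() & 0xFFFFFFFFL; // offset 27, u32 (MS)
        // Offsets 31..49 are reserved bytes and are ignored here.
    }

    static QcmpV2HeaderSketch parse(byte[] file) {
        if (file.length < FIXED_HEADER_SIZE) {
            throw new IllegalArgumentException("Truncated QCMP v2 header");
        }
        // Assumption: multi-byte fields are stored big-endian.
        ByteBuffer in = ByteBuffer.wrap(file).order(ByteOrder.BIG_ENDIAN);
        byte[] magic = new byte[MAGIC.length()];
        in.get(magic);
        if (!MAGIC.equals(new String(magic, StandardCharsets.US_ASCII))) {
            throw new IllegalArgumentException("Missing QCMPFLV2 magic value");
        }
        return new QcmpV2HeaderSketch(in);
    }

    /** Offset of the chunk-size table, i.e. 50 + MS from the header table. */
    long chunkSizeTableOffset() {
        return FIXED_HEADER_SIZE + metadataSize;
    }
}
```

The unsigned masks (`& 0xFF`, `& 0xFFFF`, `& 0xFFFFFFFFL`) matter because Java has no unsigned integer types; without them an image size above 32767 would be read as a negative number.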