From 21ab7bb2733291e7e2cee1d315b682d3cc1b106a Mon Sep 17 00:00:00 2001
From: Vojtech Moravec <vojtech.moravec.st@vsb.cz>
Date: Wed, 2 Dec 2020 10:35:24 +0100
Subject: [PATCH] Create V2 of QvcFile format.

---
 FileFormat.md | 67 ++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 53 insertions(+), 14 deletions(-)

diff --git a/FileFormat.md b/FileFormat.md
index f7143ce..26a6d8e 100644
--- a/FileFormat.md
+++ b/FileFormat.md
@@ -1,6 +1,16 @@
 # File formats used by QcmpCompressionLibrary
 
+QCMP (Quantization compressed) file formats:
+- [Version 1](#first-version-of-qcmp-file-header)
+
+QVC (Quantization Value Cache) file formats:
+- [Version 1](#qvc-file-format-v1)
+- [Version 2](#qvc-file-format-v2)
+
+Enumerations shared by all formats:
+- [Quantization type enumeration](#quantization-type-enumeration)
+
 ## First version of QCMP file header
 
 ### Pros:
@@ -31,20 +41,49 @@ Data sector consists of codebook data and indices data. If the file uses a singl
 Otherwise (codebook per plane), there are always codebook data followed by the plane indices followed by another plane codebook and so on.
 
-## First version of QCMP cache file header
-QCMP cache file is used to store trained codebook for image file. The coder can load the cache file and encode the source image directly without needing to learn the codebook.
+## QVC File Format V1
+The QCMP cache file (QVC) stores a trained codebook for an image file. The coder can load the cache file and encode the source image directly without needing to learn the codebook. In the first version, the Huffman tree is reconstructed from the absolute frequencies of the codebook indices, which is not space-optimal.
-| Offset | Size | Type | Possible values | Note |
-|---------|--------|-------------|----------------------------------------|-----------------------------------------------------|
-|0 |9 |ASCII String |QCMPCACHE |Magic value |
-|9 |1 |u8 |0 = SQ, 1 = VQ 1D, 2 = VQ 2D, 3 = VQ 3D |Quantization type |
-|10 |2 |u16 |1 – 65535 |Codebook size |
-|12 |2 |u16 | |STFN=Size of the train file name |
-|14 |STFN |ASCII String |0 - 65535 |Train file name |
-|14+STFN |2 |u16 |0 - 65535 |Vector size X |
-|16+STFN |2 |u16 |0 - 65535 |Vector size Y |
-|18+STFN |2 |u16 |0 - 65535 |Vector size Z |
-| | |u16[] | |Quantization values |
-| | |u64[] | |Huffman symbol frequencies |
+| Sector | Offset | Size | Type | Note |
+| --------- |---------|--------|-------------|-----------------------------------------------------|
+| **Header**| | | | |
+| |0 |9 |ASCII String |`QCMPCACHE` Magic value |
+| |9 |1 |u8 |Quantization type |
+| |10 |2 |u16 |Codebook size |
+| |12 |2 |u16 |STFN=Size of the train file name |
+| |14 |STFN |ASCII String |Train file name |
+| |14+STFN |2 |u16 |Vector size X |
+| |16+STFN |2 |u16 |Vector size Y |
+| |18+STFN |2 |u16 |Vector size Z |
+| **Data** | | | | |
+| | | |u16[] |Quantization values |
+| | | |u64[] |Huffman symbol frequencies |
+
+## QVC File Format V2
+The second version of the QVC format is based on the first version. The Header sector is almost the same: the magic value changes, and the size of the Huffman binary data and 10 reserved bytes are appended.
+The main difference is in the Data sector, which stores a binary encoding of the Huffman tree instead of the symbol frequencies.
+| Sector | Offset | Size | Type | Note |
+| --------- |---------|--------|-------------|-----------------------------------------------------|
+| **Header**| | | | |
+| |0 |9 |ASCII String |`QVCFILEV2` Magic value |
+| |9 |1 |u8 |Quantization type |
+| |10 |2 |u16 |Codebook size |
+| |12 |2 |u16 |STFN=Size of the train file name |
+| |14 |STFN |ASCII String |Train file name |
+| |14+STFN |2 |u16 |Vector size X |
+| |16+STFN |2 |u16 |Vector size Y |
+| |18+STFN |2 |u16 |Vector size Z |
+| |20+STFN |2 |u16 |Huffman binary data size |
+| |22+STFN |10 |u8[] |Reserved bytes |
+| **Data** | | | | |
+| | | |u16[] |Quantization values |
+| | | |u8[] |Binary encoded Huffman tree with symbols |
+
+### Quantization type enumeration
+Type is encoded using a single byte.
+- `0` - Scalar Quantization
+- `1` - Vector Quantization 1D (Row vector)
+- `2` - Vector Quantization 2D (Matrix vector)
+- `3` - Vector Quantization 3D (Tensor vector)
\ No newline at end of file
-- 
GitLab
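
Review note: the V2 header layout in the table above can be read with a few fixed-offset unpacks. The following is a minimal sketch, not code from QcmpCompressionLibrary. The format description does not state a byte order, so big-endian is an assumption here (exposed as a parameter), and `parse_qvc_v2_header` and `QvcHeaderV2` are hypothetical names chosen for illustration. The sketch stops at the end of the header, because the table does not give the length of the quantization values array.

```python
import struct
from dataclasses import dataclass

MAGIC_V2 = b"QVCFILEV2"   # 9-byte magic value from the V2 header table
RESERVED_BYTES = 10       # reserved bytes that close the V2 header


@dataclass
class QvcHeaderV2:
    """Hypothetical container for the V2 header fields."""
    quantization_type: int   # 0 = SQ, 1 = VQ 1D, 2 = VQ 2D, 3 = VQ 3D
    codebook_size: int
    train_file_name: str
    vector_size: tuple       # (X, Y, Z)
    huffman_data_size: int
    data_offset: int         # offset of the first byte of the Data sector


def parse_qvc_v2_header(buf: bytes, byte_order: str = ">") -> QvcHeaderV2:
    """Parse the V2 header; byte_order '>' (big-endian) is an assumption."""
    if buf[:9] != MAGIC_V2:
        raise ValueError("not a QVC V2 file")
    # Offsets 9, 10, 12: quantization type (u8), codebook size (u16), STFN (u16).
    q_type, codebook_size, stfn = struct.unpack_from(byte_order + "BHH", buf, 9)
    # Offset 14: train file name, STFN bytes of ASCII.
    name = buf[14:14 + stfn].decode("ascii")
    # Offsets 14+STFN .. 20+STFN: vector sizes X, Y, Z and Huffman data size.
    vx, vy, vz, huffman_size = struct.unpack_from(
        byte_order + "HHHH", buf, 14 + stfn
    )
    # Offset 22+STFN: reserved bytes, then the Data sector begins.
    data_offset = 22 + stfn + RESERVED_BYTES
    if len(buf) < data_offset:
        raise ValueError("truncated QVC V2 header")
    return QvcHeaderV2(q_type, codebook_size, name, (vx, vy, vz),
                       huffman_size, data_offset)
```

Passing `byte_order="<"` switches the same offsets to little-endian if the writer turns out to use that order.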