# Data compression project

## CZI parser TODO list

This is a list of things that have to be done first:

- [x] Parse `SubBlockDirectory` (a collection of `DirectoryEntryDV` entries)
  - [x] An exact copy of each `DirectoryEntryDV` is located in the referenced `SubBlock`
  - [x] Parse dimension entries
  - [x] Parse IEEE 4 / 8 byte floats.
    - Parsing is done via a `memcpy` call. The alternative is a `union`, but parsing doubles wasn't working with it. Later we can take another look at this, maybe improve the conversion and get rid of the copy.
- [x] Parse `SubBlock`
  - [ ] ~~Parse image data to proper pixel type~~. Do we really want to do that? We really care only about **bytes**. Parsing image data into some `PixelType` matrix is useful only if we want to perform operations on the image. This is moved to the *later* section.
  - [x] Parse / extract image data from `SubBlock`
    - We know the position and size of the data in the CZI file. With those two pieces of information we can extract the data easily.
- [ ] Parse important information from the XML metadata (e.g. BitsPerPixel, and compare it with the value from the parsed binary data).
- [ ] Parse image values into matrices. `Matrix<Gray8>` etc.
- [ ] Support multi-file situations
  - [ ] Obtain multi-file CZI files. Currently we don't have any multi-file sets, so we can't really parse them. Secondary files should have a different GUID than the master file, and their file part should be different from 0. The files we have right now have (0) in their name, but their file part is 0 and the GUID of the master file isn't set correctly.
  - [ ] One master file and several *secondary files* (our current `CziFile` class partially supports this situation, so keep going that way)

Later on, we can extend our program to handle more things from the file, like:

- [ ] Parse metadata according to XML schemas
- [ ] Take a look at the binary reader. Can it be sped up?
  - [ ] Parse segments from a memory buffer rather than from a file stream. (*Disk bottleneck*)
- [ ] Parse image data to PixelType matrix type.

## Compression of images

- I tested [FLIF](https://flif.info/) compression on a ~200 MB file. The compression ratio was good, but the speed was really bad. Decompression wasn't better, it was slow too.
  - Even lossy compression was slow. This compression is probably only good for small images.
- **Next step is to test B3D compression.**
- Pure LZW creates its dictionary as it goes. It can create entries which are never found in the data.
  - LZ77 is a better alternative.
- **Look at Huffman encoding, then maybe arithmetic coding, and in spare time install NVIDIA drivers to test the B3D library.**

## Space filling (*Peano*) curves

- [Wikipedia](https://en.wikipedia.org/wiki/Space-filling_curve)
- Find out what this is about
- Look specifically at the [Z-order curve](https://en.wikipedia.org/wiki/Z-order_curve) and the [Hilbert curve](https://en.wikipedia.org/wiki/Hilbert_curve)

## Image difference

- Find the difference between the included images.
- Plot those differences and find out whether certain images are the same, just positioned a little bit differently
- *Pattern matching*
  - Find out whether there are patterns in the image sets.
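
As a starting point for the image difference idea, here is a minimal sketch of a per-pixel comparison. It is not existing project code: the names `DiffStats` and `imageDifference` are hypothetical, and it assumes both images are raw 8-bit grayscale buffers with identical dimensions, already extracted from their `SubBlock`s.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Summary of a per-pixel comparison of two 8-bit grayscale images.
struct DiffStats {
    std::vector<std::uint8_t> diff; // |a[i] - b[i]| for every pixel
    std::size_t changedPixels = 0;  // number of pixels with a non-zero difference
    double meanAbsDiff = 0.0;       // average absolute difference over all pixels
};

// Hypothetical helper: both buffers are assumed to hold raw Gray8 pixels
// of images with the same width and height.
DiffStats imageDifference(const std::vector<std::uint8_t>& a,
                          const std::vector<std::uint8_t>& b)
{
    assert(a.size() == b.size() && "images must have the same size");

    DiffStats stats;
    stats.diff.resize(a.size());
    std::uint64_t sum = 0;

    for (std::size_t i = 0; i < a.size(); ++i) {
        // Widen to int first so the subtraction cannot wrap around.
        const int d = std::abs(static_cast<int>(a[i]) - static_cast<int>(b[i]));
        stats.diff[i] = static_cast<std::uint8_t>(d);
        if (d != 0) {
            ++stats.changedPixels;
        }
        sum += static_cast<std::uint64_t>(d);
    }

    if (!a.empty()) {
        stats.meanAbsDiff = static_cast<double>(sum) / static_cast<double>(a.size());
    }
    return stats;
}
```

The `diff` buffer can be plotted directly as a grayscale image. A low `meanAbsDiff` combined with many changed pixels might hint that two images are the same but slightly shifted, although confirming that would need an actual alignment step that this sketch does not attempt.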