Open source implementations of Microsoft compression algorithms
The progress of each algorithm is listed below. "RTL" refers to the native RtlCompressBuffer and RtlDecompressBuffer functions from Windows 8's ntdll.dll. Comparisons are made against the maximum compression engine of the RTL functions.
The quoted speeds were measured on Windows with a 2.6 GHz Core i7-3720QM CPU, using a build compiled with MinGW-W64/GCC v4.9.2 and profile-guided optimizations. A total of 54 files, totaling 269 MB, were compressed, and each file was processed 10 times. The files came from the following collections:
* Calgary Corpus
* Canterbury Corpus
* Canterbury Large Corpus
* Maximum Compression's single file tests
* Silesia Corpus
For comparison, the library includes a "no compression" engine and it operates at ~2000 MB/s in the same testing environment.
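As a rough illustration of how such a speed figure can be obtained, the following Python sketch times an engine over a set of buffers, repeating each buffer 10 times as in the test above. It is not the library's benchmark code: the names `no_compression` and `throughput_mb_per_s` are hypothetical, and 1 MB is assumed to mean 10^6 bytes.

```python
import time

def no_compression(data):
    """Hypothetical stand-in for a 'no compression' engine: just copy the input."""
    return bytearray(data)  # bytearray() forces a real copy of the bytes

def throughput_mb_per_s(engine, buffers, repeats=10):
    """Run `engine` over every buffer `repeats` times and return MB/s.

    Assumption: 1 MB = 1,000,000 bytes of input processed.
    """
    total_bytes = sum(len(b) for b in buffers) * repeats
    start = time.perf_counter()
    for _ in range(repeats):
        for b in buffers:
            engine(b)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    return total_bytes / elapsed / 1e6

if __name__ == "__main__":
    sample = [bytes(1_000_000), b"abc" * 100_000]
    print("copy baseline: %.0f MB/s" % throughput_mb_per_s(no_compression, sample))
```

Any callable that takes a bytes object can be passed as `engine`, so the copy baseline and a real compressor can be compared under the same loop.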
Other Microsoft compression algorithms not included (at least not yet):
* LZSS - used by COMPRESS.EXE and in HLP files, decompressor available as part of libmspack
* LZSS+Huffman - used by COMPRESS.EXE, decompressor available as part of libmspack
* MSZIP - essentially "deflate" (zlib) algorithm used in CAB files
* Quantum - used in some very rare CAB files, not mentioned in the MSDN at all, decompressor available as part of libmspack
* MS-OXRTFCP - RTF Compression Algorithm used in the Exchange server - possibly similar to LZNT1
* Delta/LZXD - delta version of LZX used in Windows Updates, decompressor available as part of libmspack
* RDC - similar to RSYNC
* LZMS - recently introduced compression format used in WIM files, available in wimlib
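To illustrate why MSZIP is described above as essentially deflate, here is a minimal Python sketch that handles a single standalone MSZIP block: the 2-byte `CK` signature followed by a raw deflate stream. The function names are hypothetical, and the sketch ignores the fact that blocks inside a real CAB file reuse the previous block's output as history.

```python
import zlib

MSZIP_SIGNATURE = b"CK"
MSZIP_MAX_BLOCK = 32768  # maximum uncompressed bytes per block

def mszip_pack_block(raw):
    """Wrap one block of data as 'CK' + raw deflate (hypothetical helper)."""
    if len(raw) > MSZIP_MAX_BLOCK:
        raise ValueError("an MSZIP block holds at most 32768 uncompressed bytes")
    comp = zlib.compressobj(9, zlib.DEFLATED, -15)  # wbits=-15: no zlib header
    return MSZIP_SIGNATURE + comp.compress(raw) + comp.flush()

def mszip_unpack_block(block):
    """Decompress one standalone block; inter-block history is not handled."""
    if block[:2] != MSZIP_SIGNATURE:
        raise ValueError("missing 'CK' block signature")
    return zlib.decompressobj(wbits=-15).decompress(block[2:])

if __name__ == "__main__":
    packed = mszip_pack_block(b"hello hello hello hello")
    print(mszip_unpack_block(packed))
```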
LZNT1 is used for NTFS file compression, the Windows 2000 hibernation file, Active Directory, File Replication Service, Windows Vista SuperFetch files, and the Windows Vista/7 bootmgr.
MSDN article [MS-XCA], which includes the algorithm and an example: https://msdn.microsoft.com/library/hh554002.aspx
Status: fully mature - no more significant changes likely
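For readers unfamiliar with the LZNT1 format, the following Python sketch shows the general shape of a decompressor. It is based on a reading of the [MS-XCA] description and is not this library's implementation. The function names are hypothetical, and error handling is minimal.

```python
import struct

def _lznt1_chunk(chunk):
    """Decompress the body of one compressed chunk (at most 4096 output bytes)."""
    out = bytearray()
    i = 0
    while i < len(chunk):
        flags = chunk[i]
        i += 1
        for bit in range(8):          # flag bits are consumed from bit 0 upward
            if i >= len(chunk):
                break
            if not (flags >> bit) & 1:
                out.append(chunk[i])  # literal byte
                i += 1
                continue
            (token,) = struct.unpack_from("<H", chunk, i)
            i += 2
            # The offset/length bit split depends on how much of the chunk
            # has been produced so far: larger positions need more offset bits.
            length_mask, offset_shift = 0x0FFF, 12
            p = len(out) - 1
            while p >= 0x10:
                p >>= 1
                length_mask >>= 1
                offset_shift -= 1
            length = (token & length_mask) + 3
            offset = (token >> offset_shift) + 1
            if offset > len(out):
                raise ValueError("match offset points before start of chunk")
            for _ in range(length):   # byte-wise copy allows overlapping matches
                out.append(out[-offset])
    return out

def lznt1_decompress(data):
    """Decompress a buffer made of LZNT1 chunks (hypothetical helper)."""
    out = bytearray()
    pos = 0
    while pos + 2 <= len(data):
        (header,) = struct.unpack_from("<H", data, pos)
        if header == 0:               # a zero header ends the stream
            break
        pos += 2
        size = (header & 0x0FFF) + 1  # number of data bytes after the header
        chunk = data[pos:pos + size]
        if len(chunk) < size:
            raise ValueError("truncated chunk")
        pos += size
        # Bit 15 set: compressed chunk; clear: the data is stored as-is.
        out += _lznt1_chunk(chunk) if header & 0x8000 else chunk
    return bytes(out)
```

For example, the 8 bytes `05 B0 08 61 62 63 03 20` form one compressed chunk: three literals (`abc`) followed by a match with offset 3 and length 6.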
Xpress (plain LZ77) is used for the hibernation file on Windows XP and newer, Directory Replication Service (LDAP/RPC/AD), Exchange Server HTML compression, Windows Update Services, and Windows CE.
MSDN article [MS-XCA], which includes compression and decompression pseudo-code along with an example: https://msdn.microsoft.com/library/hh554002.aspx
Status: working - decompression is fully mature but compression needs speed improvements and does not support streaming
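The plain Xpress (LZ77) decompression loop can be sketched in Python as follows. This follows a reading of the [MS-XCA] pseudo-code rather than this library's code. The name `xpress_decompress` is hypothetical, and as a simplification the sketch stops at the end of the input instead of taking an expected output size.

```python
import struct

def xpress_decompress(data):
    """Decompress a plain-LZ77 Xpress buffer (sketch, minimal error handling)."""
    out = bytearray()
    pos = 0
    flags = 0
    flag_count = 0
    half_byte_pos = None  # index of a length byte whose high nibble is still unused
    while True:
        if flag_count == 0:
            if pos + 4 > len(data):
                break
            (flags,) = struct.unpack_from("<I", data, pos)
            pos += 4
            flag_count = 32
        flag_count -= 1
        if not (flags >> flag_count) & 1:   # flags are consumed from bit 31 down
            if pos >= len(data):
                raise ValueError("truncated input")
            out.append(data[pos])           # literal byte
            pos += 1
            continue
        if pos == len(data):                # a match flag at end of input ends the stream
            break
        (token,) = struct.unpack_from("<H", data, pos)
        pos += 2
        length = token & 7
        offset = (token >> 3) + 1
        if length == 7:                     # length continues in a shared nibble
            if half_byte_pos is None:
                length = data[pos] & 0x0F
                half_byte_pos = pos
                pos += 1
            else:
                length = data[half_byte_pos] >> 4
                half_byte_pos = None
            if length == 15:                # ...then in a full byte
                length = data[pos]
                pos += 1
                if length == 255:           # ...then in 2 (or 4) bytes
                    (length,) = struct.unpack_from("<H", data, pos)
                    pos += 2
                    if length == 0:
                        (length,) = struct.unpack_from("<I", data, pos)
                        pos += 4
                    if length < 15 + 7:
                        raise ValueError("invalid extended match length")
                    length -= 15 + 7
                length += 15
            length += 7
        length += 3
        if offset > len(out):
            raise ValueError("match offset points before start of output")
        for _ in range(length):             # byte-wise copy allows overlapping matches
            out.append(out[-offset])
    return bytes(out)
```

For example, the input `FF FF FF 1F 61 62 63 13 00` encodes three literals followed by a match with offset 3 and length 6.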
Xpress algorithm with Huffman encoding, used in WIM files, Distributed File System Replication, Windows 7 SuperFetch, and Windows 8 bootmgr.
MSDN article [MS-XCA], which includes compression and decompression details along with an example: https://msdn.microsoft.com/library/hh554002.aspx
Additionally, a mostly complete pseudo-code decompression implementation is given at: https://msdn.microsoft.com/library/dd644740.aspx
Status: working - needs major speed improvements, does not create the optional matches that span chunk boundaries, and does not support streaming for compression or decompression
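As a small illustration of the Xpress Huffman chunk layout, the Python sketch below unpacks the 256-byte code-length table that starts each chunk and splits a decoded symbol into its parts. It reflects a reading of [MS-XCA] and covers only this first step, not Huffman decoding itself. Both function names are hypothetical.

```python
def unpack_code_lengths(table):
    """Expand the 256-byte table into 512 Huffman code lengths of 4 bits each.

    Byte i carries the length of symbol 2*i in its low nibble and the
    length of symbol 2*i + 1 in its high nibble.
    """
    if len(table) != 256:
        raise ValueError("the code-length table is exactly 256 bytes")
    lengths = []
    for b in table:
        lengths.append(b & 0x0F)
        lengths.append(b >> 4)
    return lengths

def describe_symbol(symbol):
    """Split a decoded symbol (0..511) into its meaning.

    Symbols 0..255 are literal bytes. Symbols 256..511 are matches whose
    low nibble is the base length value and whose high nibble is the
    number of offset bits that follow in the bit stream.
    """
    if not 0 <= symbol < 512:
        raise ValueError("symbol out of range")
    if symbol < 256:
        return ("literal", symbol)
    value = symbol - 256
    return ("match", value >> 4, value & 0x0F)  # (offset bits, length nibble)
```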
LZX compression is used in WIM and CAB files, with some minor differences between the two formats.
NOTE: this code is currently removed from the repository due to stability issues.
Microsoft document about the CAB LZX format: http://msdn.microsoft.com/en-us/library/bb417343.aspx#lzxdatacompressionformat
Untested against native Windows functions: LZX is not part of RTL, so it needs to be tested against CABINET.DLL (FCI/FDI) and WIMGAPI.DLL.
Status: in development - works in some cases, but fails frequently