Recently I've quite often been compressing huge CSV files with a lot of data redundancy. Unsatisfied with the ratio of compression time to compression effectiveness, I decided to Google a little and find the most promising data compression tools which are not mainstream but still provide very good results (in both time and compression ratio.. but with more attention to the *ratio* thing). So here it goes:
PAQ (paq8px) (http://en.wikipedia.org/wiki/PAQ)
It's experimental compression software, still in alpha release. It uses a series of compression algorithms. Its main drawback is the time needed to compress something. But hey, it has won the Hutter Prize (http://en.wikipedia.org/wiki/Hutter_Prize), a contest for compression programs, many times.
7-Zip (http://en.wikipedia.org/wiki/7-Zip)
This is a classic one, and was my choice for a long time. It has a GUI for Windows and supports utilizing multiple processor cores at once. 7-Zip uses LZMA. The Lempel–Ziv–Markov chain algorithm (LZMA) is an algorithm used to perform lossless data compression. It has been under development since at least 1996 and was first used in the 7z format of the 7-Zip archiver in 2001. This algorithm uses a dictionary compression scheme somewhat similar to the LZ77 algorithm published by Abraham Lempel and Jacob Ziv in 1977, and features a high compression ratio (generally higher than bzip2) and a variable compression-dictionary size (up to 4 GB), while still maintaining decompression speed similar to other commonly used compression algorithms.
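LZMA is also available outside the 7-Zip program itself: Python's standard `lzma` module wraps the same algorithm family (via liblzma/XZ). As a minimal sketch, here is how redundant CSV-style data of the kind described above compresses with it; the sample data and the preset level are just illustrative, not a benchmark:

```python
import lzma

# Highly redundant CSV-like data, similar in spirit to the files described above
rows = "id,value,status\n" + "".join(f"{i},42,OK\n" for i in range(10000))
raw = rows.encode("utf-8")

# preset trades speed for ratio (0 = fastest, 9 = best compression);
# higher presets also use a larger dictionary
compressed = lzma.compress(raw, preset=9)

print(f"{len(raw)} bytes -> {len(compressed)} bytes")

# Round-trip check: decompression restores the original bytes
assert lzma.decompress(compressed) == raw
```

On repetitive data like this the ratio is dramatic, which matches why dictionary-based compressors do so well on redundant CSVs.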