
data compression lab:

lossless source coding


related papers (to come)


Lossless or "noiseless" source coding is data compression for applications where the data must be represented efficiently but cannot be altered or distorted in the compression process. Data types that typically require lossless coding include text and computer executables, where any change to the original data string may have catastrophic consequences. Lossless data compression is employed in a wide range of technologies, including data storage devices, modems, and fax machines.


The goal in studying lossless source coding theory is twofold: to establish the best performance that could conceivably be achieved in lossless source coding, and to determine how well current algorithms perform relative to that limit. Current work in the Data Compression Laboratory focuses on the analysis of new and existing low-complexity universal lossless codes.
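
As a rough illustration of comparing an algorithm against a theoretical limit, the sketch below draws symbols independently from a known distribution, computes the entropy in bits per symbol, and measures the rate achieved by a general-purpose compressor. Everything in it is an assumption made for illustration: the four-symbol source, the sample size, and the use of Python's standard zlib module, which is not one of the lab's codes. For a memoryless source the entropy bounds the expected rate of any lossless code; for sources with memory, the zeroth-order figure computed here is not a bound.

```python
import math
import random
import zlib
from collections import Counter


def empirical_entropy_bits_per_symbol(data: bytes) -> float:
    """Zeroth-order empirical entropy of the byte frequencies in data."""
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def compressed_bits_per_symbol(data: bytes) -> float:
    """Rate achieved by zlib at its highest compression level."""
    return 8 * len(zlib.compress(data, 9)) / len(data)


# Hypothetical memoryless source: four symbols with dyadic probabilities,
# so the true entropy is exactly 1.75 bits per symbol.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
symbols = rng.choices(b"abcd", weights=[0.5, 0.25, 0.125, 0.125], k=20000)
sample = bytes(symbols)

h0 = empirical_entropy_bits_per_symbol(sample)
rate = compressed_bits_per_symbol(sample)
print(f"empirical entropy: {h0:.3f} bits/symbol")
print(f"zlib rate:         {rate:.3f} bits/symbol")
```

The gap between the two printed numbers is one crude measure of how far a practical code sits from the limit on this particular source.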


While theoretical analyses are useful for bounding the expected performance of a variety of lossless source codes, they do not tell the full story about the algorithms considered. Work currently under way considers the design and implementation of practical universal lossless source codes and compares their performance on text data with that of competing schemes on the same data sets.
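
An empirical comparison of this kind might be set up along the following lines. This is only a sketch under stated assumptions: the three codecs are the general-purpose compressors in Python's standard library, standing in for the codes actually under study, and the inline sample text is a hypothetical placeholder for a real text data set. Each codec is run on the same input, the round trip is checked to confirm that the coding is lossless, and the rate is reported in bits per input byte.

```python
import bz2
import lzma
import zlib
from typing import Dict

# Stand-in codecs from the standard library (an assumption for illustration;
# these are not the lab's universal codes).
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def compare_codecs(data: bytes) -> Dict[str, float]:
    """Return the rate in bits per input byte for each codec on data."""
    rates = {}
    for name, (compress, decompress) in CODECS.items():
        encoded = compress(data)
        # Lossless coding must reproduce the input exactly.
        if decompress(encoded) != data:
            raise ValueError(f"{name} did not round-trip the input")
        rates[name] = 8 * len(encoded) / len(data)
    return rates


# Hypothetical placeholder text; a real comparison would read a text corpus.
text = (
    "Lossless source coding represents data efficiently "
    "without altering it in the compression process. "
) * 50
rates = compare_codecs(text.encode("utf-8"))
for name, rate in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{name:5s} {rate:.3f} bits/byte")
```

Because the placeholder text is highly repetitive, the rates it produces say little about performance on natural text; the point is only the structure of the comparison, with every scheme run on the same data.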