Grammar induction for text compression

Craig Saperstein presented his Part II project on using grammar induction for text compression.

Grammar induction is an approach to lossless compression that has
seen recent progress. Variants are achieving 5-10% better
compression ratios on DNA sequences than gzip or bzip2 and comparable
performance on standard text data sets. I’ll give a brief overview of
the field, followed by an in-depth look at a particular grammar
induction method that I’m implementing for my Part II project.