The bzip2 and libbzip2 official home page


The master version of this page lives at http://sourceware.cygnus.com/bzip2/, and new stuff, mainly executables, will appear there first.

The current stable version of bzip2 is 1.0.1.

What is bzip2?

bzip2 is a freely available, patent free (see below), high-quality data compressor. It typically compresses files to within 10% to 15% of the best available techniques (the PPM family of statistical compressors), whilst being around twice as fast at compression and six times faster at decompression.

Why would I want to use it?

The code is organised as a library, with a programming interface.  The bzip2 program itself is a client of the library.  You can use the library in your own programs, to directly read and write .bz2 files, or even just to compress data in memory using the bzip2 algorithms.
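To give a feel for what using the library from your own program looks like, here is a minimal sketch. It assumes Python's standard bz2 module, which is built on libbzip2, rather than the C interface documented in the manual. The helper names roundtrip_in_memory and write_and_read_bz2 are made up for this illustration.

```python
import bz2


def roundtrip_in_memory(data: bytes) -> bytes:
    """Compress a byte string in memory, then decompress it again."""
    compressed = bz2.compress(data, compresslevel=9)
    return bz2.decompress(compressed)


def write_and_read_bz2(path: str, data: bytes) -> bytes:
    """Write data directly to a .bz2 file, then read it back."""
    with bz2.open(path, "wb") as out_file:
        out_file.write(data)
    with bz2.open(path, "rb") as in_file:
        return in_file.read()


if __name__ == "__main__":
    sample = b"hello, bzip2! " * 1000
    packed = bz2.compress(sample)
    print(len(sample), "bytes in,", len(packed), "bytes out")
    assert roundtrip_in_memory(sample) == sample
```

Files written this way should be ordinary .bz2 files, so the bzip2 program can decompress them as usual.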

Getting the latest version: bzip2-1.0.1

See below for what's new in 1.0.0.  1.0.0 is an improvement over 0.1pl2, 0.9.0 and 0.9.5, but the file format is unchanged, so the four versions should interwork fine.  1.0.1 is identical to 1.0.0, except that a couple of obscure build problems on Windows platforms have been fixed, and there are some minor documentation updates.  If you have a working 1.0.0, upgrading to 1.0.1 is not necessary.

Executables

First off, here are some executables I've collected.  Because 1.0.0 is pretty new, this list is very small.  As with previous releases, I will expand it as people donate executables for other systems.  If your system isn't listed, there may be an older version available: see the next section.  Please read the notes on executables before downloading.  You might avoid some common problems.

Libraries

There's increasing demand for the library as a DLL (Win32) or as Unix dynamic shared objects (.so's).  Here are some.  Once again, please read the notes on executables before downloading.  Linux users, you first need to find out which libc version you have, by doing 'ls /lib/*libc-*'.

Sources

Here's the source code, including full documentation.  For the paranoid, some MD5 sums:

   11fe7b9615eb84326712cb41671a7103  v01pl2/bzip2-0.1pl2.tar.gz
   29993af5282e817fafc5a76b4e0c98fa  v090/bzip2-0.9.0c.tar.gz
   8a3f6d1d9e4072bb3c7aeae6578ae6ca  v095/bzip2-0.9.5d.tar.gz
   770135dc94369cb3eb6013ed505c8dc5  v100/bzip2-1.0.1.tar.gz
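
Checking a download against one of the sums above can be sketched as follows. This uses Python's standard hashlib module. The function names and the local file name in the usage note are assumptions made for the example, not part of the distribution.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_sum(path: str, expected_hex: str) -> bool:
    """Compare a file's MD5 digest with a published hex string."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

For example, assuming the 1.0.1 tarball was saved in the current directory under its original name, matches_published_sum("bzip2-1.0.1.tar.gz", "770135dc94369cb3eb6013ed505c8dc5") should return True for an intact copy.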

If you can be bothered, please email me to say you've got a copy.   It's nice to know where this stuff gets to.

Getting an older version: bzip2-0.9.5d or bzip2-0.9.0c

Although older, these versions should work fine, unless you need large (> 2GB) file support.  Please read the notes on executables before downloading.  You might avoid some common problems.  The following, larger, collection is for 0.9.0.  If your machine isn't listed here, don't despair: bzip2 is very portable.  It should run on practically any 32-bit or 64-bit computer, provided you have enough spare memory (at least 8 megabytes).  If you have an ANSI C compiler, you have a very good chance of building a working executable from the sources with minimal difficulty.

TO USE: Rename the file you've got to plain "bzip2" (or "bzip2.exe", on Win95/98/NT/2000), and use it.

To decompress a .bz2 file, do "bzip2 -d my_file.bz2".  Remember, the one program does both compression and decompression.  To get decompression by default, copy "bzip2.exe" to "bunzip2.exe" (Win95/98/NT/2000), or symlink "bzip2" to "bunzip2" (Unix users).
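
The copy-or-symlink trick works because one program can pick its default behaviour from the name it was started under. The sketch below illustrates the idea only. The function default_mode and its matching rule are assumptions for this example, not a transcription of what bzip2 itself does.

```python
import os


def default_mode(program_name: str) -> str:
    """Pick a default action from the name the program was invoked as."""
    base = os.path.basename(program_name).lower()
    # Drop a Windows-style extension so "bunzip2.exe" and "bunzip2" agree.
    if base.endswith(".exe"):
        base = base[:-4]
    # Assumed rule: any name containing "unzip" means decompress by default.
    return "decompress" if "unzip" in base else "compress"


if __name__ == "__main__":
    for name in ("bzip2", "/usr/bin/bunzip2", "bunzip2.exe"):
        print(name, "->", default_mode(name))
```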

Some notes on executables:

Documentation

Here's the HTML version of the complete manual, unfortunately lacking the license page due to some oddity of texi2html.  And here's the postscript.

Many people have asked about Y2K issues in bzip2.  Here's a short statement.
 

What's new in 1.0.0 ?

The CHANGES file gives more details.
 

What's new in 0.9.5 ?

Not many big changes.  Mostly a slow evolution of 0.9.0 into something more robust.  Still, you should try and move to 0.9.5 as and when you can.

What's new in 0.9.0 ?

0.9.0 is the first public version since 0.1pl2.  The central feature of 0.9.0 is that the code has been completely reorganised, so that the main compress/decompress machinery is in a library.  The bzip2 program is now merely a wrapper on top of the library.  I've also incorporated various small speedups, functionality enhancements and portability things -- mostly stuff that was frequently requested in your feedback.

Note that the .bz2 file format is unchanged, so 0.9.0 is fully forwards and backwards compatible with 0.1pl2.

Specific changes:

Contributed stuff

A patch for GNU tar 1.13 so you can make it compress with bzip2.  The relevant flags are -y or --bzip2 or --bunzip2.  From Kevin Ivory and David Fetter and modified by Thomas Bucholz.  Several other people also sent patches; thank you for them.

David Fetter maintains a bzip2-HOWTO document.
 

What's your day job?

I'm an (experimental) compiler-writer by trade.  At the moment I work as a research assistant for Glasgow University, helping develop a compiler for the functional language Haskell.  The Glasgow Haskell compiler serves as a testbed for research into Haskell, and at the same time is a stable, well regarded, freely available, state of the art optimising compiler for Haskell.  It's available for most major platforms.  Perhaps you'd care to give it a spin.  We're close to releasing version 4.07 of our compiler and supporting tools.  It's open source.  Naturally.

In the more distant past, I worked for five years on parallelising compilers for functional languages at the University of Manchester, UK.   I'm a big fan of Haskell, an elegant and useful functional language.  Getting a bit bored with C?  Try doing some lazy functional programming in Haskell.  It'll change the way you think about programming.  Permanently.

I'm a member of the ACM, which I think is a fine organisation. You can reach me by email through ACM, or via a more direct route.
 

Other stuff I did: cacheprof

Memory effects have a big impact on the performance of programs -- especially bzip2.  I tried and failed to find a decent, open-source tool which would tell me exactly which lines of code produce cache misses, and in the end I wrote my own.  It's a useful performance analysis tool, and I think it totally Kicks Ass.  Your opinion may differ.  In any case, you can get it from http://www.cacheprof.org.
 

Julian Seward (jseward@acm.org).

Last updated  Friday, 23 June 2000.