Archiving and compression in Linux
29 November, 2002 09:05
Archiving and compressing files is a very common task. Whether it is for backup purposes or exchanging files over the Internet, being able to handle the various archive and compression formats available is an important skill. Linux has tools to access all common archive and compression file formats.
Windows formats
Zip is one of the most widely used compression formats on any platform. The format was created by PKWare in the late 1980s, is now supported by hundreds of programs, and is used primarily to distribute software over the Internet. On Linux, Zip files are usually handled with Info-Zip (www.info-zip.org), which includes tools to archive, compress and decompress files.
To create a Zip archive containing several files, use the following command:
$ zip archive.zip [files to zip separated by spaces]
To decompress a zip archive, use the following command:
$ unzip archive.zip
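As a rough sketch of how the two commands fit together, the script below builds a small archive and unpacks it into a separate directory. The file and directory names are made up for illustration, and the ‘-q’ (quiet), ‘-l’ (list) and ‘-d’ (destination directory) switches are Info-Zip options that go beyond the basic commands shown above. The script assumes the zip and unzip packages are installed and does nothing otherwise.

```shell
#!/bin/sh
# Sketch only: the file and directory names are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "first file"  > notes.txt
echo "second file" > todo.txt

if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
    # Create archive.zip from the two files (-q suppresses progress output).
    zip -q archive.zip notes.txt todo.txt

    # List the archive contents without extracting anything.
    unzip -l archive.zip

    # Extract into a separate directory (-d) so the originals are not overwritten.
    unzip -q archive.zip -d restored
    ls restored
else
    echo "zip and unzip are not installed; nothing to demonstrate"
fi
```

Extracting into a separate directory is a cautious habit rather than a requirement: a plain unzip archive.zip unpacks into the current directory.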
RAR is a very popular format for distributing large files on the Internet because it supports multi-volume archiving. This allows a single large archive to be broken into several RAR files of an identical size, commonly referred to as ‘disks’. This makes it much easier to transfer the archive using a relatively slow Internet connection such as a modem.
RAR files are created in sequential order, commencing with the suffix .rar for the first file, followed by .r00, .r01, etc. for each subsequent archive.
To create a multi-volume RAR archive, use the following command:
$ rar a -v
The ‘a’ command tells RAR to create an archive. The ‘-v’ switch splits the archive into disks of
To decompress a RAR archive, place all of the disks in the current directory and type:
$ rar x archive.rar
The ‘x’ command tells RAR to extract the archive with full paths intact. RAR will automatically decompress the disks in the correct order.
UNIX formats
Under UNIX, archiving and compression are handled by separate programs.
Tape ARchiver, or tar, is the standard archiving tool for UNIX and was originally designed to be used with tape backups. The tar program takes a list of files and combines them into a single file. No compression is applied, so the archive takes up approximately the same amount of space as the original files.
To create a tar archive, use the following command:
$ tar cvf archive.tar [files to tar separated by spaces]
The ‘c’ switch tells tar to create a new archive. The ‘v’ switch turns on verbose mode, which lists each file as it is processed and makes diagnosing errors easier. The ‘f’ switch specifies that the archive filename follows.
To extract a tar archive into the original list of files, use the following command:
$ tar xvf archive.tar
The ‘x’ switch tells tar to extract files from an archive. The other switches have the same functionality as above.
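The create and extract steps can be tried end to end with a short script such as the one below. It is only a sketch: the directory and file names are invented for the example, and it uses two switches not described above, ‘t’ (list the contents of an archive) and ‘-C’ (change to a directory before extracting), both of which are available in GNU tar.

```shell
#!/bin/sh
# Sketch only: the file and directory names are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir project
echo "alpha" > project/a.txt
echo "beta"  > project/b.txt

# c = create, v = verbose, f = archive filename follows.
tar cvf archive.tar project

# t = list the contents of the archive without extracting.
tar tf archive.tar

# x = extract; -C changes to the target directory before unpacking.
mkdir restored
tar xvf archive.tar -C restored

# Compare the restored tree with the original; diff prints nothing if they match.
diff -r project restored/project && echo "round trip OK"
```

Listing an archive with ‘t’ before extracting it is a convenient way to see which paths it will create.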
GZip is a popular compression tool, and files compressed this way end in the suffix .gz. It is most commonly applied to tar archives, resulting in files ending with the suffix .tar.gz.
To create a GZip compressed archive, first create a tar archive of the files you wish to compress and then use the following command, which replaces archive.tar with the compressed file archive.tar.gz:
$ gzip archive.tar
To extract a GZip compressed file, use the following command:
$ gunzip archive.tar.gz
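One detail that is easy to miss is that gzip and gunzip work in place: each command replaces its input file with the converted one. The following sketch makes that visible on a throwaway sample file. The file names and the repeated sample text are assumptions made for the example, and the sizes printed will depend on the data.

```shell
#!/bin/sh
# Sketch only: the file names and sample data are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a highly compressible sample file: 2000 identical lines of text.
i=0
while [ "$i" -lt 2000 ]; do
    echo "the same line of text, repeated many times"
    i=$((i + 1))
done > sample.txt

tar cf archive.tar sample.txt
tar_size=$(wc -c < archive.tar)

# gzip replaces archive.tar with archive.tar.gz.
gzip archive.tar
gz_size=$(wc -c < archive.tar.gz)
echo "tar: $tar_size bytes, tar.gz: $gz_size bytes"

# gunzip reverses the step: archive.tar.gz is replaced by archive.tar.
gunzip archive.tar.gz
ls
```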
You can also take advantage of UNIX pipes to use gunzip and tar together:
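$ gunzip -c archive.tar.gz | tar xvf -
The ‘-c’ switch makes gunzip write the decompressed archive to standard output instead of replacing archive.tar.gz. The pipe passes that output to tar, where ‘f -’ tells tar to read the archive from standard input rather than from its default device, and tar then extracts the individual files.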
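Pipes work in both directions, so a compressed archive can be created as well as extracted without an intermediate .tar file. The sketch below shows one way to do this. The directory and file names are hypothetical, ‘f -’ names standard input or output as the archive, and ‘-C’ is a GNU tar switch that changes directory before extracting.

```shell
#!/bin/sh
# Sketch only: the file and directory names are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir docs
echo "chapter one" > docs/ch1.txt
echo "chapter two" > docs/ch2.txt

# Create: tar writes the archive to standard output ("f -"),
# and gzip compresses that stream into archive.tar.gz.
tar cf - docs | gzip > archive.tar.gz

# Extract: gunzip -c writes the decompressed archive to standard output,
# and tar reads it from standard input ("f -").
mkdir restored
gunzip -c archive.tar.gz | tar xvf - -C restored

# archive.tar.gz is still present because -c leaves the input file alone.
ls archive.tar.gz restored/docs
```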
BZip2 usually achieves better compression than GZip but needs considerably more CPU time to do so. BZip2 is best used to compress large files where the space savings are worth the extra time taken to compress the file.
To create a BZip2 compressed archive, first create a tar archive of the files you wish to compress and then use the following command, which replaces archive.tar with archive.tar.bz2:
$ bzip2 archive.tar
To extract a BZip2 compressed archive, use the following command:
$ bunzip2 archive.tar.bz2
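You can also take advantage of UNIX pipes to use bunzip2 and tar together:
$ bunzip2 -c archive.tar.bz2 | tar xvf -
As with gunzip, the ‘-c’ switch writes the decompressed archive to standard output, and ‘f -’ tells tar to read the archive from standard input.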
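To judge whether BZip2 is worth the extra time for a particular set of files, it can help to compress the same tar archive both ways and compare the results. The script below is a minimal sketch of that comparison. The sample data and file names are invented, the sizes it prints depend entirely on the input, and it assumes bzip2 is installed (it does nothing otherwise).

```shell
#!/bin/sh
# Sketch only: the file names and sample data are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Generate some sample text and archive it.
i=0
while [ "$i" -lt 5000 ]; do
    echo "line $i of a log file used as sample data"
    i=$((i + 1))
done > sample.log
tar cf archive.tar sample.log

if command -v bzip2 >/dev/null 2>&1; then
    # -c writes to standard output, so archive.tar itself is kept for both runs.
    gzip  -c archive.tar > archive.tar.gz
    bzip2 -c archive.tar > archive.tar.bz2

    # Compare the three sizes side by side.
    ls -l archive.tar archive.tar.gz archive.tar.bz2

    # Round trip through the pipe form to confirm the data survives intact.
    mkdir restored
    bunzip2 -c archive.tar.bz2 | tar xvf - -C restored
    cmp sample.log restored/sample.log && echo "contents match"
else
    echo "bzip2 is not installed; nothing to demonstrate"
fi
```

Prefixing the gzip and bzip2 lines with the shell's time command is a simple way to see the CPU cost alongside the space saved.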


