Tar command Compress and Extract Archives

This tutorial explains what archiving and compression mean in Linux. It then shows, through examples, the tools you can use to create and manage compressed archives.

Archive and compression

Although archiving and compression are often used together, they are different. An archive is a collection of files stored as a single file, so you can manage several files as one. Archives are mainly used for taking backups and moving files. For example, if you want to transfer 100 files from one system to another, you can create an archive of these files and move it as a single file. Tar is an archiving tool. It allows you to create and manage archives.
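As a minimal sketch of this idea, the following commands bundle three small files into one archive and then list its contents. The scratch directory and the file names are hypothetical and exist only for illustration.

```shell
# Hypothetical scratch directory; every name below is illustrative.
workdir=$(mktemp -d)
mkdir "$workdir/docs"
for i in 1 2 3; do
    echo "report $i" > "$workdir/docs/file$i.txt"
done

# c creates an archive and f names the archive file.
# -C switches to the given directory first, so members are stored
# with relative names (docs/...).
tar -cf "$workdir/docs.tar" -C "$workdir" docs

# t lists the members of the archive without extracting them.
tar -tf "$workdir/docs.tar"
```

The listing should show the docs directory and the three files, all carried inside the single file docs.tar.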

Compression is a technique that reduces file size by encoding data more efficiently. It allows you to store more files in the same disk space. For example, if a 100MB partition can hold ten uncompressed files of 10MB each, it might hold fifteen to seventeen files of the same size after compression. The actual gain depends on how well the data compresses. Gzip and bzip2 are compression tools. They allow you to compress files.
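The effect is easy to observe on a single file. The following sketch generates a repetitive text file (a hypothetical sample that compresses far better than typical data), compresses a copy with gzip, and prints both sizes.

```shell
# Hypothetical scratch directory and sample file.
workdir=$(mktemp -d)

# Generate a highly repetitive text file of roughly 1MB.
yes "the same line repeated many times" | head -n 30000 > "$workdir/sample.txt"

# gzip -c writes the compressed data to standard output,
# which leaves the original file in place.
gzip -c "$workdir/sample.txt" > "$workdir/sample.txt.gz"

# wc -c reports the size in bytes.
original=$(wc -c < "$workdir/sample.txt")
compressed=$(wc -c < "$workdir/sample.txt.gz")
echo "original: $original bytes, compressed: $compressed bytes"
```

Real-world files such as images or already compressed data usually shrink much less than this sample.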

Administrators usually compress an archive as they create it. Compression saves disk space (in case of backup) and network bandwidth (in case of file transfer).

Tar, gzip and bzip2

The tar command does not have built-in functionality to compress archives. However, it can call external compression tools such as gzip and bzip2 to compress the archive as it is written. Gzip usually compresses faster but provides a lower compression ratio. Bzip2 usually compresses slower but provides a higher compression ratio.
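One way to see that tar hands the archive to an external tool is to run the compressor yourself. The sketch below, which assumes GNU tar and uses hypothetical file names, creates the same archive twice: once with the z option and once by piping an uncompressed archive through gzip.

```shell
# Hypothetical scratch directory and sample data.
workdir=$(mktemp -d)
mkdir "$workdir/data"
echo "hello" > "$workdir/data/a.txt"
echo "world" > "$workdir/data/b.txt"

# Variant 1: tar runs gzip itself (z option).
tar -czf "$workdir/one.tar.gz" -C "$workdir" data

# Variant 2: tar writes the archive to standard output (-f -)
# and the shell pipes it through gzip.
tar -cf - -C "$workdir" data | gzip > "$workdir/two.tar.gz"

# Both files are gzip-compressed tar archives with the same members.
tar -tzf "$workdir/one.tar.gz"
tar -tzf "$workdir/two.tar.gz"
```

In both variants the compressor sees the whole archive as one stream. It does not compress the member files one by one.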

Using gzip and bzip2 to compress tar archives

The z option instructs the tar command to use the gzip utility to compress the archive. By convention, add the .gz extension to the archive file name to indicate gzip compression.

To create and compress the archive file with the bzip2 utility, use the j option and add the .bz2 extension to the archive file name. For example, the following commands archive the /home directory three times: without compression (home.tar), with gzip compression (home.tar.gz), and with bzip2 compression (home.tar.bz2).

#tar -cvf /tmp/home.tar /home
#tar -czvf /tmp/home.tar.gz /home
#tar -cjvf /tmp/home.tar.bz2 /home

After creating the archives, use the du command with the h option to compare their sizes and verify the effect of compression. The du command with the h option shows the disk space used by the specified file in human-readable units.
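The following sketch repeats the comparison on a small scratch directory instead of /home, so you can run it without root privileges. The directory, the sample log file, and the archive names are hypothetical. The bzip2 step is skipped if the bzip2 utility is not installed.

```shell
# Hypothetical scratch directory with one repetitive sample file.
workdir=$(mktemp -d)
mkdir "$workdir/src"
yes "sample log line for the archive size comparison" | head -n 50000 > "$workdir/src/app.log"

# Same source, three archives: uncompressed, gzip, and bzip2.
tar -cf  "$workdir/src.tar"     -C "$workdir" src
tar -czf "$workdir/src.tar.gz"  -C "$workdir" src
if command -v bzip2 >/dev/null 2>&1; then
    tar -cjf "$workdir/src.tar.bz2" -C "$workdir" src
fi

# du -h shows the disk space each archive occupies.
du -h "$workdir"/src.tar*

# ls -l shows the exact length of each archive in bytes.
ls -l "$workdir"/src.tar*
```

Because du reports allocated disk blocks, very small archives can show the same value (for example, one block each). In that case, the byte counts from ls -l make the difference visible.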

If you use compression while creating an archive, you cannot append, update, or delete individual files in the created archive. If you try, the command returns the following error.

tar: Cannot update compressed archives
tar: Error is not recoverable: exiting now
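The following sketch reproduces this restriction and shows one possible workaround: decompress the archive, append to it, and compress it again. It assumes GNU tar. The r option is tar's append operation, and all file names are hypothetical.

```shell
# Hypothetical scratch directory and sample files.
workdir=$(mktemp -d)
echo "first"  > "$workdir/one.txt"
echo "second" > "$workdir/two.txt"

# Create a gzip-compressed archive that contains one file.
tar -czf "$workdir/backup.tar.gz" -C "$workdir" one.txt

# Try to append (r option) a second file to the compressed archive.
# GNU tar is expected to refuse this and exit with a non-zero status.
if tar -rf "$workdir/backup.tar.gz" -C "$workdir" two.txt 2>/dev/null; then
    append_status="appended"
else
    append_status="refused"
fi
echo "append to compressed archive: $append_status"

# Workaround: decompress, append to the plain archive, compress again.
gunzip "$workdir/backup.tar.gz"                        # leaves backup.tar
tar -rf "$workdir/backup.tar" -C "$workdir" two.txt
gzip "$workdir/backup.tar"                             # recreates backup.tar.gz

# The archive now lists both files.
tar -tzf "$workdir/backup.tar.gz"
```

For large archives this workaround is slow because the whole archive is decompressed and compressed again. Recreating the archive from the source files is often simpler.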

Extracting a compressed archive file

The x option extracts an archive. The v option lists each file as it is processed. The f option specifies the archive file you want to extract. If the archive file is compressed with gzip, use the z option. Use the j option if the file is compressed with bzip2.

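Use the following syntax to extract an archive compressed with gzip. The C option specifies the directory to extract into. If you omit it, tar extracts the files into the current directory.

#tar -xzvf [archived-file-name] -C [destination-directory]

Use the following syntax to extract an archive compressed with bzip2.

#tar -xjvf [archived-file-name] -C [destination-directory]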
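As a short, hedged example of extracting into a chosen directory, the sketch below builds a small gzip-compressed archive and extracts it into a separate directory with the C option of GNU tar. The names site, restore, and index.html are hypothetical.

```shell
# Hypothetical scratch directory, source directory, and destination.
workdir=$(mktemp -d)
mkdir "$workdir/site" "$workdir/restore"
echo "<h1>demo</h1>" > "$workdir/site/index.html"

# Create a gzip-compressed archive of the site directory.
tar -czf "$workdir/site.tar.gz" -C "$workdir" site

# Extract into the restore directory.
# The directory named after -C must already exist.
tar -xzvf "$workdir/site.tar.gz" -C "$workdir/restore"

# Show the extracted tree.
ls -R "$workdir/restore"
```

Any plain arguments placed after the archive name are treated as the names of members to extract, not as a destination, which is why the destination needs the C option.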

If you do not know which utility was used to compress an archive file, do not specify the compression-related options. When you extract a compressed archive without them, tar detects the compression type and uses the appropriate tool to decompress the archive before extracting the files. For example, to extract the compressed archive data.tar.gz, you can use either of the following commands.

#tar -xvf data.tar.gz
#tar -xzvf data.tar.gz

Skipping compression-related options is best if you do not know the compression utility used to compress the archive file. It lets tar detect the compression type and use the matching tool. If you specify an incorrect option, tar uses the specified tool to decompress the archive. Since that is the wrong tool, decompression fails. For example, the following command fails, as it instructs tar to use the bzip2 utility to decompress an archive compressed with the gzip utility.

#tar -xjvf data.tar.gz
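The following sketch shows both behaviors on a throwaway archive. It assumes GNU tar, and the file and directory names are hypothetical. The last command uses gzip -t, which tests whether a file is a valid gzip stream, as one way to check the compression type yourself.

```shell
# Hypothetical scratch directory and sample file.
workdir=$(mktemp -d)
mkdir "$workdir/out"
echo "payload" > "$workdir/note.txt"
tar -czf "$workdir/data.tar.gz" -C "$workdir" note.txt

# No compression option: GNU tar detects gzip on its own.
tar -xf "$workdir/data.tar.gz" -C "$workdir/out" && echo "auto-detect: ok"

# Wrong option: j asks for bzip2, but the archive is gzip-compressed.
if tar -xjf "$workdir/data.tar.gz" -C "$workdir/out" 2>/dev/null; then
    wrong_option="succeeded"
else
    wrong_option="failed"
fi
echo "extract with the j option: $wrong_option"

# gzip -t exits with status 0 only for a valid gzip stream.
gzip -t "$workdir/data.tar.gz" && echo "data.tar.gz is gzip-compressed"
```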

Preserving SELinux context

SELinux is a security feature. It protects files from unauthorized access and modification. It saves additional information in the form of attributes for each file. There are two types of attributes: regular and extended. Regular attributes contain essential information such as the owner, access permissions, and timestamps. Extended attributes save information about additional features such as SELinux security contexts.

By default, the tar command does not retain extended attributes. However, it supports options that instruct it to preserve specific attributes. For example, the --selinux option saves SELinux contexts. Let us take an example.

  • Create two directories: reg and ext.
  • Switch to the directory /var/www/html and create some files for testing.
  • Create two archive files and add the files created in the above step. Use the regular options for the first file. Use the --selinux option with the regular options for the second file.
  • Copy both archived files to the directories (reg and ext) created in the first step.
  • Switch to both directories one by one and extract the copied archive files.
  • Use the --selinux option to extract the second archive file.

Compare the SELinux contexts of the extracted files with the original files.

The comparison verifies that tar preserves the SELinux context when the --selinux option is used.
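The steps above can be sketched as a small script. It is only an outline under several assumptions: it must run as root on a system with SELinux enabled, GNU tar must be built with SELinux support, and /var/www/html must exist. The function name selinux_tar_demo, the test file names, and the scratch directory are hypothetical. For brevity, the script writes the archives straight into the reg and ext directories instead of copying them there. If the prerequisites are missing, it prints a message and does nothing.

```shell
selinux_tar_demo() {
    # Assumed prerequisites: root, SELinux tools, and /var/www/html.
    if [ "$(id -u)" -ne 0 ] || ! command -v getenforce >/dev/null 2>&1 \
        || [ ! -d /var/www/html ]; then
        echo "skipped: needs root, SELinux, and /var/www/html"
        return 0
    fi

    # Create the two directories and some test files.
    base=$(mktemp -d)
    mkdir "$base/reg" "$base/ext"
    touch /var/www/html/tar-demo-a /var/www/html/tar-demo-b

    # First archive: regular options. Second archive: add --selinux.
    tar -cf "$base/reg/web.tar" -C /var/www/html tar-demo-a tar-demo-b
    tar --selinux -cf "$base/ext/web.tar" -C /var/www/html tar-demo-a tar-demo-b

    # Extract each archive in its own directory.
    # Use --selinux again when extracting the second archive.
    tar -xf "$base/reg/web.tar" -C "$base/reg"
    tar --selinux -xf "$base/ext/web.tar" -C "$base/ext"

    # ls -Z prints the SELinux context of each file for comparison.
    ls -Z /var/www/html/tar-demo-a "$base/reg/tar-demo-a" "$base/ext/tar-demo-a"

    # Remove the test files from the web directory.
    rm -f /var/www/html/tar-demo-a /var/www/html/tar-demo-b
}

selinux_tar_demo
```

In the ls -Z output, compare the context of the original file with the contexts of the two extracted copies.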

This tutorial is part of the series "The tar Command and its Options (c, v, f) Explained with Examples". Other parts of this series are as follows:

Chapter 1  Tar command options and syntax explained
Chapter 2  Tar command examples in Linux
Chapter 3  Tar command Compress and Extract Archives

Conclusion

The tar command is one of the most commonly used commands for creating archives. Archives allow us to manage several files and directories as a single file. This tutorial explained how to compress an archive to reduce its size and how to preserve the SELinux context while creating and extracting a tar archive file.
