RHCSA Series: How to Perform File and Directory Management – Part 2

Archiving, Compressing, Unpacking, and Uncompressing Files

If you need to transport, back up, or email a group of files, you will use an archiving (or grouping) tool such as tar, typically together with a compression utility like gzip, bzip2, or xz.

Your choice of compression tool will likely be determined by each one's compression speed and ratio. Of these three compression tools, gzip is the oldest and provides the least compression, bzip2 provides improved compression, and xz is the newest and generally provides the best compression. Typically, files compressed with these utilities have .gz, .bz2, or .xz extensions, respectively.

Command                                Abbreviation   Description
--create                               c              Creates a tar archive
--concatenate                          A              Appends tar files to an archive
--append                               r              Appends non-tar files to an archive
--update                               u              Appends files that are newer than those in an archive
--diff or --compare                    d              Compares an archive to files on disk
--list                                 t              Lists the contents of a tarball
--extract or --get                     x              Extracts files from an archive

Operation modifier                     Abbreviation   Description
--directory dir                        C              Changes to directory dir before performing operations
--same-permissions and --same-owner    p              Preserves permissions and ownership information, respectively (-p covers --same-permissions only; --same-owner has no short form)
--verbose                              v              Lists all files as they are read or extracted; if combined with --list, it also displays file sizes, ownership, and timestamps
--exclude file                         (none)         Excludes file from the archive. In this case, file can be an actual file or a pattern.
--gzip or --gunzip                     z              Compresses an archive through gzip
--bzip2                                j              Compresses an archive through bzip2
--xz                                   J              Compresses an archive through xz
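
To make a few of the options above concrete, here is a minimal sketch that archives a scratch directory, lists the result, and extracts it elsewhere. The directory and file names (src, logs, restore, notes.tmp) are made up for illustration, and everything happens under a temporary directory created with mktemp, so no root privileges are assumed.

```shell
#!/bin/sh
# Sketch: create, list, and extract a tarball in a scratch directory.
# All names below are hypothetical sample data.
set -eu
workdir=$(mktemp -d)
mkdir -p "$workdir/src/logs" "$workdir/restore"
echo "access entry" > "$workdir/src/logs/access_log"
echo "error entry"  > "$workdir/src/logs/error_log"
echo "scratch"      > "$workdir/src/logs/notes.tmp"

# c = create, z = gzip, f = archive name.
# --exclude skips anything matching the pattern.
# -C changes to src first, so members are stored as logs/... (no leading path).
tar -czf "$workdir/logs.tar.gz" --exclude='*.tmp' -C "$workdir/src" logs

# t = list the members; adding v would also show sizes, owners, and timestamps.
listing=$(tar -tzf "$workdir/logs.tar.gz")
echo "$listing"

# x = extract; -C chooses the directory where the files land.
tar -xzf "$workdir/logs.tar.gz" -C "$workdir/restore"
restored=$(cat "$workdir/restore/logs/access_log")
echo "restored content: $restored"

rm -rf "$workdir"
```

Using -C at creation time keeps absolute paths out of the archive, which makes it easier to restore the files to a different location later.
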
Example 5: Creating a tarball and then compressing it using the three compression utilities

You may want to compare the effectiveness of each tool before deciding on one. Note that with small files, or only a few files, the results may not differ much, but they can still give you a glimpse of what each tool has to offer.

# tar cf ApacheLogs-$(date +%Y%m%d).tar /var/log/httpd/*        # Create an ordinary tarball
# tar czf ApacheLogs-$(date +%Y%m%d).tar.gz /var/log/httpd/*    # Create a tarball and compress with gzip
# tar cjf ApacheLogs-$(date +%Y%m%d).tar.bz2 /var/log/httpd/*   # Create a tarball and compress with bzip2
# tar cJf ApacheLogs-$(date +%Y%m%d).tar.xz /var/log/httpd/*    # Create a tarball and compress with xz

tar command examples
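
If /var/log/httpd is not available on your system, the same comparison can be tried on generated data. The following is only a sketch: the sample log lines, the directory layout, and the helper function compare are all invented for this illustration, and any compressor that is not installed is simply skipped.

```shell
#!/bin/sh
# Sketch: compare tarball sizes for gzip, bzip2, and xz on generated sample data.
# All file names are hypothetical; point tar at your own directory to compare real data.
set -eu
workdir=$(mktemp -d)
mkdir "$workdir/logs"

# Generate repetitive, log-like text so that compression has something to work with.
i=0
while [ "$i" -lt 5000 ]; do
    echo "127.0.0.1 - - \"GET /index.html HTTP/1.1\" 200 $i"
    i=$((i + 1))
done > "$workdir/logs/access_log"

tar -cf "$workdir/logs.tar" -C "$workdir" logs
plain_size=$(wc -c < "$workdir/logs.tar" | tr -d ' ')
echo "uncompressed tar: $plain_size bytes"

all_smaller=yes
# compare FLAG TOOL EXT: build a compressed tarball if TOOL is installed, print its size.
compare() {
    if command -v "$2" >/dev/null 2>&1; then
        tar -c"$1"f "$workdir/logs.tar.$3" -C "$workdir" logs
        size=$(wc -c < "$workdir/logs.tar.$3" | tr -d ' ')
        echo "$2: $size bytes"
        [ "$size" -lt "$plain_size" ] || all_smaller=no
    else
        echo "$2: not installed, skipped"
    fi
}
compare z gzip gz
compare j bzip2 bz2
compare J xz xz

rm -rf "$workdir"
```

The exact numbers depend on the data, so treat the output as a rough guide rather than a benchmark.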

Example 6: Preserving original permissions and ownership while archiving

If you are creating backups from users’ home directories, you will want to store the individual files with the original permissions and ownership instead of changing them to that of the user account or daemon performing the backup. The following example preserves these attributes while backing up the contents of the /var/log/httpd directory:

# tar cJf ApacheLogs-$(date +%Y%m%d).tar.xz /var/log/httpd/* --same-permissions --same-owner
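
The effect of preserving permissions is easiest to see at extraction time. The sketch below uses made-up names (src, plain, preserved, shared.txt) and sets the umask to 022 explicitly. It extracts the same archive with and without p and compares the resulting modes. Ownership is not covered here, since restoring another user's ownership requires root.

```shell
#!/bin/sh
# Sketch: compare extraction with and without the p (--same-permissions) flag.
# Directory and file names are hypothetical.
set -eu
umask 022
workdir=$(mktemp -d)
mkdir "$workdir/src" "$workdir/plain" "$workdir/preserved"
echo "data" > "$workdir/src/shared.txt"
chmod 666 "$workdir/src/shared.txt"

tar -cf "$workdir/backup.tar" -C "$workdir/src" shared.txt

# tv lists the mode and owner that were recorded in the archive.
tar -tvf "$workdir/backup.tar"

# Without p, a regular user gets the stored mode filtered by the umask;
# with p, the stored mode is applied as-is.
# (When run as root, tar preserves permissions by default, so both copies match.)
tar -xf  "$workdir/backup.tar" -C "$workdir/plain"
tar -xpf "$workdir/backup.tar" -C "$workdir/preserved"

plain_mode=$(ls -l "$workdir/plain/shared.txt" | cut -c1-10)
kept_mode=$(ls -l "$workdir/preserved/shared.txt" | cut -c1-10)
echo "extracted without p: $plain_mode"
echo "extracted with p:    $kept_mode"

rm -rf "$workdir"
```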

Create Hard and Soft Links

In Linux, there are two types of links to files: hard links and soft (also known as symbolic) links. A hard link is another name for an existing file: it shares the file's inode and therefore points directly to the actual data. A symbolic link, by contrast, points to a filename.

In addition, hard links do not take up extra disk space beyond a directory entry, while symbolic links take a small amount of space to store the text of the link itself. The downside of hard links is that they can only reference files within the same filesystem, because inode numbers are unique only within a single filesystem. Symbolic links save the day: they point to another file or directory by name rather than by inode, and can therefore cross filesystem boundaries.

The basic syntax to create links is similar in both cases:

# ln TARGET LINK_NAME               # Hard link named LINK_NAME to file named TARGET
# ln -s TARGET LINK_NAME            # Soft link named LINK_NAME to file named TARGET
Example 7: Creating hard and soft links

There is no better way to visualize the relation between a file and a hard or symbolic link that points to it than to create those links. In the following screenshot you will see that the file and the hard link that points to it share the same inode, and both report the same size of 466 bytes.

On the other hand, creating a soft link results in an extra disk usage of 5 bytes. Not that you’re going to run out of storage capacity, but this example is enough to illustrate the difference between a hard link and a soft link.

Difference between a hard link and a soft link
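
The comparison shown in the screenshot can be reproduced with a short experiment. This is a sketch with invented file names (original.txt, hard.txt, soft.txt) in a temporary directory; your inode numbers and sizes will differ from the ones in the article.

```shell
#!/bin/sh
# Sketch: compare a file, a hard link to it, and a soft link to it.
# File names are hypothetical sample data.
set -eu
workdir=$(mktemp -d)
echo "some content" > "$workdir/original.txt"
ln    "$workdir/original.txt" "$workdir/hard.txt"   # hard link
ln -s original.txt            "$workdir/soft.txt"   # soft link (target is relative to the link)

# -i adds the inode number as the first column, -l shows link count and size.
ls -li "$workdir"

inode_orig=$(ls -li "$workdir/original.txt" | awk '{print $1}')
inode_hard=$(ls -li "$workdir/hard.txt" | awk '{print $1}')
inode_soft=$(ls -li "$workdir/soft.txt" | awk '{print $1}')
link_count=$(ls -l "$workdir/original.txt" | awk '{print $2}')

# Remove the original name: the data stays reachable through the hard link,
# while the soft link now points to a name that no longer exists.
rm "$workdir/original.txt"
hard_content=$(cat "$workdir/hard.txt")
if [ -e "$workdir/soft.txt" ]; then soft_state=valid; else soft_state=dangling; fi
echo "hard link still reads: $hard_content"
echo "soft link is now: $soft_state"

rm -rf "$workdir"
```

In the listing, the file and its hard link show the same inode number and a link count of 2, while the soft link has its own inode and a size that typically equals the length of the target name.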

A typical usage of symbolic links is to reference a versioned file in a Linux system. Suppose there are several programs that need access to file fooX.Y, which is subject to frequent version updates (think of a library, for example). Instead of updating every single reference to fooX.Y every time there’s a version update, it is wiser, safer, and faster, to have programs look to a symbolic link named just foo, which in turn points to the actual fooX.Y.

Thus, when X and Y change, you only need to edit the symbolic link foo with a new destination name instead of tracking every usage of the destination file and updating it.
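
The versioned-file scenario can be sketched as follows. The names foo1.0 and foo1.1 stand in for the fooX.Y files described above and are purely illustrative. Using ln -sfn to repoint the link is one common approach, not the only one.

```shell
#!/bin/sh
# Sketch: repoint a symbolic link when a new version of a file arrives.
# foo1.0 and foo1.1 are hypothetical stand-ins for fooX.Y.
set -eu
workdir=$(mktemp -d)
echo "version 1.0" > "$workdir/foo1.0"
echo "version 1.1" > "$workdir/foo1.1"

# Programs refer to "foo"; the link decides which version they get.
ln -s foo1.0 "$workdir/foo"
before=$(cat "$workdir/foo")

# -f replaces the existing link; -n keeps ln from following the old link
# if it happens to point to a directory.
ln -sfn foo1.1 "$workdir/foo"
after=$(cat "$workdir/foo")
target=$(readlink "$workdir/foo")

echo "before update: $before"
echo "after update:  $after (foo -> $target)"

rm -rf "$workdir"
```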

Summary

In this article we have reviewed some essential file and directory management skills that belong in every system administrator’s toolset. Make sure to review the other parts of this series as well in order to integrate these topics with the content covered in this tutorial.

Feel free to let us know if you have any questions or comments. We are always more than glad to hear from our readers.

Gabriel Cánepa

Gabriel Cánepa is a GNU/Linux sysadmin and web developer from Villa Mercedes, San Luis, Argentina. He works for a worldwide leading consumer product company and takes great pleasure in using FOSS tools to increase productivity in all areas of his daily work.

20 Responses

  1. Anurag says:

    I have not understand why we used second grep in below example.if first grep is capable of providing details of all apache process , what is use of second one ?

    # ps -ef | grep apache | grep -v grep.

    • Gabriel A. Cánepa says:

      Compare the output without the second pipeline and the second grep and you’ll find your answer :). Hint: also check the purpose of the -v option of grep.

  2. Neeraj Rawat says:

    Hello Gabriel,

    Nice article, also correct the typo: Note that we use to(2) pipelines in the following example > Note that we use two(2) pipelines in the following example.

    • Gabriel A. Cánepa says:

      There is no typo. Compare the output without the second pipeline and the second grep and you’ll understand why I used two greps and two pipelines :). Hint: also check the purpose of the -v option of grep.

  3. Faisal says:

    thank u for the demonstration. I got ask u a question i.e i had a file 200M and created a hardlink for this file the actual disk space will 400M for both files or 200M.since u run ls -li will show both file point to same inode and they had the same size

    • @Faisal,
      The process of creating a hard link does NOT duplicate the file. So if you have a 200 MB file and create a hard link to it, the hard link itself will not occupy another extra 200 MB. One way you can check this is by doing
      du -sch /full/path/to/directory/*
      where /full/path/to/directory/ is the directory where you have the 200 MB file. Do that and then delete the hard link, then try again. You will not see a difference in the disk usage.

  4. Eduardo Ramos says:

    In the ‘hard & soft link’, maybe the phrase ‘On the other hand, creating a hard link results in an extra disk usage of 5 bytes.’ should be ‘On the other hand, creating a soft link results in an extra disk usage of 5 bytes.’, isn’t it?

  5. Cyrille La says:

    While ‘&> ‘ works in récent Bash,
    ‘2>&1 ‘ Will redirect stdout “to where stderr is” on close to every shell around.

  6. Adam B. says:

    Small mistake, in tar compression flags you have commit (should be uppercase J instead lowercase j for xz compression ).

    PS.
    Thanks for that two articles, for me is some kind of reapet …

  7. Henrik says:

    Thanks for share this content!

  8. Tomas says:

    This bit “2>>” should not be “Appends standard error to a file”?

  9. ahmedhsn says:

    Nice guide , I would like to know how I can get RHEL copy and how much is the price for it, please
