Introduction to GlusterFS (File System) and Installation on RHEL/CentOS and Fedora

We are living in a world where data is growing at an unpredictable rate, and we need to store this data, whether it is structured or unstructured, in an efficient manner. Distributed computing systems offer a wide array of advantages over centralized computing systems. Here data is stored in a distributed way, with several nodes acting as servers.

GlusterFS Storage

With GlusterFS, the concept of a dedicated metadata server is no longer needed. A distributed file system offers a common view of all the files spread among different servers, and files/directories on these storage servers are accessed in the normal ways.

For example, permissions for files/directories can be set as in the usual system permission model, i.e. for owner, group and others. Access to the file system ultimately depends on how the particular protocol is designed to work.
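
For instance, ownership and mode bits of files on such a file system can be managed with the familiar tools. A minimal sketch (the mount point, user and group names here are hypothetical):

# chown anoop:users /mnt/gluster/report.txt
# chmod 640 /mnt/gluster/report.txt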

What is GlusterFS?

GlusterFS is a distributed file system designed to run in user space, i.e. using File System in Userspace (FUSE). It is a software-based file system, which accounts for its flexibility.

Look at the following figure, which schematically represents the position of GlusterFS in a hierarchical model. By default, GlusterFS uses the TCP protocol.

GlusterFS Design

Advantages of GlusterFS

  1. Innovation – It eliminates the metadata server and can dramatically improve performance, which helps unify data and objects.
  2. Elasticity – It adapts to growth and reduction in the size of the data.
  3. Scales Linearly – It scales to petabytes and beyond.
  4. Simplicity – It is easy to manage and independent of the kernel, since it runs in user space.

What makes Gluster outstanding among other distributed file systems?

  1. Scalable – The absence of a metadata server provides a faster file system.
  2. Affordable – It deploys on commodity hardware.
  3. Flexible – As I said earlier, GlusterFS is a software-only file system; data is stored on native file systems like ext4, xfs, etc.
  4. Open Source – Currently GlusterFS is maintained by Red Hat Inc., a billion-dollar open source company, as part of Red Hat Storage.

Storage concepts in GlusterFS

  1. Brick – A brick is basically any directory that is meant to be shared among the trusted storage pool.
  2. Trusted Storage Pool – A collection of these shared files/directories, based on the designed protocol.
  3. Block Storage – Devices through which data is moved across systems in the form of blocks.
  4. Cluster – In Red Hat Storage, both cluster and trusted storage pool convey the same meaning: a collaboration of storage servers based on a defined protocol.
  5. Distributed File System – A file system in which data is spread over different nodes, where users can access a file without knowing its actual location. The user doesn’t experience the feel of remote access.
  6. FUSE – A loadable kernel module which allows users to create file systems above the kernel without involving any kernel code.
  7. glusterd – The GlusterFS management daemon; it is the backbone of the file system and runs the whole time the servers are in an active state.
  8. POSIX – Portable Operating System Interface (POSIX) is the family of standards defined by the IEEE as a solution to compatibility between Unix variants, in the form of an Application Programming Interface (API).
  9. RAID – Redundant Array of Independent Disks (RAID) is a technology that gives increased storage reliability through redundancy.
  10. Subvolume – A brick after being processed by at least one translator.
  11. Translator – A translator is the piece of code which performs the basic actions initiated by the user from the mount point. It connects one or more subvolumes.
  12. Volume – A volume is a logical collection of bricks. All operations are based on the different types of volumes created by the user.

Different Types of Volumes

Representations of the different types of volumes are shown below; combinations among these basic volume types are also allowed.

Distributed Volume

Replicated Volume

Striped Volume

Distributed Replicated Volume

Representation of a distributed-replicated volume.

Distributed Replicated Volume
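
For reference, each of these volume types maps to a variant of the same ‘gluster volume create’ command. The following sketches assume hypothetical brick directories (/exp1, /exp2, etc.) on the two servers; striped volumes apply to the GlusterFS releases current at the time of writing.

Distributed (the default; files are spread across bricks):

# gluster volume create dist-vol server1:/exp1 server2:/exp2

Replicated (every file is mirrored on each brick):

# gluster volume create rep-vol replica 2 server1:/exp1 server2:/exp2

Striped (a single large file is split across bricks):

# gluster volume create str-vol stripe 2 server1:/exp1 server2:/exp2

Distributed replicated (consecutive bricks form replica pairs, and files are distributed across the pairs):

# gluster volume create dist-rep-vol replica 2 server1:/exp1 server2:/exp2 server1:/exp3 server2:/exp4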

Installation of GlusterFS in RHEL/CentOS and Fedora

In this article, we will be installing and configuring GlusterFS for the first time for high availability of storage. For this, we’re taking two servers to create volumes and replicate data between them.

Step 1: Have at least two nodes

  1. Install CentOS 6.5 (or any other OS) on two nodes.
  2. Set the hostnames to “server1” and “server2”.
  3. A working network connection between them (an /etc/hosts sketch follows this list).
  4. A storage disk on both nodes, with a brick directory at “/data/brick”.
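
If DNS is not set up, name resolution between the two nodes can be handled with static entries. A minimal sketch, assuming the hypothetical addresses 192.168.0.101 and 192.168.0.102 (adjust to your own network); append these lines to ‘/etc/hosts‘ on both nodes:

192.168.0.101   server1
192.168.0.102   server2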

Step 2: Enable EPEL and GlusterFS Repository

Before installing GlusterFS on both servers, we need to enable the EPEL and GlusterFS repositories in order to satisfy external dependencies. Use the following link to install and enable the epel repository on both systems.

  1. How to Enable EPEL Repository in RHEL/CentOS

Next, we need to enable the GlusterFS repository on both servers.

# wget -P /etc/yum.repos.d http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/glusterfs-epel.repo
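
Before proceeding, you can verify that both repositories are now visible to yum (the exact repository IDs may vary slightly):

# yum repolist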

Step 3: Installing GlusterFS

Install the software on both servers.

# yum install glusterfs-server

Start the GlusterFS management daemon.

# service glusterd start
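
To make sure the daemon also comes up after a reboot, enable it at boot time. On CentOS/RHEL 6 with SysV init this is:

# chkconfig glusterd on

On distributions using systemd, the equivalent would be ‘systemctl enable glusterd‘.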

Now check the status of the daemon.

# service glusterd status
Sample Output
glusterd.service - LSB: glusterfs server
   Loaded: loaded (/etc/rc.d/init.d/glusterd)
   Active: active (running) since Mon, 13 Aug 2012 13:02:11 -0700; 2s ago
  Process: 19254 ExecStart=/etc/rc.d/init.d/glusterd start (code=exited, status=0/SUCCESS)
   CGroup: name=systemd:/system/glusterd.service
           ├ 19260 /usr/sbin/glusterd -p /run/glusterd.pid
           ├ 19304 /usr/sbin/glusterfsd --xlator-option georep-server.listen-port=24009 -s localhost...
           └ 19309 /usr/sbin/glusterfs -f /var/lib/glusterd/nfs/nfs-server.vol -p /var/lib/glusterd/...

Step 4: Configure SELinux and iptables

Open ‘/etc/sysconfig/selinux‘ and change SELinux to either “permissive” or “disabled” mode on both servers. Then save and close the file.

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
#     targeted - Targeted processes are protected,
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted
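
Editing the file takes effect only on the next reboot. To switch the running system to permissive mode immediately, you can also run:

# setenforce 0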

Next, either flush the iptables rules on both nodes or add rules that allow access to the other node.

# iptables -F
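
If you prefer not to flush all rules, a narrower alternative is to accept traffic only from the peer node. A sketch for server1, assuming server2 has the hypothetical address 192.168.0.102 (run the mirror-image rule on server2):

# iptables -I INPUT -s 192.168.0.102 -j ACCEPT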

Step 5: Configure the Trusted Pool

Run the following command on ‘Server1‘.

# gluster peer probe server2

Run the following command on ‘Server2‘.

# gluster peer probe server1

Note: Once this pool has been connected, only trusted users may probe new servers into this pool.
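
You can verify the pool from either node; the peer should be reported in the connected state.

# gluster peer status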

Step 6: Set up a GlusterFS Volume

On both server1 and server2, create the brick directory.

# mkdir /data/brick/gv0

Create the volume on any single server and start it. Here, I’ve taken ‘Server1‘.

# gluster volume create gv0 replica 2 server1:/data/brick/gv0 server2:/data/brick/gv0
# gluster volume start gv0

Next, confirm the status of the volume.

# gluster volume info
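
The output should look roughly like the following (the Volume ID is a placeholder, and the exact fields vary slightly between GlusterFS versions):

Volume Name: gv0
Type: Replicate
Volume ID: <some-uuid>
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: server1:/data/brick/gv0
Brick2: server2:/data/brick/gv0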

Note: If the volume does not start, the error messages are logged under ‘/var/log/glusterfs‘ on one or both of the servers.

Step 7: Verify GlusterFS Volume

Mount the volume to a directory under ‘/mnt‘.

# mount -t glusterfs server1:/gv0 /mnt

Now you can create and edit files on the mount point, which presents a single view of the file system.
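
A quick way to confirm replication is to create a handful of files on the mount point and check that they land on the brick directories of both servers. A minimal sketch:

# for i in `seq -w 1 10`; do cp -rp /var/log/messages /mnt/copy-test-$i; done
# ls -lA /mnt | wc -l
# ls -lA /data/brick/gv0

Since gv0 is a 2-way replicated volume, each of the ten files should appear under the brick directory on both server1 and server2.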

Features of GlusterFS

  1. Self-heal – If any of the bricks in a replicated volume are down while users modify files on the other brick, the automatic self-heal daemon comes into action as soon as the brick is back up, and the transactions that occurred during the downtime are synced accordingly.
  2. Rebalance – If we add a new brick to an existing volume where a large amount of data was previously residing, we can perform a rebalance operation to distribute the data among all the bricks, including the newly added one.
  3. Geo-replication – It provides backups of data for disaster recovery, introducing the concept of master and slave volumes, so that if the master is down, all of the data can still be accessed via the slave. This feature is used to sync data between geographically separated servers. Initializing a geo-replication session requires a series of gluster commands; a rough sketch follows this list.
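
As a rough sketch of how the last two features are driven from the CLI (the volume name gv0 is from this article, the slave host and volume names are hypothetical, and geo-replication syntax varies between GlusterFS versions; newer releases require a ‘create push-pem‘ step before starting the session):

# gluster volume rebalance gv0 start
# gluster volume geo-replication gv0 slavehost::slavevol start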

Here is a screen grab that shows the Geo-replication module.

Geo Replication

Reference Links

GlusterFS Homepage

That’s it for now! Stay tuned for detailed descriptions of features like self-heal, rebalance and geo-replication in my upcoming articles.

Anoop C S

I am basically a FOSS enthusiast interested in working with GNU/Linux and system administration, and I am looking forward to becoming part of an open source initiative. I am currently pursuing Computer Science & Engineering.

33 Responses

  1. duc pham says:

    Hello, I want to know how files are stored on the storage servers. When I deploy GlusterFS with a replicated volume, does the client send a file to all storage servers, or does the client send a file to one server, which then sends it on to the other?

    • Anoop C S says:

      The whole logic of file distribution and replication resides on the client side stack of GlusterFS. For example, the client will perform a write operation to both servers in a replica set of 2. There is no communication between the servers (or bricks, in GlusterFS terms) themselves.

      But with GlusterFS 4.0, the overall structure is intended to change, which also includes the concept of server-side replication, termed JBR (Journal Based Replication). More details on JBR can be found at Red Hat – Next Generation File Replication system in Gluster FS.

      Stay tuned to GlusterFS mailing lists for more updates or jump into #gluster on Freenode IRC.

  2. Mike Kormendy says:

    It might be nice to put in big bold letters that these steps are for VOLUME REPLICATION ONLY. The other volume types and setup are not explained fully here at all.

  3. Ashish says:

    Hi Anoop,

    I have followed the article, and everything works perfectly. I have some questions regarding GlusterFS:

    1. I am using two web servers and two storage servers; the storage servers are configured with GlusterFS, and both replicate fine with this configuration. I need to mount the common shared directory, which is “gv0” according to your article. I installed the Gluster client on the web server and mounted server1:/gv0 on it. After this step replication suddenly stops: when I create a file directly on Gluster server1, it is not accessible on the web server either. So I am stuck with this.

    2. Can you please suggest how Gluster provides high availability? As I am mounting node1 on the web server, if node1 goes down, how will GlusterFS manage the files?

    Thanks,

    Ashish

    • Anoop C S says:

      Thanks for your good words. Answering your questions:

      [1] You must always create/modify/delete files inside GlusterFS through mount points. As far as I can understand from your explanation, you have attempted to create a file directly on one of your storage servers, which is the wrong way to do things and can result in failures.

      [2] Even if you mount via node1 on your web server, the GlusterFS client process will fetch the required configuration files (.vol files) from node1 and establish connections from the web server to both node1 and node2. Thus the client can still work with the other server node if one of them is down. Self-healing of files will take place as soon as the offline node comes back online. You can refer to the second part of this series [https://www.tecmint.com/perform-self-heal-and-re-balance-operations-in-gluster-file-system/] for more insight.

      Please let me know if you are not satisfied with the above explanation.

  4. Rishivyas says:

    Nice guide. I have followed it and created a GFS volume replicated over 2 nodes (2 i3 systems with 4 GB RAM and 2 SATA HDDs each, one for the OS and another for the Gluster volume). Replication works fine, and I created a Samba share on that volume, but the data transfer rate to that share from Windows 7 or 8 systems is pathetic (max 3 MB/s). When I modprobe fuse, there is no error at all. Am I missing anything? Please help.

    • Anoop C S says:

      I think you are using a FUSE mount to access the Gluster volume as a Samba share. If yes, you have an alternative way of accessing Gluster volumes from Windows clients via our API, aka libgfapi, as described in the following link.
      https://lalatendumohanty.wordpress.com/2014/02/11/using-glusterfs-with-samba-and-samba-vfs-plugin-for-glusterfs-on-fedora-20/

      I am always ready to help in case of any errors with Samba shares for gluster volumes.

      • Rishivyas says:

        Thank you Anoop for your response. I have followed the guide you referred to with the samba-vfs-plugin but am getting a maximum speed of 11 MB/s. I created the volume using stripe for performance, since we plan to store and use large files (about 1 GB in size). Is there anything I am missing technically? As I mentioned, I am using 2 i3 systems with 2 HDDs each; one HDD from each is dedicated to the GFS volume. Can we achieve 100 MB/s or more with GFS, Samba VFS and this hardware? I have read many articles which make me think it is possible.

        • Anoop C S says:

          Can you check the speed with an XFS partition instead of GlusterFS and share your results? If you see much better performance on XFS, then it’s something to look into more deeply.

          I will wait for your reply. Along with the results please provide the following details:

          * Operating system
          * GlusterFS version
          * Samba version
          * Output of the following command:
          gluster volume info
          * GlusterFS volume share section from smb.conf
          * Any suspicious warning, error logs from gluster side
          * Whether Samba is running independently of the Gluster nodes

  5. paul says:

    Thank you so much for the article, it really helped me. But I have one problem: after I reboot my PC I can’t mount the server anymore, although it worked really well before the reboot.

    “/usr/bin/fusermount-glusterfs: mount failed: No such device
    Mount failed. Please check the log file for more details.”

    I use the IP address of the server instead of the hostname. Do you have any idea? Thank you.

    • Anoop C S says:

      Some first thoughts:
      [1] Verify the network route from the client to the server.
      [2] Make sure that the volume is in the started state (i.e. the bricks are up).
      [3] Check whether the GlusterFS management daemon (glusterd) is running on the server when the mount command is issued.
      [4] If you are using fstab to mount automatically on reboot, please provide me the corresponding entry.

      If everything is up and running, would you mind providing the client log in some way (I prefer fpaste) for further investigation, along with the GlusterFS version you are running on the server?

      • paul says:

        Solved it, thanks. I have to add “modprobe --first-time fuse” and restart all containers to make it work every time I reboot.

  6. nando says:

    I cannot mount the volume; it says “Mount failed. Please check the log file for more details.” Any idea why? Thanks.

    • Anoop C S says:

      Hi nando,

      Please check the following:
      [1] Make sure that the required volume is started. {gluster volume status <volname>}
      [2] Check whether the GlusterFS daemon (glusterd) is running on the IP you used to mount.
      [3] Make sure that the mount command is run correctly. {mount -t glusterfs <server>:/<volname> <mount-point>}

      If the above things are fine, can you please fpaste the mount log, i.e. /var/log/glusterfs/<mount-point>.log, and reply here with the link? It would be better if you can also provide the output of gluster volume info, gluster volume status and the whole command used to mount the GlusterFS volume.

  7. Vaibhav Kanchan says:

    Hi Anoop, first of all many thanks for such a simple and informative article on GlusterFS.

    I have one doubt. Say I build GlusterFS on 2 nodes (server1 and server2) and mount it on a client as server1:/gv0.

    If server1 goes down for some reason, we need to mount it again on the client as server2:/brick to restore functionality. Am I correct?

    Is there a way for the client to automatically handle the above issue and mount it as server2:/brick, I mean fail over to server2 till server1 comes back online?

    • Anoop C S says:

      Hi Vaibhav,
      Thanks and let me start with explaining the background for your doubt.

      The example listed in the article explains the GlusterFS native mount (FUSE mount). As soon as you mount the volume on the client, it establishes direct connections to the glusterfsd processes (which are the brick processes on the servers). After mounting, if glusterd (the GlusterFS management daemon) somehow gets killed on the server which was used during mount, it doesn’t affect the I/O. From then on, I/O requests/responses are exchanged between glusterfs (the mount process) and the glusterfsds (the brick processes).

      Now coming to your question: what happens if the whole server goes down, which obviously kills glusterd and possibly glusterfsds (if there are bricks running on that server)? Remember that GlusterFS is a distributed file system. Unless you have configured a replicated cluster, data residing on the brick that went down will not be listed on the mount, and I/O for files on that brick will fail. Even in that case the mount still exists and will show the contents from the other bricks (if there exists more than one brick). Here comes the importance of a replicated cluster.

      Suppose you managed to set up a replicated cluster over server1 and server2, and the client mounts the volume using server1. In this scenario, even if server1 goes offline it will not cause any I/O to fail, because the other replica is still alive. As soon as server1 comes back online it automatically performs the healing operations needed to maintain the consistency of data across the volume.

      All the above facts relate to FUSE mounts. The concept of fail-over in GlusterFS has been integrated with NFS exports and Samba shares with the help of pacemaker, corosync and pcs (High Availability Cluster), and it is an ongoing development in the Gluster community. There you can configure VIPs (Virtual IPs) in order to handle fail-over cases. You can always reach out to #gluster on Freenode or subscribe to the gluster-users mailing list for further details on High Availability Resource Management in GlusterFS.

  8. Dmytro says:

    I wish this article could be more detailed and newbie-oriented. I’m stuck at mkdir /data/brick/gv0 – how do I tell my sdb1 that it is now a brick?

    • Anoop C S says:

      I don’t quite get you. /data/brick/gv0 is just an export directory; in GlusterFS semantics, it is called a brick. If I understand correctly, you can always mount /dev/sdb1 and use sub-directories under it as bricks. You need to specify the exact path while creating the volume.
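
      As a rough sketch of what that could look like (the device name and paths here are assumptions, not from the article):

      # mkfs.xfs -i size=512 /dev/sdb1
      # mkdir -p /data/brick
      # mount /dev/sdb1 /data/brick
      # mkdir /data/brick/gv0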
