How to Sync Cluster Configuration and Verify Failover Setup in Nodes – Part 4

Hello folks. First of all, my apologies for the delay in publishing this last part of the cluster series. Let’s get to work without any further delay.

Since many of you have completed all three previous parts, let me briefly recap what we have covered so far. By now we know how to install and configure the cluster packages on two nodes and how to enable fencing and failover in a clustered environment.

Sync Cluster Configuration and Verify FailOver – Part 4

You can refer to the previous parts if you need a refresher, since it took a while to post this final part.

Introduction to Linux Clustering and Advantages/Disadvantages of Clustering – Part 1

How to Install and Configure Cluster with Two Nodes in Linux – Part 2

Fencing and Adding a Failover to Clustering – Part 3

We will start by adding resources to the cluster. You can add a file system or a web service, depending on your needs. In my setup, I have the /dev/sda3 partition mounted at /x01, which I want to add as a file system resource.

1. I use the command below to add a file system as a resource:

# ccs -h 172.16.1.250 --addresource fs name=my_fs device=/dev/mapper/tecminttest_lv_vol01 mountpoint=/x01 fstype=ext3
Add Filesystem to Cluster

Additionally, if you want to add a service as well, you can do so with the following command:

# ccs -h 172.16.1.250 --addservice my_web domain=testdomain recovery=relocate autostart=1

You can verify it by viewing the cluster.conf file as we did in previous lessons.
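For example, assuming your version of ccs supports the --getconf option, you can print the active configuration straight from the management node without opening the file manually:

# ccs -h 172.16.1.250 --getconf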

2. Now add the following entry to the cluster.conf file to add a reference tag to the service (see the sketch below for where it goes).

<fs ref="my_fs"/>
Add Service to Cluster
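To be clear about where this entry goes: the reference tag should sit inside the <service> element in /etc/cluster/cluster.conf, so that block ends up looking roughly like this (a sketch based on the names used in this setup):

<service autostart="1" domain="testdomain" name="my_web" recovery="relocate">
        <fs ref="my_fs"/>
</service>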

3. All set. Now let’s see how to sync the configuration we made across the two nodes in the cluster. The following command will do the job.

# ccs -h 172.16.1.250 --sync --activate
Sync Cluster Configuration

Note: Enter the passwords we set for the ricci user in the early stages, when we were installing the packages.
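As a quick reminder, assuming the same setup as in the earlier parts of this series, the ricci password and service were configured on each node roughly like this:

# passwd ricci
# service ricci start
# chkconfig ricci on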

You can verify your configuration by using the command below.

# ccs -h 172.16.1.250 --checkconf
Verify Cluster Configuration

4. Now it’s time to start things up. You can use whichever of the commands below suits your needs.

To start only one node, use the command with the relevant IP address.

# ccs -h 172.16.1.222 --start

Or, if you want to start all nodes, use the --startall option as follows.

# ccs -h 172.16.1.250 --startall

You can use --stop or --stopall if you need to stop the cluster, as shown below.
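For example, assuming the same node and management IPs as above, stopping a single node or the whole cluster looks like this:

# ccs -h 172.16.1.222 --stop
# ccs -h 172.16.1.250 --stopall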

There are scenarios in which you may want to start the cluster without enabling its resources (resources are enabled automatically when the cluster starts). One example is when you have intentionally disabled the resources on a particular node in order to break a fencing loop, and you don’t want those resources enabled when the cluster starts up.

For that purpose you can use the command below, which starts the cluster but does not enable the resources.

# ccs -h 172.16.1.250 --startall --noenable 

5. After the cluster has started up, you can view its status by issuing the clustat command.

# clustat
Check Cluster Status

The above output shows that there are two nodes in the cluster and that both are up and running at the moment.
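If you want to keep watching the status while you test things, clustat can refresh its output at an interval (assuming your version supports the -i option), for example every 2 seconds:

# clustat -i 2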

6. You may remember that we added a failover mechanism in the previous lessons. Want to check that it works? This is how you do it: force one node to shut down and check the cluster status with the clustat command to see the result of the failover.

I shut down my node02server (172.16.1.223) using the shutdown -h now command, then executed the clustat command from my cluster_server (172.16.1.250).

Check Cluster FailOver

The above output shows that node 1 is online while node 2 has gone offline, since we shut it down. Yet the service and the file system we shared are still online, as you can see if you check them on node01, which is still up.

# df -h /x01
Verify Cluster Node
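If you prefer to test failover without shutting a node down completely, you can also relocate the service manually with clusvcadm (a sketch assuming the service and node names used in this setup):

# clusvcadm -r my_web -m 172.16.1.222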

For reference, here is the cluster.conf file with the whole configuration set relevant to the setup we used for tecmint.

<?xml version="1.0"?>
<cluster config_version="15" name="tecmint_cluster">
        <fence_daemon post_join_delay="10"/>
        <clusternodes>
                <clusternode name="172.16.1.222" nodeid="1">
                        <fence>
                                <method name="Method01">
                                        <device name="tecmintfence"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="172.16.1.223" nodeid="2">
                        <fence>
                                <method name="Method01">
                                        <device name="tecmintfence"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman/>
        <fencedevices>
                <fencedevice agent="fence_virt" name="tecmintfence"/>
        </fencedevices>
        <rm>
                <failoverdomains>
                        <failoverdomain name="tecmintfod" nofailback="0" ordered="1" restricted="0">
                                <failoverdomainnode name="172.16.1.222" priority="1"/>
                                <failoverdomainnode name="172.16.1.223" priority="2"/>
                        </failoverdomain>
                </failoverdomains>
                <resources>
                        <fs device="/dev/mapper/tecminttest_lv_vol01" fstype="ext3" mountpoint="/x01" name="my_fs"/>
                </resources>
                <service autostart="1" domain="testdomain" name="my_web" recovery="relocate">
                        <fs ref="my_fs"/>
                </service>
        </rm>
</cluster>
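If you want to make sure the file you end up with is syntactically valid, the cluster packages on CentOS/RHEL 6 ship a small validator you can run on a node (assuming it is installed):

# ccs_config_validate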

I hope you enjoyed the whole series of clustering lessons. Keep in touch with tecmint for more handy guides every day, and feel free to comment with your ideas and queries.


