YARN is the processing layer of Hadoop. It consists of a Master service (Resource Manager) and Slave services (Node Manager) that together process the data. The Resource Manager (RM) is the critical component responsible for allocating and managing resources among all the jobs running in the Hadoop cluster.
It is a recommended best practice to enable High Availability (HA) on critical services such as the Namenode and the Resource Manager.
- Best Practices for Deploying Hadoop Server on CentOS/RHEL 7 – Part 1
- Setting Up Hadoop Pre-requisites and Security Hardening – Part 2
- How to Install and Configure the Cloudera Manager on CentOS/RHEL 7 – Part 3
- How to Install CDH and Configure Service Placements on CentOS/RHEL 7 – Part 4
- How to Set Up High Availability for Namenode – Part 5
In this article, we will go through the steps to enable High Availability on the Resource Manager.
Enabling High Availability on Resource Manager
1. Log in to Cloudera Manager and navigate to YARN –> Actions –> Enable High Availability.
2. Select the server that will host the second Resource Manager. Usually, a second master server is used to deploy High Availability. Here, we are selecting master2 for enabling HA.
3. Once master2 is selected, click ‘Continue’ to proceed.
4. The process of enabling HA will start. You can view the background operations by clicking each step.
5. Once all the processes have completed, you will get the ‘Finished’ status. Click ‘Finish’.
6. Verify Resource Manager High Availability by viewing the YARN instances at Cloudera Manager –> YARN –> Instances.
You will see two Resource Managers: one in the ‘Active’ state and the other in ‘Standby’.
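The same check can also be scripted. The following is a minimal sketch, not part of the Cloudera Manager procedure. It assumes each Resource Manager serves the YARN REST endpoint `/ws/v1/cluster/info` over plain HTTP on the default web port 8088, and that the response carries a `haState` field. The host name `master1` is a placeholder (only master2 is named in this article), and a cluster with Kerberos or TLS enabled would need additional handling that is not shown here.

```python
import json
from urllib.request import urlopen

# Placeholder host names: master1 is an assumption, master2 comes from the steps above.
RM_HOSTS = ["master1", "master2"]
RM_WEB_PORT = 8088  # default Resource Manager web port; adjust for your cluster


def extract_ha_state(payload):
    """Pull the haState value out of a parsed cluster-info response."""
    return payload["clusterInfo"]["haState"].upper()


def fetch_ha_state(host, port=RM_WEB_PORT, timeout=5):
    """Ask one Resource Manager for its HA state over the REST API."""
    url = f"http://{host}:{port}/ws/v1/cluster/info"
    with urlopen(url, timeout=timeout) as resp:
        return extract_ha_state(json.load(resp))


def ha_is_healthy(states):
    """True when exactly one RM is ACTIVE and every other RM is STANDBY."""
    values = list(states.values())
    return values.count("ACTIVE") == 1 and values.count("STANDBY") == len(values) - 1


def report(hosts=RM_HOSTS):
    """Print the state of each Resource Manager and an overall verdict."""
    states = {host: fetch_ha_state(host) for host in hosts}
    for host, state in states.items():
        print(f"{host}: {state}")
    print("HA healthy:", ha_is_healthy(states))
    return states
```

Calling `report()` from a machine that can reach both masters should list one Resource Manager as ACTIVE and the other as STANDBY, matching what the Instances page shows.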
In this article, we have gone through the step-by-step procedure to enable High Availability on the Resource Manager. Whenever the Active Resource Manager goes down, the Standby Resource Manager becomes Active, so production does not suffer an outage.
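If you want to confirm the failover behaviour in a test window, for example after stopping the Active Resource Manager from Cloudera Manager, a small polling helper can tell you when a Resource Manager has taken over. This is only a sketch: `probe` is a hypothetical function you supply that returns a name-to-state mapping, built on whatever state check you prefer (the `yarn rmadmin -getServiceState` command is one option), and the timeout values are arbitrary.

```python
import time


def wait_for_active(probe, timeout=60.0, interval=2.0, sleep=time.sleep):
    """Poll until some Resource Manager reports ACTIVE.

    probe: caller-supplied function returning a {name: state} dict.
    Returns the name of the active Resource Manager, or None on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        states = probe()
        active = [name for name, state in states.items()
                  if state.upper() == "ACTIVE"]
        if active:
            return active[0]
        if time.monotonic() >= deadline:
            return None
        sleep(interval)  # wait before probing again
```

A return value of `None` means no Resource Manager became Active within the timeout, which would point to a problem with the HA setup rather than a successful failover.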