How to Install and Configure Cloudera Manager on CentOS/RHEL 7 – Part 3

In this article, we describe the step-by-step process to install Cloudera Manager as per industry practices. Part 2 covered the Cloudera prerequisites; make sure all the servers have been prepared accordingly before you continue.

Requirements

Here we are going to build a 5-node cluster with 2 masters and 3 workers. I have used 5 AWS EC2 instances to demonstrate the installation procedure and named the servers as below.

master1.tecmint.com
master2.tecmint.com
worker1.tecmint.com
worker2.tecmint.com
worker3.tecmint.com

Cloudera Manager is an administrative and monitoring tool for the entire CDH stack. We admins usually call it the management tool for Cloudera Hadoop. With it we can deploy, monitor, and control the cluster and make configuration changes, which makes it essential for managing the entire cluster.

Below are the important uses of Cloudera Manager.

  • Deploy and configure Hadoop clusters in an automated way
  • Monitor cluster health
  • Configure alerts
  • Troubleshoot issues
  • Generate reports
  • Produce cluster utilization reports
  • Configure resources dynamically

Step 1: Installing Apache Web Server on CentOS

We are going to use master1 as the web server that hosts the Cloudera repositories, so Apache needs to be installed on it. Follow the below steps to install the Apache web server.

# yum -y install httpd

Once httpd is installed, start it and enable it so that it starts on boot.

# systemctl start httpd
# systemctl enable httpd

After starting httpd, check its status.

# systemctl status httpd
Check Apache Status

Then open a browser on your local system and enter the IP address of master1 in the address bar; you should get the Apache test page, which confirms that httpd is running fine.

Check Apache Webpage

Step 2: Configure Local DNS to Resolve IP and Hostname

We need either a DNS server or /etc/hosts entries to resolve IP addresses and hostnames. Here we are configuring /etc/hosts, but a production environment will normally have a dedicated DNS server.

Follow the below steps to make an entry for all your servers in /etc/hosts.

# vi /etc/hosts

Add the following entries on all the servers.

13.235.27.144   master1.tecmint.com     master1
13.235.135.170  master2.tecmint.com     master2
15.206.167.94   worker1.tecmint.com     worker1
13.232.173.158  worker2.tecmint.com     worker2
65.0.182.222    worker3.tecmint.com     worker3
Configure Local DNS
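Since the same entries have to be added on every server, a small helper can keep the edit repeatable. The following is a minimal sketch, not part of the original procedure: the function name add_host_entry is hypothetical, and it assumes that a line mentioning the fully qualified name anywhere (even in a comment) counts as "already present".

```shell
# add_host_entry FILE IP FQDN ALIAS   (hypothetical helper, for illustration)
# Appends "IP FQDN ALIAS" to FILE unless the FQDN already appears in it.
add_host_entry() {
    local file="$1" ip="$2" fqdn="$3" alias="$4"
    # -F: treat the name literally, -w: match it as a whole word
    if grep -qwF -- "$fqdn" "$file" 2>/dev/null; then
        echo "skip: $fqdn already present in $file"
    else
        printf '%s\t%s\t%s\n' "$ip" "$fqdn" "$alias" >> "$file"
        echo "added: $fqdn"
    fi
}
```

Run from a root shell, a call would look like: add_host_entry /etc/hosts 13.235.27.144 master1.tecmint.com master1. Running it a second time leaves the file unchanged.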

Step 3: Configure SSH Passwordless Login

Cloudera Manager is installed on master1 in this demonstration. We need to configure password-less ssh from master1 to all other nodes, because Cloudera Manager uses ssh to reach the other nodes and install packages on them.

Follow the below steps to configure password-less ssh from master1 to all remaining servers. We are going to use a user ‘tecmint‘ for this.

Create the user ‘tecmint‘ on every server, including master1 where the key is generated, using the useradd command as shown.

# useradd -m tecmint

To give root privileges to the user ‘tecmint‘, add the below line to the /etc/sudoers file (editing it with visudo guards against syntax errors). You can add this line under the root entry as given in the screenshot.

tecmint   ALL=(ALL)    ALL
Add User to Sudo

Switch to the user ‘tecmint‘ and create an ssh key on master1 using the below commands.

# sudo su tecmint
$ ssh-keygen

Now copy the created key to the other 4 servers by using the ssh-copy-id command as shown.

$ ssh-copy-id -i ~/.ssh/id_rsa.pub tecmint@master2.tecmint.com
$ ssh-copy-id -i ~/.ssh/id_rsa.pub tecmint@worker1.tecmint.com
$ ssh-copy-id -i ~/.ssh/id_rsa.pub tecmint@worker2.tecmint.com
$ ssh-copy-id -i ~/.ssh/id_rsa.pub tecmint@worker3.tecmint.com

Now you should be able to ssh from master1 to all the remaining servers without a password as shown.

$ ssh master2
$ ssh worker1
$ ssh worker2
$ ssh worker3
SSH Passwordless Login Servers
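Logging in to each node by hand works, but the check can also be done in one pass. The sketch below is one possible way to do it, assuming the short hostnames from /etc/hosts; the names CLUSTER_HOSTS, can_login, and report_logins are made up for this example. It relies on the ssh option BatchMode=yes, which makes ssh fail instead of prompting for a password.

```shell
# Hosts that master1 must reach without a password (assumed from Step 2)
CLUSTER_HOSTS="master2 worker1 worker2 worker3"

# can_login HOST: succeeds only if key-based login works.
# BatchMode=yes disables password prompts; ConnectTimeout bounds the wait.
can_login() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true 2>/dev/null
}

# report_logins: print one OK/FAILED line per host in CLUSTER_HOSTS
report_logins() {
    local host
    for host in $CLUSTER_HOSTS; do
        if can_login "$host"; then
            echo "$host: OK"
        else
            echo "$host: FAILED"
        fi
    done
}
```

Calling report_logins as the ‘tecmint‘ user on master1 should print OK for all four hosts once the keys have been copied.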

Step 4: Installing and Configuring Cloudera Manager

We can use the vendor (Cloudera) repository to install all the packages with the package management tools in RHEL/CentOS. In production, creating our own repository is the best practice because the servers may not have internet access.

Here we are going to install the Cloudera Manager 6.3.1 release. Since master1 serves as the repo server, we download the packages into the path below.

Create the below-mentioned directory on the master1 server.

$ sudo mkdir -p /var/www/html/cloudera-repos/cm6

We can use the wget tool to download packages over HTTP. Install wget using the below command.

$ sudo yum -y install wget

Next, download the Cloudera Manager tar file using the following wget command.

$ wget https://archive.cloudera.com/cm6/6.3.1/repo-as-tarball/cm6.3.1-redhat7.tar.gz

Extract the tar file into /var/www/html/cloudera-repos/cm6. We have already set up master1 as the web server by installing httpd and tested it in the browser.

$ sudo tar xvfz cm6.3.1-redhat7.tar.gz -C /var/www/html/cloudera-repos/cm6 --strip-components=1

Now, verify that all the Cloudera rpm files are there in /var/www/html/cloudera-repos/cm6/RPMS/x86_64 directory.

$ cd /var/www/html/cloudera-repos/cm6
$ ll
List Cloudera Files
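Beyond listing the directory, it can help to confirm that the extracted tree is something yum can actually consume. yum expects a repodata/repomd.xml file under the repository root, so the sketch below checks for that file and for at least one rpm. The function name check_repo_tree is hypothetical and the check is deliberately shallow; it does not validate the metadata itself.

```shell
# check_repo_tree DIR   (hypothetical helper, for illustration)
# Succeeds if DIR contains yum metadata and at least one .rpm file.
check_repo_tree() {
    local root="$1" rpm_count
    if [ ! -f "$root/repodata/repomd.xml" ]; then
        echo "missing $root/repodata/repomd.xml" >&2
        return 1
    fi
    # Count rpm files anywhere below the repository root
    rpm_count=$(find "$root" -type f -name '*.rpm' | wc -l)
    if [ "$rpm_count" -eq 0 ]; then
        echo "no .rpm files under $root" >&2
        return 1
    fi
    echo "$root looks usable: $rpm_count rpm file(s)"
}
```

For the layout used in this article, the call would be: check_repo_tree /var/www/html/cloudera-repos/cm6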

Create the /etc/yum.repos.d/cloudera-manager.repo file on all the hosts in the cluster with the following content. The baseurl must point at the web server, which here is master1 (65.0.101.148); its hostname is used below since it resolves through /etc/hosts.

[cloudera-repo]
name=cloudera-manager
baseurl=http://master1.tecmint.com/cloudera-repos/cm6/
enabled=1
gpgcheck=0
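Because the same repo file is needed on every host, generating it from a single function avoids copy and paste mistakes. This is only a sketch: render_cm_repo is a hypothetical name, and the host passed to it (master1.tecmint.com in the usage note) is an assumption based on the /etc/hosts entries from Step 2.

```shell
# render_cm_repo WEB_HOST   (hypothetical helper, for illustration)
# Prints the repo definition to stdout, with WEB_HOST as the repo server.
render_cm_repo() {
    local web_host="$1"
    cat <<EOF
[cloudera-repo]
name=cloudera-manager
baseurl=http://${web_host}/cloudera-repos/cm6/
enabled=1
gpgcheck=0
EOF
}
```

Printing to stdout keeps the function usable without root; the output can then be written into place with: render_cm_repo master1.tecmint.com | sudo tee /etc/yum.repos.d/cloudera-manager.repo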

Now that the repository has been added, run the below command to view the enabled repositories.

$ yum repolist
List Cloudera Repository

Run the below command to view all the available Cloudera-related packages in the repository.

$ yum list available | grep cloudera
List Cloudera Packages

Install cloudera-manager-server, cloudera-manager-agent, cloudera-manager-daemons, and cloudera-manager-server-db-2.

$ sudo yum install cloudera-manager-daemons cloudera-manager-agent cloudera-manager-server cloudera-manager-server-db-2
Install Cloudera in CentOS

Run the below command to view all the installed Cloudera packages.

$ yum list installed | grep cloudera
List Installed Cloudera Packages
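Reading the list by eye is fine for four packages, but a scripted check is easier to repeat. The sketch below uses rpm -q, which exits non-zero when a package is not installed; the names CM_PACKAGES and missing_packages are invented for this example.

```shell
# Packages installed in this step
CM_PACKAGES="cloudera-manager-daemons cloudera-manager-agent cloudera-manager-server cloudera-manager-server-db-2"

# missing_packages: print the name of every package that rpm does not know.
# Prints nothing when all of them are installed.
missing_packages() {
    local pkg
    for pkg in $CM_PACKAGES; do
        rpm -q "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}
```

If missing_packages prints nothing, all four packages are installed.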

Run the below command to start cloudera-scm-server-db, the underlying database that stores metadata for Cloudera Manager and other services.

By default, Cloudera Manager comes with an embedded PostgreSQL database. We are using the embedded one here; in production, an external database is normally used instead. It can be Oracle, MySQL, or PostgreSQL.

$ sudo systemctl start cloudera-scm-server-db

Run the below command to check the status of the database.

$ sudo systemctl status cloudera-scm-server-db
Check Cloudera Database Status

Configure the db.properties for the Cloudera Manager server.

$ sudo vi /etc/cloudera-scm-server/db.properties

Make sure the below value is set to EMBEDDED so that Cloudera Manager uses the embedded database.

com.cloudera.cmf.db.setupType=EMBEDDED
Configure Cloudera Database Properties
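To confirm the setting without opening an editor, the value can be read back from the file. The sketch below assumes db.properties uses plain key=value lines with comments starting with #; get_property is a hypothetical helper, not a Cloudera tool.

```shell
# get_property KEY FILE   (hypothetical helper, for illustration)
# Prints the value of the last uncommented "KEY=value" line in FILE,
# or returns 1 if the key is not set.
get_property() {
    local key="$1" file="$2"
    awk -F= -v k="$key" '
        $1 == k { sub(/^[^=]*=/, ""); value = $0; found = 1 }
        END { if (found) print value; else exit 1 }
    ' "$file"
}
```

For this step the call would be: get_property com.cloudera.cmf.db.setupType /etc/cloudera-scm-server/db.properties, which should print EMBEDDED. If the file is not readable by your user, run it from a root shell.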

Run the below command to start the Cloudera Manager server.

$ sudo systemctl start cloudera-scm-server

Run the below command to check the status of the Cloudera Manager server.

$ sudo systemctl status cloudera-scm-server
Check Cloudera Manager Status

Run the below commands to start the Cloudera Manager agent and check its status.

$ sudo systemctl start cloudera-scm-agent
$ sudo systemctl status cloudera-scm-agent
Check Cloudera Manager Agent Status

Once the Cloudera Manager server is up and running, you can view the WebUI (login page) in the browser using the IP address of master1 and port number 7180, which is the default HTTP port of Cloudera Manager.

http://65.0.101.148:7180
Cloudera Manager Login
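The server can take a few minutes after systemctl start before the login page responds, so a failed first attempt in the browser does not necessarily mean something is broken. The sketch below polls the port until it accepts TCP connections. It is bash-specific (it relies on the /dev/tcp redirection), and wait_for_port is a hypothetical name chosen for this example.

```shell
# wait_for_port HOST PORT [TIMEOUT_SECONDS]   (hypothetical helper, bash only)
# Polls every 5 seconds until HOST:PORT accepts a TCP connection.
wait_for_port() {
    local host="$1" port="$2" timeout="${3:-300}" waited=0
    # Opening /dev/tcp/HOST/PORT in a subshell attempts a TCP connection;
    # the descriptor is closed again when the subshell exits.
    until (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "port $port on $host still closed after ${timeout}s" >&2
            return 1
        fi
        sleep 5
        waited=$((waited + 5))
    done
    echo "port $port on $host is accepting connections"
}
```

On master1 itself, wait_for_port localhost 7180 waits up to five minutes for the Cloudera Manager port. An open port only shows that the server is listening; the login page in the browser remains the real test.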
Summary

In this article, we have seen the step-by-step process for installing Cloudera Manager on CentOS 7. We will cover CDH and other service installations in the next article.

Mohan Sivam
A Solution Architect and Bigdata infrastructure with over 10 years of experience in Information technology.
