How to Install and Configure Apache Hadoop on a Single Node in CentOS 7

Step 3: Configure Hadoop in CentOS 7

10. Now it’s time to set up the Hadoop cluster on a single node, in pseudo-distributed mode, by editing its configuration files.

The Hadoop configuration files are located in $HADOOP_HOME/etc/hadoop/, where $HADOOP_HOME in this tutorial is the hadoop account home directory (/opt/hadoop/).

Once you’re logged in as the hadoop user, you can start editing the following configuration files.

The first file to edit is core-site.xml. This file holds information such as the port number used by the Hadoop instance, the memory allocated for the file system, the memory limit for storing data and the size of the Read/Write buffers.

$ vi etc/hadoop/core-site.xml

Add the following properties between the <configuration> ... </configuration> tags. Use localhost or your machine’s FQDN for the Hadoop instance.

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://master.hadoop.lan:9000/</value>
</property>
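The core-site.xml snippet above only sets the file system URI. If you also want to set the Read/Write buffer size mentioned in this step explicitly, the standard io.file.buffer.size property can be added alongside it; the value below (128 KB) is just a commonly used sketch, so keep whatever suits your hardware.

<property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
</property>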

11. Next, open and edit the hdfs-site.xml file. This file holds information such as the replication value and the namenode and datanode paths on the local file system.

$ vi etc/hadoop/hdfs-site.xml

Here, add the following properties between the <configuration> ... </configuration> tags. In this guide we’ll use the /opt/volume/ directory to store our Hadoop file system.

Replace the dfs.data.dir and dfs.name.dir values accordingly.

<property>
    <name>dfs.data.dir</name>
    <value>file:///opt/volume/datanode</value>
</property>

<property>
    <name>dfs.name.dir</name>
    <value>file:///opt/volume/namenode</value>
</property>
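Because this step also mentions the replication value, you may want to set it explicitly as well. On a single-node setup there is only one copy of each block, so a replication factor of 1 is the usual choice; the snippet below is a sketch using the standard dfs.replication property.

<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>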

12. Because we’ve specified /opt/volume/ as our Hadoop file system storage, we need to create those two directories (datanode and namenode) from the root account and grant all permissions on them to the hadoop account, by executing the commands below.

$ su root
# mkdir -p /opt/volume/namenode
# mkdir -p /opt/volume/datanode
# chown -R hadoop:hadoop /opt/volume/
# ls -al /opt/  #Verify permissions
# exit  #Exit root account to turn back to hadoop user
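Alternatively, if the hadoop account has sudo rights on your system (an assumption, not something configured earlier in this tutorial), the same directories and ownership can be set up without switching to root:

$ sudo mkdir -p /opt/volume/{namenode,datanode}   # create both directories in one pass
$ sudo chown -R hadoop:hadoop /opt/volume/        # give the hadoop account full ownership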

13. Next, create the mapred-site.xml file to specify that we are using the YARN MapReduce framework.
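Note that on Hadoop 2.x releases the configuration directory usually ships only a mapred-site.xml.template file; if that is the case on your installation, copy the template first and then edit the copy:

$ cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml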

$ vi etc/hadoop/mapred-site.xml

Add the following excerpt to the mapred-site.xml file:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

14. Now, edit the yarn-site.xml file and add the statements below between the <configuration> ... </configuration> tags:

$ vi etc/hadoop/yarn-site.xml

Add the following excerpt to the yarn-site.xml file:

<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
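Optionally, you can also pin the ResourceManager to the hostname used earlier in core-site.xml. The yarn.resourcemanager.hostname property is a standard YARN setting; master.hadoop.lan below is just the example FQDN from this tutorial, so substitute your own.

<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master.hadoop.lan</value>
</property>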

15. Finally, set the Java home variable for the Hadoop environment by editing the line below in the hadoop-env.sh file.

$ vi etc/hadoop/hadoop-env.sh

Edit the following line to point to your Java system path.

export JAVA_HOME=/usr/java/default/
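If you are unsure where Java lives on your system, a quick way to check (assuming the java binary is on your PATH) is to resolve the symlink behind the java command; JAVA_HOME is the resulting path without the trailing /bin/java.

$ readlink -f $(which java)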

16. Also, replace the localhost value in the slaves file with your machine’s hostname, as set up at the beginning of this tutorial.

$ vi etc/hadoop/slaves
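Since the slaves file holds a single hostname in this single-node setup, it can also be written non-interactively; master.hadoop.lan below is the example hostname used earlier in this tutorial, so substitute your own.

$ echo "master.hadoop.lan" > etc/hadoop/slaves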