Step 4: Format Hadoop Namenode
17. Once the Hadoop single-node cluster has been set up, it is time to initialize the HDFS file system by formatting the /opt/volume/namenode storage directory with the following command:
$ hdfs namenode -format
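If the format completed successfully, the storage directory should now contain a current/ subdirectory holding the cluster metadata. As a quick sanity check (assuming /opt/volume/namenode is the dfs.namenode.name.dir configured earlier):
$ ls /opt/volume/namenode/current/
$ cat /opt/volume/namenode/current/VERSION # Shows the cluster ID and layout version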
Step 5: Start and Test Hadoop Cluster
18. The Hadoop service scripts are located in the $HADOOP_HOME/sbin directory. In order to start the Hadoop services, run the below commands on your console:
$ start-dfs.sh
$ start-yarn.sh
Check the services status by listing the running Java processes with the jps command (you should see the NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager daemons):
$ jps
Alternatively, you can view a list of all open sockets for Apache Hadoop on your system using the ss command.
$ ss -tul
$ ss -tuln # Numerical output
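To narrow the output down to just the Hadoop listeners, you can filter for the Java processes. This is a sketch assuming you run it as the hadoop user that owns the daemons (or as root), since ss only reports process names for sockets your user owns:
$ ss -tulnp | grep java # Show only sockets opened by the Hadoop Java daemons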
19. To test the Hadoop file system cluster, create a test directory in the HDFS file system and copy a file from the local file system to HDFS storage (insert data into HDFS).
$ hdfs dfs -mkdir /my_storage
$ hdfs dfs -put LICENSE.txt /my_storage
To view a file's content or list a directory inside the HDFS file system, issue the below commands:
$ hdfs dfs -cat /my_storage/LICENSE.txt
$ hdfs dfs -ls /my_storage/
To retrieve data from HDFS to the local file system, use the below command:
$ hdfs dfs -get /my_storage/ ./
Get the full list of HDFS command options by issuing:
$ hdfs dfs -help
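When you are done testing, you can clean up the sample data. As an example, the file and directory created above can be removed with:
$ hdfs dfs -rm /my_storage/LICENSE.txt # Delete the test file
$ hdfs dfs -rm -r /my_storage # Recursively delete the test directory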
Step 6: Browse Hadoop Services
20. In order to access Hadoop services from a remote browser, visit the following links (replace the IP address or FQDN accordingly). Also, make sure the below ports are open on your system firewall.
For the Hadoop NameNode service overview.
For Hadoop file system browsing (Directory Browse).
For cluster and applications information (ResourceManager).
For NodeManager information.
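As an illustration, assuming a Hadoop 2.x installation with the default web UI ports (50070 for the NameNode overview and directory browser, 8088 for the ResourceManager, 8042 for the NodeManager) and firewalld as the system firewall, the ports could be opened as follows (adjust the port numbers if your release or configuration differs):
# firewall-cmd --permanent --add-port=50070/tcp # NameNode web UI
# firewall-cmd --permanent --add-port=8088/tcp # ResourceManager UI
# firewall-cmd --permanent --add-port=8042/tcp # NodeManager UI
# firewall-cmd --reload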
Step 7: Manage Hadoop Services
21. To stop all Hadoop instances, run the below commands:
$ stop-yarn.sh
$ stop-dfs.sh
22. In order to enable the Hadoop daemons system-wide, login with the root user and open the /etc/rc.local file for editing:
$ su - root
# vi /etc/rc.local
Add this excerpt to the rc.local file.
su - hadoop -c "/opt/hadoop/sbin/start-dfs.sh"
su - hadoop -c "/opt/hadoop/sbin/start-yarn.sh"
exit 0
Then, add executable permissions to the rc.local file, and enable, start and check the service status by issuing the below commands:
$ chmod +x /etc/rc.d/rc.local
$ systemctl enable rc-local
$ systemctl start rc-local
$ systemctl status rc-local
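After the next reboot, you can confirm that the daemons came up automatically by listing the hadoop user's Java processes from the root account:
# su - hadoop -c "jps" # Should list NameNode, DataNode, ResourceManager, NodeManager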
That’s it! The next time you reboot your machine, the Hadoop services will be started automatically for you! All you need to do is fire up a Hadoop-compatible application and you’re ready to go!