Setting Up a Single-Node Gen1 Hadoop Cluster (hadoop-1.2.1)

In this post, we will set up a single-node Gen1 cluster using Hadoop 1.2.1. I am assuming your present working directory is your home folder. The commands also assume a user named hadoop whose home folder is /home/hadoop and a machine whose hostname is hadoopvm, so adjust those values to match your system. The prerequisites for installing Apache Hadoop are:

  • Linux environment (any Linux distribution with kernel version greater than 2.6.x)
  • SSH service
  • Java 7
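
Before going further, it can help to see what is already on the machine. The following is only a minimal sketch and not part of Hadoop: have_cmd and check_prereqs are hypothetical helper names, and the /usr/sbin/sshd path is an assumption that holds on typical Debian or Ubuntu installs.

```shell
# Hypothetical helper: succeed if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print one status line per prerequisite from the list above.
check_prereqs() {
    echo "kernel: $(uname -r)"
    # Assumption: the OpenSSH server binary lives at /usr/sbin/sshd.
    if [ -x /usr/sbin/sshd ]; then
        echo "sshd: found"
    else
        echo "sshd: missing"
    fi
    if have_cmd java; then
        echo "java: found"
    else
        echo "java: missing"
    fi
}

check_prereqs
```

A "found" line for java only means some Java is installed; run java -version to confirm that it is Java 7.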

Let’s start,

Part 1 – Pre-requisites setup

Step1: Update the package index of the OS

sudo apt-get update

Step2: Install the SSH server if it is not already present

sudo apt-get install openssh-server

Step3: Install Java 7. Here we will use Oracle Java 7.

sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java7-installer

Part 2 – Installing Apache Hadoop 1.2.1

Step1: Download hadoop-1.2.1 from the Apache archive

wget https://archive.apache.org/dist/hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz

Step2: Extract the tar file

tar -xvzf hadoop-1.2.1.tar.gz

Step3: Rename the extracted folder.

mv hadoop-1.2.1 hadoop

Step4: Set up the environment variables for the Hadoop framework

vi .bashrc

# Add the following lines at the start of the file
export HADOOP_HOME=/home/hadoop/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export HADOOP_PREFIX=$HADOOP_HOME

Step5: Reload the shell so that the new environment variables take effect

exec bash
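
If you would rather script Step4 than edit the file by hand, here is one possible sketch. prepend_hadoop_env is a hypothetical helper name of my own, not a Hadoop tool. It writes the same three export lines at the start of the file you give it and does nothing if a HADOOP_HOME export is already there.

```shell
# Hypothetical helper: put the Hadoop exports at the top of an rc file, once.
# $1 = rc file (for example ~/.bashrc), $2 = Hadoop install directory
prepend_hadoop_env() {
    rcfile=$1
    hadoop_home=$2
    # Do nothing if a HADOOP_HOME export is already present.
    if grep -q '^export HADOOP_HOME=' "$rcfile" 2>/dev/null; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    {
        echo "export HADOOP_HOME=$hadoop_home"
        echo 'export PATH=$PATH:$HADOOP_HOME/bin'
        echo 'export HADOOP_PREFIX=$HADOOP_HOME'
        if [ -f "$rcfile" ]; then
            cat "$rcfile"
        fi
    } > "$tmp"
    # Copy back instead of moving so the rc file keeps its permissions.
    cat "$tmp" > "$rcfile" && rm -f "$tmp"
}

# Example (commented out so that pasting this block changes nothing):
# prepend_hadoop_env "$HOME/.bashrc" "$HOME/hadoop"
```

Passing "$HOME/hadoop" instead of a hard-coded /home/hadoop/hadoop keeps the setting correct when your user is not named hadoop.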

Step6: Tell Hadoop where Java is installed

vi hadoop/conf/hadoop-env.sh

export JAVA_HOME=/usr/lib/jvm/java-7-oracle
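
The path above is where the Oracle Java 7 package usually lands, but it may differ on your machine. As a rough sketch, the snippet below tries to work out a JAVA_HOME candidate from whatever java is on your PATH. java_home_from_bin is a hypothetical helper name, and readlink -f is assumed to be available (it is on GNU/Linux).

```shell
# Hypothetical helper: derive a JAVA_HOME candidate from the full path
# of a java launcher by stripping the trailing bin/java (or jre/bin/java).
java_home_from_bin() {
    case $1 in
        */jre/bin/java) echo "${1%/jre/bin/java}" ;;
        */bin/java)     echo "${1%/bin/java}" ;;
        *)              return 1 ;;
    esac
}

# Resolve the java on PATH through its symlinks and print the candidate.
if java_bin=$(command -v java); then
    java_home_from_bin "$(readlink -f "$java_bin")" ||
        echo "could not derive JAVA_HOME from $java_bin" >&2
fi
```

Treat the printed directory as a suggestion and check that it contains bin/java before putting it in hadoop-env.sh.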

Step7: Set up core-site.xml for the NameNode and the storage directory

vi hadoop/conf/core-site.xml

<!-- Add the following property within the configuration tag -->

<property>
<name>fs.default.name</name>
<value>hdfs://hadoopvm:8020</value>
<description> Ensure you replace hadoopvm with the IP address or hostname of your machine </description>
</property>

<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hdfsdrive</value>
</property>
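
To make it clear where the two properties sit inside the configuration tag, here is a sketch that writes a complete core-site.xml in one go. write_core_site is a hypothetical helper name. Note that it overwrites the target file, so only use it on a fresh install or keep a backup.

```shell
# Hypothetical helper: write a complete core-site.xml.
# $1 = target file, $2 = NameNode host, $3 = value for hadoop.tmp.dir
write_core_site() {
    conf_file=$1
    host=$2
    tmp_dir=$3
    cat > "$conf_file" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://${host}:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>${tmp_dir}</value>
  </property>
</configuration>
EOF
}

# Example (commented out because it overwrites the file):
# write_core_site hadoop/conf/core-site.xml hadoopvm /home/hadoop/hdfsdrive
```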

Step8: Create the directory for HDFS

mkdir /home/hadoop/hdfsdrive

Step9: Set up hdfs-site.xml with a replication factor of 1

vi hadoop/conf/hdfs-site.xml

<!-- Add the following property within the configuration tag -->

<property>
<name>dfs.replication</name>
<value>1</value>
</property>

Step10: Set up mapred-site.xml for the JobTracker

vi hadoop/conf/mapred-site.xml

<!-- Add the following property within the configuration tag -->

<property>
<name>mapred.job.tracker</name>
<value>hadoopvm:8021</value>
</property>
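
With three XML files edited by hand, a typo in a hostname or port is easy to miss. The sketch below reads a property value back out of a config file so you can compare the files at a glance. get_prop is a hypothetical helper name, and it is a simple awk lookup rather than a real XML parser: it assumes each name and value tag is written on one line, as in the snippets above.

```shell
# Hypothetical helper: print the value of a property in a Hadoop XML file.
# $1 = config file, $2 = property name
get_prop() {
    awk -v name="$2" '
        # Remember that we have seen the wanted <name> element.
        index($0, "<name>" name "</name>") { found = 1 }
        # The first <value> at or after that point is the one we want.
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$1"
}

# Examples (commented out because they need the files from the steps above):
# get_prop hadoop/conf/core-site.xml fs.default.name
# get_prop hadoop/conf/mapred-site.xml mapred.job.tracker
```

Both values should name the same host, with the ports you configured in Step7 and Step10.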

Step11: Set up the masters file for the SecondaryNameNode service

vi hadoop/conf/masters

# Add the IP address or hostname of the machine that should run the SecondaryNameNode
hadoopvm

Step12: Set up the slaves file for the DataNode and TaskTracker

vi hadoop/conf/slaves

# Add the IP address or hostname of the machine that should run the DataNode and TaskTracker
hadoopvm
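
On a single node both files end up with the same single line, so Step11 and Step12 can be scripted together. This is just a sketch; write_host_files is a hypothetical helper name.

```shell
# Hypothetical helper: write the masters and slaves files for one host.
# $1 = Hadoop conf directory, $2 = hostname or IP address of this machine
write_host_files() {
    conf_dir=$1
    host=$2
    # masters: where the SecondaryNameNode is started
    printf '%s\n' "$host" > "$conf_dir/masters"
    # slaves: where a DataNode and a TaskTracker are started
    printf '%s\n' "$host" > "$conf_dir/slaves"
}

# Example (commented out because it overwrites both files):
# write_host_files hadoop/conf hadoopvm
```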

Step13: Format the NameNode to create the filesystem

hadoop namenode -format
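
Formatting is meant to be done once, because formatting again discards the existing filesystem metadata. As a guard, the sketch below checks whether a format has already happened. It rests on an assumption you should verify on your install: as far as I know, Hadoop 1.x keeps the NameNode metadata under dfs/name inside hadoop.tmp.dir by default and creates a current/VERSION file there when formatting. namenode_formatted is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the NameNode directory looks formatted.
# $1 = the directory configured as hadoop.tmp.dir
# Assumption: the default layout <hadoop.tmp.dir>/dfs/name/current/VERSION
namenode_formatted() {
    [ -f "$1/dfs/name/current/VERSION" ]
}

# Example (commented out because it needs a working Hadoop install):
# namenode_formatted /home/hadoop/hdfsdrive || hadoop namenode -format
```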

Step14: Set up passwordless SSH so that the services can start without prompting the user for a password.

ssh-keygen    (press Enter at every prompt until your keys are generated; there is no need to type anything)
ssh-copy-id -i .ssh/id_rsa.pub hadoop@hadoopvm    (assuming the hostname of the machine is hadoopvm)
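
To confirm that the key really was installed, you can check that the public key line appears in the authorized_keys file. On a single node, hadoop@hadoopvm is the same account you are logged in as, so both files are under your own .ssh folder. This is a minimal sketch; key_is_authorized is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the public key in $1 appears as a
# complete line in the authorized_keys file $2.
key_is_authorized() {
    pub_file=$1
    auth_file=$2
    [ -f "$pub_file" ] && [ -f "$auth_file" ] || return 1
    # -x matches whole lines only, -F treats the key as a plain string.
    grep -qxF -- "$(cat "$pub_file")" "$auth_file"
}

# Example (commented out because it reads your real key files):
# key_is_authorized "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh/authorized_keys" && echo "key installed"
```

A more direct test is ssh -o BatchMode=yes hadoop@hadoopvm true, which fails instead of asking for a password when key login is not working.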

Step15: Start the Hadoop services

start-all.sh

Step16: Check whether the services are running

jps
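
With the configuration from the steps above, jps should list five Hadoop daemons: NameNode, SecondaryNameNode, DataNode, JobTracker and TaskTracker. The sketch below reads jps output and prints the names of any that are missing. missing_daemons is a hypothetical helper name.

```shell
# Hypothetical helper: read jps output on stdin and print every expected
# Hadoop 1.x daemon that does not appear in it.
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode SecondaryNameNode DataNode JobTracker TaskTracker; do
        # jps prints "<pid> <name>" per process; the leading space keeps
        # "NameNode" from matching the "SecondaryNameNode" line.
        if ! printf '%s\n' "$jps_output" | grep -q " $daemon\$"; then
            echo "$daemon"
        fi
    done
}

# Example (commented out because it needs a JDK and a running cluster):
# jps | missing_daemons
```

No output means all five daemons are up. If a name is printed, the matching log file under hadoop/logs is the first place to look.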

Step17: Check whether the web UIs are reachable

In a browser, navigate to the following addresses

http://<IP_ADDRESS_MACHINE>:50070  (NameNode WebUI)
http://<IP_ADDRESS_MACHINE>:50090  (SecondaryNameNode WebUI)
http://<IP_ADDRESS_MACHINE>:50030  (JobTracker WebUI)
http://<IP_ADDRESS_MACHINE>:50060  (TaskTracker WebUI)
http://<IP_ADDRESS_MACHINE>:50075/browseDirectory.jsp?dir=/  (DataNode WebUI)
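
If the machine has no browser, the same check can be done from the command line. This is a sketch that assumes curl is installed. webui_urls and check_webuis are hypothetical helper names, and the ports are the ones listed above.

```shell
# Print "<url> <service>" pairs for the five web UIs listed above.
# $1 = IP address or hostname of the machine
webui_urls() {
    host=$1
    echo "http://$host:50070 NameNode"
    echo "http://$host:50090 SecondaryNameNode"
    echo "http://$host:50030 JobTracker"
    echo "http://$host:50060 TaskTracker"
    echo "http://$host:50075/browseDirectory.jsp?dir=/ DataNode"
}

# Hypothetical helper: report which UIs answer within 5 seconds.
check_webuis() {
    webui_urls "$1" | while read -r url name; do
        if curl -s -o /dev/null --max-time 5 "$url" </dev/null; then
            echo "$name: responding"
        else
            echo "$name: no response"
        fi
    done
}

# Example (commented out because it needs curl and a running cluster):
# check_webuis hadoopvm
```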

Your Hadoop cluster is now set up successfully in pseudo-distributed mode. I hope you liked this tutorial.

Prashant Nair

Bigdata Consultant | Author | Corporate Trainer | Technical Reviewer. Passionate about new trends and technologies. More Geeky. Contact me for training and consulting!
