Setting Up a Single-Node Gen1 Hadoop Cluster (hadoop-1.2.1)

In this blog, we will learn how to set up a single-node Gen1 cluster using Hadoop 1.2.1. Throughout this tutorial, I assume your present working directory is your home folder. The following are the prerequisites for installing Apache Hadoop:

  • Linux environment (any Linux distribution with kernel version 2.6.x or later)
  • SSH service
  • Java 7

Let’s start.

Part 1 – Pre-requisites setup

Step1: Update the OS

sudo apt-get update

Step2: Install SSH server if not present

sudo apt-get install openssh-server

Step3: Install Java 7. Here we will use Oracle Java 7.

sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java7-installer

Part 2 – Installing Apache Hadoop 1.2.1

Step1: Download hadoop-1.2.1 from
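The download link is missing above; Hadoop 1.2.1 releases are kept in the Apache release archive, so assuming that mirror is acceptable, the tarball can be fetched as follows:

```shell
# Fetch the Hadoop 1.2.1 release tarball from the Apache archive
# (URL assumed from the standard archive layout)
wget https://archive.apache.org/dist/hadoop/core/hadoop-1.2.1/hadoop-1.2.1.tar.gz
```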


Step2: Extract the tar file

tar -xvzf hadoop-1.2.1.tar.gz

Step3: Rename the extracted folder.

mv hadoop-1.2.1 hadoop

Step4: Set up environment variables in the system for the Hadoop framework

vi .bashrc

#Add the following lines at the start of the file
export HADOOP_HOME=/home/hadoop/hadoop
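The later steps invoke hadoop directly on the command line, so many setups also append the framework's bin directory to PATH. A sketch, assuming the HADOOP_HOME path above:

```shell
# Assumed additions to ~/.bashrc: HADOOP_HOME as above,
# plus its bin directory on PATH so `hadoop` resolves without a full path
export HADOOP_HOME=/home/hadoop/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
```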

Step5: Reload the environment variables in the current shell

exec bash

Step6: Tell Hadoop where Java is installed

vi hadoop/conf/

export JAVA_HOME=/usr/lib/jvm/java-7-oracle
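The filename is truncated above. In Hadoop 1.x the environment script under conf/ is conventionally named hadoop-env.sh (an assumption here, based on the standard 1.x layout); the export would go in that file:

```shell
# Assuming the standard Hadoop 1.x environment script name:
vi hadoop/conf/hadoop-env.sh

# then uncomment/set the JAVA_HOME line:
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
```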

Step7: Set up core-site.xml for the Namenode and storage directory

vi hadoop/conf/core-site.xml

<!-- Add the following property within the configuration tag -->

<description> Ensure you replace hadoopvm with the IP address or hostname of your machine </description>
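The property block itself is missing above. For a Hadoop 1.x single-node setup, the usual core-site.xml entries are fs.default.name pointing at the NameNode and hadoop.tmp.dir pointing at the storage directory created in Step8 — a sketch, assuming the hadoopvm hostname from the description and the conventional port 9000:

```xml
<!-- Assumed standard Hadoop 1.x core-site.xml properties -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://hadoopvm:9000</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hadoop/hdfsdrive</value>
</property>
```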


Step8: Create a directory for HDFS storage

mkdir /home/hadoop/hdfsdrive

Step9: Set up hdfs-site.xml with a replication factor of 1

vi hadoop/conf/hdfs-site.xml

<!-- Add the following property within the configuration tag -->
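The property itself is missing above; since the step calls for a replication factor of 1 (appropriate for a single node), the standard entry is:

```xml
<!-- Replication factor of 1 for a single-node cluster -->
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
```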


Step10: Set up mapred-site.xml for the JobTracker

vi hadoop/conf/mapred-site.xml

<!-- Add the following property within the configuration tag -->
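The property itself is missing above. In Hadoop 1.x the JobTracker address is set via mapred.job.tracker — a sketch, reusing the hadoopvm hostname from Step7 and assuming the conventional port 9001:

```xml
<!-- Assumed standard Hadoop 1.x JobTracker property; replace hadoopvm with your hostname -->
<property>
  <name>mapred.job.tracker</name>
  <value>hadoopvm:9001</value>
</property>
```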


Step11: Set up the masters file for the SecondaryNamenode service

vi hadoop/conf/masters

#Add the IP address or hostname of the machine that should run the SecondaryNamenode service

Step12: Set up the slaves file for the DataNode and TaskTracker

vi hadoop/conf/slaves

#Add the IP address or hostname of the machine that should run the DataNode and TaskTracker services
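On a single node, both files point at the same machine, so Steps 11 and 12 can also be done non-interactively — a sketch, reusing the hadoopvm hostname from Step7:

```shell
# Single-node setup: the same host runs SecondaryNamenode, DataNode, and TaskTracker
echo "hadoopvm" > hadoop/conf/masters
echo "hadoopvm" > hadoop/conf/slaves
```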

Step13: Format the NameNode to create the filesystem

hadoop namenode -format

Step14: Create a passwordless SSH setup so that the services can start without prompting the user for a password.

ssh-keygen ( Press Enter at every prompt until your keys are generated; no need to type any data. )
ssh-copy-id -i .ssh/ hadoop@hadoopvm (assuming the hostname of the machine is hadoopvm)
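The key path above is truncated; assuming the default RSA key location (~/.ssh/id_rsa.pub), the non-interactive equivalent looks like:

```shell
# Generate an RSA key pair with an empty passphrase (assumed default key path)
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
# Install the public key on the target host (hadoopvm assumed from Step7)
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoopvm
# Verify: this should log in without a password prompt
ssh hadoop@hadoopvm exit
```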

Step15: Start the Hadoop services
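The command is missing above; in Hadoop 1.x, all five daemons are started with the bundled start-all.sh script:

```shell
# Starts NameNode, SecondaryNameNode, DataNode, JobTracker, and TaskTracker
hadoop/bin/start-all.sh
```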

Step16: Check whether the services are running
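The command is missing above; jps (bundled with the JDK) lists the running Java processes. On a healthy single-node cluster you should see all five daemons:

```shell
jps
# Expected daemons (PIDs will differ):
#   NameNode
#   SecondaryNameNode
#   DataNode
#   JobTracker
#   TaskTracker
```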


Step17: Check whether the WebUIs are alive

In the browser, navigate to the following addresses:

http://<IP_ADDRESS_MACHINE>:50070  (NameNode WebUI)
http://<IP_ADDRESS_MACHINE>:50090  (SecondaryNameNode WebUI)
http://<IP_ADDRESS_MACHINE>:50030  (JobTracker WebUI)
http://<IP_ADDRESS_MACHINE>:50060  (TaskTracker WebUI)
http://<IP_ADDRESS_MACHINE>:50075/browseDirectory.jsp?dir=/  (DataNode WebUI)

Your Hadoop cluster is now successfully set up in pseudo-distributed mode. Hope you liked this tutorial.

Prashant Nair

Bigdata Consultant | Author | Corporate Trainer | Technical Reviewer. Passionate about new trends and technologies. More geeky. Contact me for training and consulting!
