Contribute a Limited Amount of Storage as a Slave (DataNode) to the Hadoop Cluster

Lalita Sharma
4 min read · Oct 31, 2020

→ HADOOP CLUSTER ←

As you know, to tackle the Big Data problem we use Hadoop. A Hadoop cluster is set up with one NameNode (the master), a client, and some DataNodes (the slaves). The CLIENT is the one who uploads and reads the data. It uploads files directly to the DataNodes after requesting their IP addresses from the master, and it can decide the number of replicas to be created. Either the master or a slave can also act as the CLIENT. While a file is being uploaded, all of its data is stored on the DataNodes' storage. Now, if we want to limit the storage contributed by a DataNode, how is that possible? ANSWER: the partition concept of the hard disk.
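As a side note, the number of replicas mentioned above can be chosen by the client at upload time. A hedged illustration with the Hadoop CLI (the file name and target path are placeholders):

# hadoop fs -D dfs.replication=1 -put myfile.txt /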

*******************************************************************

Let's discuss the problem statement:

🔷 In a Hadoop cluster, how can we contribute only a limited/specific amount of storage as a slave to the cluster?

✴️Hint: Linux partitions

*******************************************************************

Step-1: Launch an EC2 instance on AWS with 10 GiB of storage
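The instance can be launched from the AWS web console; if you prefer the AWS CLI, a rough sketch is shown below (the AMI ID, key pair, and security group are placeholders, and the root device name depends on the AMI):

# aws ec2 run-instances --image-id <ami-id> --instance-type t2.micro --key-name <key-name> --security-group-ids <sg-id> --count 1 --block-device-mappings "DeviceName=/dev/xvda,Ebs={VolumeSize=10}"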

Step-2: Check the storage of the root directory “/” using the command:

# df -h

>> Check the free memory (RAM) using the command:

# free -m

>> Drop the caches to free up more memory using the command:

# echo 3 > /proc/sys/vm/drop_caches
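It is common to flush dirty pages to disk first so that nothing is lost when the caches are dropped; an optional combined form of the same step is:

# sync; echo 3 > /proc/sys/vm/drop_caches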


Step-3: Attach an EBS volume of size 1 GiB to the EC2 instance launched above
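The volume can be created and attached from the AWS web console; with the AWS CLI the same step might look roughly like this (the availability zone, volume ID, and instance ID are placeholders, the volume must be in the same AZ as the instance, and a device attached as /dev/sdf typically shows up inside the instance as /dev/xvdf):

# aws ec2 create-volume --size 1 --availability-zone <az> --volume-type gp2

# aws ec2 attach-volume --volume-id <volume-id> --instance-id <instance-id> --device /dev/sdf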

>> Check the attached volume using the command:

# fdisk -l


Step-4: Create a partition on the attached 1 GiB EBS volume

Here, I have created the first partition of size +512M (512 MiB) using the command below; the interactive steps inside fdisk are sketched after it:

# fdisk /dev/xvdf
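Inside fdisk the partition is created interactively; the key sequence typically looks like this:

n          (create a new partition)
p          (primary partition type)
1          (partition number 1)
<Enter>    (accept the default first sector)
+512M      (last sector: make the partition 512 MiB)
w          (write the partition table and exit)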


Step-5: The first partition is created as /dev/xvdf1

Here, the first partition created above can be seen using the command:

# fdisk -l


Step-6: Format the partition /dev/xvdf1

Format the first partition created above using the command:

# mkfs.ext4 /dev/xvdf1
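To confirm that the filesystem was actually written, you can optionally inspect the partition (an extra check, not part of the original walk-through):

# blkid /dev/xvdf1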


Step-7: Create a new directory and mount the partition to it

Here, I have created a directory named /dn1 for mounting the storage, using the command:

# mkdir /dn1

Now, mount the partition on this directory using the command:

# mount /dev/xvdf1 /dn1
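Note that a plain mount does not survive a reboot; if you want the partition to be mounted on /dn1 automatically at boot, one common approach (an optional addition, not part of the original walk-through) is an /etc/fstab entry:

# echo "/dev/xvdf1 /dn1 ext4 defaults 0 0" >> /etc/fstab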


Step-8: Check the volume mounted on the /dn1 directory

After mounting the storage on the /dn1 folder, we can see its size as 488 MB using the command:

# df -h

Now, this /dn1 folder has a size of 488 MB (slightly less than the 512 MiB partition because the ext4 filesystem reserves some space for its own metadata).


Step-9: Configure the ‘hdfs-site.xml’ file of the DataNode

Here, while configuring the “hdfs-site.xml” file on the DataNode, you must take care with the directory name mentioned there: write the same directory on which you have mounted the volume, i.e. /dn1.
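The original post shows this configuration as a screenshot; a minimal sketch of what the DataNode’s hdfs-site.xml might contain is given below (this assumes the classic dfs.data.dir property of Hadoop 1.x; newer releases use dfs.datanode.data.dir):

<configuration>
    <property>
        <name>dfs.data.dir</name>
        <value>/dn1</value>
    </property>
</configuration>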


Step-10: Start the Hadoop daemon services on the NameNode and DataNode, and check that the DataNode is available to the Hadoop cluster
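The original post shows this step as a screenshot; the commands typically used (Hadoop 1.x style; newer versions use hdfs --daemon start datanode and hdfs dfsadmin -report) would be roughly:

>> On the NameNode:

# hadoop-daemon.sh start namenode

>> On the DataNode:

# hadoop-daemon.sh start datanode

>> Check the cluster report and the storage contributed by each DataNode:

# hadoop dfsadmin -report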


Here, one DataNode is connected to the Hadoop cluster and is providing 487.95 MB of storage, since we have now limited its storage.

Step-11: Stop the above EC2 instance using CLI

>> Stop the instance using the command:

# aws ec2 stop-instances --instance-ids <instance-id>
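You can optionally confirm that the instance has actually stopped (the instance ID is a placeholder):

# aws ec2 describe-instances --instance-ids <instance-id> --query "Reservations[].Instances[].State.Name"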


Thus, as you can see above, we have limited the storage that the DataNode contributes to the Hadoop cluster for storing the data or files uploaded by the Hadoop client.

Hence, Task Completed.

THANK YOU FOR READING ….(*~*)

For more content like this, don't forget to press the clap icon below and follow me on Medium!!
