Big Data Analytics:
Hadoop HDFS (Part II)
Dr. Shivangi Shukla
Assistant Professor
Computer Science and Engineering
IIIT Pune
Data Integrity
• Blocks of data retrieved from a DataNode can be corrupted
• this corruption can occur because of faults in a storage device, network faults, or buggy software
• To provide data integrity in HDFS,
• the client software implements checksum checking on the contents of files
• When a client creates an HDFS file,
• the client computes a checksum of each block of the file and stores these checksums in a separate hidden file in the same HDFS namespace
Data Integrity..
• When a client retrieves file contents,
• the client verifies that the data it received from each DataNode matches the checksum stored in the associated checksum file
• if not, the client can opt to retrieve that block from another DataNode that has a replica of that block
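The checksum mechanism is visible to applications through the Hadoop FileSystem Java API. Below is a minimal sketch, assuming a hypothetical file /user/demo/example.txt and that the default filesystem is configured to point at the HDFS cluster; it retrieves the checksum HDFS stores for the file and makes the on-read verification explicit.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileChecksum;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ChecksumDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/user/demo/example.txt");   // hypothetical path

        // Checksums are verified automatically on read; this call just makes it explicit.
        fs.setVerifyChecksum(true);

        // Ask HDFS for the checksum it keeps alongside the file's blocks.
        FileChecksum checksum = fs.getFileChecksum(file);
        System.out.println("Algorithm: " + checksum.getAlgorithmName());
        System.out.println("Checksum : " + checksum);
    }
}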
Metadata Disk Failure
• The FsImage and the EditLog are central data structures of HDFS
• corruption of these files can cause the HDFS instance to become non-functional
• The NameNode can be configured to maintain multiple copies of the FsImage and EditLog
• Any update to either the FsImage or the EditLog causes each copy of the FsImage and EditLog to be updated synchronously
• this synchronous updating of multiple copies of the FsImage and EditLog may degrade the rate of namespace transactions per second that the NameNode can support
Metadata Disk Failure..
• The degradation of namespace transactions per second is
acceptable
• because HDFS applications are very data intensive in
nature, but they are not metadata intensive
• When a NameNode restarts,
• it selects the latest consistent FsImage and EditLog to
use
• The NameNode machine is a single point of failure for the HDFS cluster
• if the NameNode machine fails, manual intervention is necessary
• in this classic design, automatic restart and failover of the NameNode software to another machine is not supported (later Hadoop releases add NameNode High Availability for this purpose)
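For reference, the redundant metadata copies mentioned above come from listing more than one storage directory for the NameNode. A minimal sketch, assuming the Hadoop 2.x property name dfs.namenode.name.dir (dfs.name.dir in Hadoop 1.x) and hypothetical local paths, reads that setting back through the Configuration API:

import org.apache.hadoop.conf.Configuration;

public class NameNodeDirsDemo {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.addResource("hdfs-site.xml");  // assumes hdfs-site.xml is on the classpath

        // dfs.namenode.name.dir may list several comma-separated directories;
        // the NameNode writes the FsImage and EditLog to every directory in the
        // list synchronously, which is what provides the redundancy.
        String dirs = conf.get("dfs.namenode.name.dir",
                "/disk1/hdfs/name,/disk2/hdfs/name");  // hypothetical fallback value
        for (String d : dirs.split(",")) {
            System.out.println("Metadata copy stored in: " + d.trim());
        }
    }
}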
Snapshots
• Snapshots support storing a copy of data at a particular
instant of time
• One usage of the snapshot feature may be to roll back a
corrupted HDFS instance to a previously known good
point in time.
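A hedged sketch of the snapshot API follows, assuming a hypothetical directory /user/demo/data that an administrator has already made snapshottable (hdfs dfsadmin -allowSnapshot /user/demo/data) and a hypothetical file name inside it:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SnapshotDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path dir = new Path("/user/demo/data");  // hypothetical, must be snapshottable

        // Record the state of the directory at this instant of time.
        Path snapshot = fs.createSnapshot(dir, "before-cleanup");
        System.out.println("Snapshot created at: " + snapshot);

        // Files can later be read back (or copied out) from the read-only
        // .snapshot subdirectory to roll back to this known good point.
        Path old = new Path(dir, ".snapshot/before-cleanup/important.txt");  // hypothetical file
        System.out.println("Old copy exists: " + fs.exists(old));
    }
}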
Data Organization: Blocks
• HDFS is designed to support very large files.
• Applications that are compatible with HDFS are those
that deal with large data sets.
• These applications write their data only once but
they read it one or more times and require these
reads to be satisfied at streaming speeds.
• HDFS supports write-once-read-many semantics on
files.
• A typical block size used by HDFS is 64 MB (128 MB is the default in Hadoop 2.x and later).
• Thus, an HDFS file is chopped up into block-sized chunks, and if possible, each chunk will reside on a different DataNode.
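The block layout of a stored file can be inspected from the client side. A minimal sketch, assuming a hypothetical file /user/demo/example.txt already stored in HDFS, lists each block and the DataNodes that hold its replicas:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLayoutDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path file = new Path("/user/demo/example.txt");  // hypothetical path

        FileStatus status = fs.getFileStatus(file);
        System.out.println("File length : " + status.getLen());
        System.out.println("Block size  : " + status.getBlockSize());

        // One BlockLocation per block; getHosts() names the DataNodes holding replicas.
        BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation b : blocks) {
            System.out.println("offset=" + b.getOffset() + " length=" + b.getLength()
                    + " hosts=" + String.join(",", b.getHosts()));
        }
    }
}

The same information is also available from the shell via hdfs fsck <path> -files -blocks -locations.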
Data Organization: Staging
• A client request to create a file does not reach the NameNode
immediately.
▪ Initially the HDFS client caches the file data into a temporary
local file. Application writes are transparently redirected to
this temporary local file.
▪ When the local file accumulates data worth over one HDFS
block size, the client contacts NameNode.
▪ NameNode inserts the file name into the file system
hierarchy and allocates a data block for it.
o NameNode responds to the client request with the
identity of the DataNode and the destination data block.
o Then the client flushes the block of data from the local
temporary file to the specified DataNode.
Data Organization: Staging..
• When a file is closed, the remaining un-flushed data in the
temporary local file is transferred to the DataNode.
• The client then tells the NameNode that the file is closed. At
this point, the NameNode commits the file creation
operation into a persistent store. If the NameNode dies
before the file is closed, the file is lost.
• The client writes to a temporary local file rather than directly to the remote file
• because network speed and congestion in the network impact throughput
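From the application's point of view this staging is transparent: the program simply writes to an output stream, and the HDFS client library does the buffering, NameNode interaction, and per-block flushing behind the scenes. A minimal sketch, with hypothetical local and HDFS paths:

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class StagedWriteDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path dst = new Path("/user/demo/example.txt");            // hypothetical HDFS path

        try (InputStream in = new BufferedInputStream(
                     new FileInputStream("/tmp/example.txt"));    // hypothetical local file
             FSDataOutputStream out = fs.create(dst)) {
            // Per the staging design described above, the client library buffers
            // writes and ships them to DataNodes block by block; close() transfers
            // the remaining data and commits the file at the NameNode.
            IOUtils.copyBytes(in, out, 4096, false);
        }
    }
}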
Data Organization: Replication
Pipelining
• When a client is writing data to an HDFS file, its data is first written to a local file
• Assume the HDFS has a replication factor of three for
any file
➢When the local file accumulates a full block of user
data, the client retrieves a list of DataNodes from
the NameNode
oThis list contains DataNodes that will host a
replica of that block
➢The client then flushes the data block to the first
DataNode
Data Organization: Replication
Pipelining..
➢The first DataNode starts receiving the data in small
portions (4 KB), writes each portion to its local
repository and transfers that portion to the second
DataNode in the list.
➢The second DataNode, in turn starts receiving each
portion of the data block, writes that portion to its
repository and then flushes that portion to the third
DataNode.
➢Finally, the third DataNode writes the data to its local
repository.
• A DataNode can be receiving data from the previous node in the pipeline and at the same time forwarding data to the next one in the pipeline. Thus, the data is pipelined from one DataNode to the next.
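The replication factor that determines the pipeline length can be chosen per file at creation time. A minimal sketch, assuming a hypothetical path and using the FileSystem.create overload that takes an explicit replication factor and block size:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicatedCreateDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path file = new Path("/user/demo/pipelined.txt");  // hypothetical path

        short replication = 3;                 // three replicas => pipeline of three DataNodes
        long blockSize = 128L * 1024 * 1024;   // 128 MB blocks
        int bufferSize = 4096;

        try (FSDataOutputStream out =
                     fs.create(file, true, bufferSize, replication, blockSize)) {
            out.writeBytes("data that will be replicated through the write pipeline\n");
        }

        // The replication factor of an existing file can also be changed later.
        fs.setReplication(file, (short) 2);
    }
}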
Space Reclamation: File Deletion
• When a file is deleted by a user or an application, it is not
immediately removed from HDFS
• Instead, HDFS first renames it to a file in
the /trash directory.
• The file can be restored quickly as long as it remains
in /trash
• A file remains in /trash for a configurable amount of
time
• After the expiry of its life in /trash, NameNode deletes
the file from the HDFS namespace
• The deletion of a file causes the blocks associated with
the file to be freed
Space Reclamation: File Deletion..
• It should be noted that there could be an
appreciable time delay between the time a file is
deleted by a user and the time of the
corresponding increase in free space in HDFS.
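A hedged sketch of programmatic deletion that honours the trash behaviour described above, assuming a hypothetical path and that fs.trash.interval is set to a non-zero value on the cluster:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.Trash;

public class TrashDeleteDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/user/demo/old-report.txt");  // hypothetical path

        // Moves the file into the trash directory instead of freeing its blocks
        // immediately; returns false if trash is disabled (fs.trash.interval = 0).
        boolean moved = Trash.moveToAppropriateTrash(fs, file, conf);
        System.out.println(moved ? "Moved to trash" : "Trash disabled; not moved");

        // fs.delete(file, false) deletes permanently, bypassing the trash;
        // the shell equivalent of a permanent delete is: hdfs dfs -rm -skipTrash <path>
    }
}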
Space Reclamation: File Un-Deletion
• User can undelete a file after deleting it as long as it remains
in the /trash directory.
• If a user wants to undelete a file that he/she has deleted, then the user can navigate to the /trash directory and retrieve the file.
• The /trash directory contains only the latest copy of
the file that was deleted.
• The /trash directory is just like any other directory
with one special feature: HDFS applies specified policies
to automatically delete files from this directory.
• The current default policy is to delete files
from /trash that are more than six hours old.
• In the future, this policy will be configurable through a
well-defined interface.
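Programmatically, undeleting amounts to moving the file back out of the trash directory. A minimal sketch, assuming the per-user trash layout /user/<user>/.Trash/Current used by current Hadoop releases and hypothetical paths:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class UndeleteDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());

        // Deleted files keep their original path underneath .Trash/Current.
        Path inTrash  = new Path("/user/demo/.Trash/Current/user/demo/old-report.txt");
        Path restored = new Path("/user/demo/old-report.txt");   // hypothetical paths

        if (fs.exists(inTrash)) {
            boolean ok = fs.rename(inTrash, restored);           // the actual undelete
            System.out.println(ok ? "Restored " + restored : "Restore failed");
        } else {
            System.out.println("File no longer in trash (already expunged?)");
        }
    }
}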
HDFS Read/Write Architecture
• HDFS follows a write-once-read-many philosophy.
• Users can’t edit files already stored in HDFS.
• However, users can append new data by re-opening the file.
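A minimal sketch of appending to an existing file through the Java API (hypothetical path; assumes the cluster permits appends):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path log = new Path("/user/demo/app.log");  // hypothetical existing file

        // Existing bytes cannot be edited in place; new data can only be
        // appended at the end of the file.
        try (FSDataOutputStream out = fs.append(log)) {
            out.writeBytes("new log line appended at " + System.currentTimeMillis() + "\n");
        }
    }
}

The shell equivalent is hdfs dfs -appendToFile <localsrc> <dst>.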
HDFS Write Architecture
• Assume an HDFS client wants to write a file named “example.txt” of size 248 MB
• Assume the block size is configured to 128 MB (the default)
• The client will divide the file “example.txt” into two blocks: one of 128 MB (Block A) and the other of 120 MB (Block B)
HDFS Write Architecture..
• The following protocol is followed whenever the data is
written into HDFS:
i. The client will reach out to the NameNode with a write request for the two blocks, say, Block A & Block B (the two blocks of file “example.txt”)
ii. The NameNode grants the client the write permission and provides the IP addresses of the DataNodes where the file blocks have to be copied eventually
❑this selection of DataNode IP addresses is based on availability, the replication factor and rack awareness
HDFS Write Architecture..
iii. Assume that the replication factor is set to the default, i.e. three. Therefore, for each block the NameNode will provide the client a list of three DataNode IP addresses. It should be noted that this list will be unique for each block.
iv. Assume that the NameNode provides following lists
of IP addresses to the client:
❑For Block A, list A = {IP of DataNode 1, IP of
DataNode 4, IP of DataNode 6}
❑For Block B, list B = {IP of DataNode 7, IP of
DataNode 9, IP of DataNode 3}
HDFS Write Architecture..
v. Each block will be copied to three different DataNodes to keep the replication factor consistent throughout the cluster.
vi. Now the whole data copy process will happen in
three stages:
A. Set up of Pipeline
B. Data streaming and replication
C. Shutdown of Pipeline (Acknowledgement
stage)
HDFS Write Architecture..
A. Set up of Pipeline
• Before writing the blocks, the client confirms whether the DataNodes present in each list of IPs are ready to receive the data.
▪ If ready, then the client creates a pipeline for each
of the blocks by connecting the individual
DataNodes in the respective list for that block.
▪ Suppose for Block A, the list of DataNodes provided
by the NameNode is as follows:
▪ For Block A, list A = {IP of DataNode 1, IP of
DataNode 4, IP of DataNode 6}
HDFS Write Architecture..
A. Set up of Pipeline
• For block A, the client has to perform the following steps to
create a pipeline:
1) The client chooses the first DataNode in the list (DataNode
IPs for Block A) which is DataNode 1 and establishes a
TCP/IP connection.
2) The client informs DataNode 1 to be ready to receive the block. It also provides the IPs of the next two DataNodes (4 and 6) to DataNode 1, where the block is supposed to be replicated.
3) DataNode 1 connects to DataNode 4. DataNode 1 informs DataNode 4 to be ready to receive the block and provides the IP of DataNode 6. Then, DataNode 4 informs DataNode 6 to be ready to receive the data.
HDFS Write Architecture..
A. Set up of Pipeline
4) Next, the acknowledgement of readiness follows the reverse sequence,
❑i.e. the acknowledgement flows from DataNode 6 to DataNode 4 and then to DataNode 1.
5) At last, DataNode 1 informs the client that all the DataNodes are ready and a pipeline is formed between the client, DataNode 1, DataNode 4 and DataNode 6.
Now the pipeline set-up is complete and the client will finally begin the data copy or streaming process.
HDFS Write Architecture..
Fig 6: Setting up of Write Pipeline in HDFS
HDFS Write Architecture..
B. Data Streaming and Replication
• Once the pipeline has been created, the client pushes
the data into the pipeline.
• Data is replicated based on the replication factor in HDFS.
• Therefore, as per our example, Block A will be stored on three DataNodes, as the assumed replication factor is three.
• The client will copy Block A to DataNode 1 only.
• The replication is then carried out by the DataNodes sequentially.
HDFS Write Architecture..
B. Data Streaming and Replication
The following steps are executed during replication:
i. Once the block has been written to DataNode 1
by the client, DataNode 1 connects to DataNode
4.
ii. Then, DataNode 1 pushes the block into the pipeline and the data is copied to DataNode 4.
iii. Again, DataNode 4 connects to DataNode 6 and
DataNode 6 copies the last replica of the block.
HDFS Write Architecture..
Fig 7: HDFS Write operation for Block A
HDFS Write Architecture..
C. Shutdown of Pipeline or Acknowledgement Stage
• Once the block has been copied into all three DataNodes, a series of acknowledgements takes place to assure the client and the NameNode that the data has been written successfully
• As per the previous example, the acknowledgement follows the reverse sequence, i.e. from DataNode 6 to DataNode 4 and then to DataNode 1
• DataNode 1 then pushes three acknowledgements (including its own) into the pipeline and sends them to the client
• The client informs the NameNode that the data has been written successfully; the NameNode updates its metadata
• Finally, the client closes the pipeline to end the TCP session
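All three stages (pipeline set-up, data streaming, and acknowledgement) are driven internally by the HDFS client library; an application only sees the stream calls. A hedged sketch mapping the stages onto the Java API, with a hypothetical path:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WriteStagesDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path file = new Path("/user/demo/example.txt");  // hypothetical path

        // Stage A (pipeline set-up) happens inside the client library:
        // the NameNode hands back DataNode lists as blocks are allocated.
        try (FSDataOutputStream out = fs.create(file)) {

            // Stage B (data streaming and replication): data written here is cut
            // into small packets and pushed down the DataNode pipeline.
            out.writeBytes("block data streamed through the pipeline\n");

            // hflush() forces the packets written so far out to the pipeline and
            // waits until the DataNodes in it have acknowledged them.
            out.hflush();

        }   // Stage C: close() waits for the final acknowledgements, then the
            // client reports the completed file to the NameNode.
    }
}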
HDFS Write Architecture..
Fig 8: Acknowledgment Stage for Block A in HDFS
HDFS Write Architecture..
• As per the previous example (“example.txt” split into two blocks, Block A and Block B),
• Block B will also be copied to the DataNodes in parallel with Block A
• The following points are to be noted here:
➢The client will copy Block A and Block B to their respective first DataNodes simultaneously.
➢Therefore, in our example, two pipelines will be formed, one for each block, and the whole process will happen in parallel across these two pipelines.
➢The client writes each block only to the first DataNode in its pipeline, and then the DataNodes replicate the block sequentially.
HDFS Write Architecture..
For Block A, list A = {DataNode 1, DataNode 4, DataNode 6}
For Block B, list B = {DataNode 7, DataNode 9, DataNode 3}
Fig 9: Multi-Block Write Operation for Block A and Block B in HDFS (the two write pipelines operate in parallel)
HDFS Read Architecture
• Assume the HDFS client now wants to read the file “example.txt” of size 248 MB
• Recall that the block size is configured to 128 MB (the default)
• The file is therefore stored in HDFS as two blocks: one of 128 MB (Block A) and the other of 120 MB (Block B)
HDFS Read Architecture..
The following steps are executed to read the file:
• The client will reach out to the NameNode asking for the block metadata of the file “example.txt”.
• The NameNode will return the list of DataNodes where each block (Block A and Block B) is stored.
• After that, the client will connect to the DataNodes where the blocks are stored.
• The client starts reading data in parallel from the DataNodes (Block A from DataNode 1 and Block B from DataNode 3).
• Once the client gets all the required file blocks, it will
combine these blocks to form a file.
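A minimal sketch of the read path through the Java API, assuming the hypothetical file /user/demo/example.txt written earlier; the client library locates the blocks via the NameNode and streams them from the DataNodes:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class ReadDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path file = new Path("/user/demo/example.txt");  // hypothetical path

        // open() asks the NameNode for the block locations; the returned stream
        // then reads each block from a nearby DataNode and presents the blocks
        // to the application as one continuous file.
        try (FSDataInputStream in = fs.open(file)) {
            IOUtils.copyBytes(in, System.out, 4096, false);
        }
    }
}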
HDFS Read Architecture..
• While serving a read request from the client, HDFS selects the replica that is closest to the client.
• This reduces the read latency and the bandwidth consumption.
• Therefore, the replica that resides on the same rack as the reader node is selected, if possible.
HDFS Commands: Hadoop Shell
Commands to Manage HDFS
Following are some frequently used HDFS
commands:
fsck HDFS command to check the health of the Hadoop
file system.
Command: hdfs fsck /
ls HDFS Command to display the list of Files and Directories in HDFS
Command: hdfs dfs -ls /
mkdir HDFS Command to create the directory in HDFS.
Command: hdfs dfs -mkdir /directory_name
HDFS Commands: Hadoop Shell
Commands to Manage HDFS
Following are some frequently used HDFS
commands:
touchz HDFS command to create a file in HDFS with file size 0 bytes.
Command: hdfs dfs -touchz /directory/filename
du HDFS Command to check file size
Command: hdfs dfs -du -s /directory/filename
text HDFS Command that takes a source file and outputs the file in text format
Command: hdfs dfs -text /directory/filename
HDFS Commands: Hadoop Shell
Commands to Manage HDFS
Following are some frequently used HDFS commands:
copyFromLocal HDFS command to copy the file from local file
system to HDFS
Command: hdfs dfs -copyFromLocal <localsrc> <hdfs destination>
copyToLocal HDFS Command to copy file from HDFS to local file system
Command: hdfs dfs -copyToLocal <hdfs source> <localdst>
HDFS Commands: Hadoop Shell
Commands to Manage HDFS
Following are some frequently used HDFS commands:
put HDFS command to copy single source or multiple sources from local file system to the destination file system.
Command: hdfs dfs -put <localsrc> <destination>
get HDFS Command to copy files from HDFS to local file system
Command: hdfs dfs -get <src> <localdst>
count HDFS Command to count number of directories, files, and bytes, under the paths that match the specified file pattern.
Command: hdfs dfs -count <path>
HDFS Commands: Hadoop Shell
Commands to Manage HDFS
• Following are some frequently used HDFS
commands
rm HDFS command to remove file from HDFS
Command: hdfs dfs -rm <path>
rm -r HDFS command to remove entire directory and all its content from HDFS
Command: hdfs dfs -rm -r <path>
cp HDFS command to copy files from source to destination.
Command: hdfs dfs -cp <src> <dest>
HDFS Commands: Hadoop Shell
Commands to Manage HDFS
• Following are some frequently used HDFS
commands
mv HDFS command to move files from source to destination
Command: hdfs dfs -mv <src> <dest>
expunge HDFS command that makes the trash empty
Command: hdfs dfs -expunge
rmdir HDFS command to remove directory
Command: hdfs dfs -rmdir <path>
HDFS Commands: Hadoop Shell
Commands to Manage HDFS
• Following are some frequently used HDFS
commands
usage HDFS command that returns the help for an
individual command.
Command: hdfs dfs -usage <command>
help HDFS Command that displays help for given
command or all commands if none is specified.
Command: hdfs dfs -help