Quick Apache Hadoop Admin Command Reference Examples
by KARTHIKEYAN SADHASIVAM on FEBRUARY 18, 2015
If you work with Hadoop, you'll find that there are several shell commands available to manage your Hadoop cluster.
This article provides a quick, handy reference to the Hadoop administration commands.
If you are new to big data, read the introduction to Hadoop article to understand the basics.
1. Hadoop Namenode Commands
| Command | Description |
|---|---|
| hadoop namenode -format | Format HDFS filesystem from Namenode |
| hadoop namenode -upgrade | Upgrade the NameNode |
| start-dfs.sh | Start HDFS Daemons |
| stop-dfs.sh | Stop HDFS Daemons |
| start-mapred.sh | Start MapReduce Daemons |
| stop-mapred.sh | Stop MapReduce Daemons |
| hadoop namenode -recover -force | Recover namenode metadata after a cluster failure (may lose data) |
2. Hadoop fsck Commands
| Command | Description |
|---|---|
| hadoop fsck / | Filesystem check on HDFS |
| hadoop fsck / -files | Display files during check |
| hadoop fsck / -files -blocks | Display files and blocks during check |
| hadoop fsck / -files -blocks -locations | Display files, blocks and their locations during check |
| hadoop fsck / -files -blocks -locations -racks | Display network topology for data-node locations |
| hadoop fsck -delete | Delete corrupted files |
| hadoop fsck -move | Move corrupted files to /lost+found directory |
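
In a monitoring script, the summary line at the end of the fsck report can be turned into an exit status. The following is a minimal sketch, assuming the report contains a line of the form "The filesystem under path '/' is HEALTHY" (or "is CORRUPT"). The helper name fsck_is_healthy is hypothetical and not part of Hadoop.

```shell
#!/bin/sh
# Hypothetical helper: reads an fsck report on stdin and succeeds only if
# the summary line says the filesystem is HEALTHY.
fsck_is_healthy() {
    grep -q "filesystem under path .* is HEALTHY"
}

# Usage on a live cluster (not executed here):
#   hadoop fsck / | fsck_is_healthy || echo "HDFS needs attention"
```

Because the helper only reads standard input, it can be tried against a saved report before it is wired to a real cluster.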
3. Hadoop Job Commands
| Command | Description |
|---|---|
| hadoop job -submit <jobfile> | Submit the job |
| hadoop job -status <jobid> | Print job status and completion percentage |
| hadoop job -list all | List all jobs |
| hadoop job -list-active-trackers | List all available TaskTrackers |
| hadoop job -set-priority <job-id> <priority> | Set priority for a job. Valid priorities: VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW |
| hadoop job -kill-task <task-id> | Kill a task |
| hadoop job -history | Display job history including job details, failed and killed jobs |
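
Since -set-priority accepts only the five values listed above, a wrapper script can reject anything else before calling Hadoop. This is a sketch only. The function names is_valid_priority and set_job_priority are hypothetical, and the wrapper assumes the hadoop command is on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: succeed only for the priorities listed in the table.
is_valid_priority() {
    case "$1" in
        VERY_HIGH|HIGH|NORMAL|LOW|VERY_LOW) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical wrapper: validate the priority, then call the real command.
set_job_priority() {
    job_id=$1
    priority=$2
    if ! is_valid_priority "$priority"; then
        echo "invalid priority: $priority" >&2
        return 1
    fi
    hadoop job -set-priority "$job_id" "$priority"
}

# Usage on a live cluster (not executed here):
#   set_job_priority <job-id> HIGH
```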
4. Hadoop dfsadmin Commands
| Command | Description |
|---|---|
| hadoop dfsadmin -report | Report filesystem info and statistics |
| hadoop dfsadmin -metasave file.txt | Save the namenode's primary data structures to file.txt |
| hadoop dfsadmin -setQuota 10 /quotatest | Set a name quota of 10 (files and directories) on /quotatest |
| hadoop dfsadmin -clrQuota /quotatest | Clear the name quota on /quotatest |
| hadoop dfsadmin -refreshNodes | Read hosts and exclude files to update datanodes that are allowed to connect to namenode. Mostly used to commission or decommission nodes |
| hadoop fs -count -q /mydir | Check quota space on directory /mydir |
| hadoop dfsadmin -setSpaceQuota 100M /mydir | Set a space quota of 100M on the HDFS directory /mydir |
| hadoop dfsadmin -clrSpaceQuota /mydir | Clear the space quota on an HDFS directory |
| hadoop dfsadmin -saveNamespace | Back up metadata (fsimage & edits). Put the cluster in safe mode before running this command. |
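
The output of hadoop fs -count -q is a single line of columns per path, which is easier to read once labelled. The sketch below assumes the column order QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME. The helper name describe_quota is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: label the quota columns of "hadoop fs -count -q".
# Assumed column order: name quota, remaining name quota, space quota,
# remaining space quota, dir count, file count, content size, path.
describe_quota() {
    while read -r quota rem_quota space_quota rem_space dirs files size path; do
        echo "$path: name quota $quota ($rem_quota left), space quota $space_quota ($rem_space left)"
    done
}

# Usage on a live cluster (not executed here):
#   hadoop fs -count -q /mydir | describe_quota
```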
5. Hadoop Safe Mode (Maintenance Mode) Commands
The following dfsadmin commands let the cluster enter or leave safe mode, which is also called maintenance mode. In this mode, the Namenode does not accept any changes to the namespace, and it does not replicate or delete blocks.
| Command | Description |
|---|---|
| hadoop dfsadmin -safemode enter | Enter safe mode |
| hadoop dfsadmin -safemode leave | Leave safe mode |
| hadoop dfsadmin -safemode get | Get the current safe mode status |
| hadoop dfsadmin -safemode wait | Wait until safe mode ends, that is, until HDFS finishes data block replication |
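
Safe mode is typically entered around a metadata backup: enter safe mode, save the namespace, then leave safe mode. The following sketch chains those three dfsadmin calls, using the -saveNamespace spelling of the flag. The run helper and the DRY_RUN variable are hypothetical conventions added here so the sequence can be printed for review without touching a cluster.

```shell
#!/bin/sh
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Enter safe mode, save the namespace, then leave safe mode.
# The chain stops at the first failing step, so a failed save leaves
# the cluster in safe mode for inspection.
save_namespace() {
    run hadoop dfsadmin -safemode enter &&
    run hadoop dfsadmin -saveNamespace &&
    run hadoop dfsadmin -safemode leave
}

# Preview the sequence without running it:
#   DRY_RUN=1 save_namespace
```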
6. Hadoop Configuration Files
| File | Description |
|---|---|
| hadoop-env.sh | Sets ENV variables for Hadoop |
| core-site.xml | Parameters for entire Hadoop cluster |
| hdfs-site.xml | Parameters for HDFS and its clients |
| mapred-site.xml | Parameters for MapReduce and its clients |
| masters | Host machines for secondary Namenode |
| slaves | List of slave hosts |
7. Hadoop mradmin Commands
| Command | Description |
|---|---|
| hadoop mradmin -safemode get | Check Job tracker status |
| hadoop mradmin -refreshQueues | Reload mapreduce configuration |
| hadoop mradmin -refreshNodes | Reload active TaskTrackers |
| hadoop mradmin -refreshServiceAcl | Force Jobtracker to reload service ACL |
| hadoop mradmin -refreshUserToGroupsMappings | Force jobtracker to reload user group mappings |
8. Hadoop Balancer Commands
| Command | Description |
|---|---|
| start-balancer.sh | Balance the cluster |
| hadoop dfsadmin -setBalancerBandwidth <bandwidthinbytes> | Adjust bandwidth used by the balancer |
| hadoop balancer -threshold 20 | Balance until each datanode's disk usage is within 20 percentage points of the cluster average |
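
Because -setBalancerBandwidth takes its value in bytes, a small conversion avoids miscounting zeros. The sketch below converts megabytes to bytes and prints the resulting command instead of running it. Both function names are hypothetical, and 1 MB is taken as 1024 * 1024 bytes.

```shell
#!/bin/sh
# Hypothetical helper: convert megabytes to bytes (1 MB = 1024 * 1024 bytes).
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Hypothetical helper: print the dfsadmin command for a bandwidth given in
# megabytes, so it can be reviewed before it is run.
balancer_bandwidth_cmd() {
    echo "hadoop dfsadmin -setBalancerBandwidth $(mb_to_bytes "$1")"
}

# Example: print the command for a 10 MB bandwidth limit.
#   balancer_bandwidth_cmd 10
```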
9. Hadoop Filesystem Commands
| Command | Description |
|---|---|
| hadoop fs -mkdir mydir | Create a directory (mydir) in HDFS |
| hadoop fs -ls | List files and directories in HDFS |
| hadoop fs -cat myfile | View a file content |
| hadoop fs -du | Check disk space usage in HDFS |
| hadoop fs -expunge | Empty trash on HDFS |
| hadoop fs -chgrp hadoop file1 | Change group membership of a file |
| hadoop fs -chown huser file1 | Change file ownership |
| hadoop fs -rm file1 | Delete a file in HDFS |
| hadoop fs -touchz file2 | Create an empty file |
| hadoop fs -stat file1 | Check the status of a file |
| hadoop fs -test -e file1 | Check if file exists on HDFS |
| hadoop fs -test -z file1 | Check if file is empty on HDFS |
| hadoop fs -test -d file1 | Check if file1 is a directory on HDFS |
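
The -test options print nothing and report their result through the exit status (0 when the check holds), which makes them useful in scripts. The following is a minimal sketch that creates an HDFS directory only when it is missing. The function name ensure_hdfs_dir and the HADOOP_BIN variable are hypothetical. HADOOP_BIN defaults to hadoop and exists only so the command can be swapped out when trying the function without a cluster.

```shell
#!/bin/sh
# Hypothetical override point: which command to call as "hadoop".
HADOOP_BIN=${HADOOP_BIN:-hadoop}

# Hypothetical helper: create an HDFS directory only if it does not exist.
# Relies on "fs -test -d" exiting with 0 when the path is a directory.
ensure_hdfs_dir() {
    dir=$1
    if "$HADOOP_BIN" fs -test -d "$dir"; then
        echo "exists: $dir"
    else
        "$HADOOP_BIN" fs -mkdir "$dir" && echo "created: $dir"
    fi
}

# Usage on a live cluster (not executed here):
#   ensure_hdfs_dir mydir
```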
10. Additional Hadoop Filesystem Commands
| Command | Description |
|---|---|
| hadoop fs -copyFromLocal <source> <destination> | Copy from local filesystem to HDFS |
| hadoop fs -copyFromLocal file1 data | e.g.: Copies file1 from local FS to data dir in HDFS |
| hadoop fs -copyToLocal <source> <destination> | Copy from HDFS to local filesystem |
| hadoop fs -copyToLocal data/file1 /var/tmp | e.g.: Copies file1 from HDFS data directory to /var/tmp on local FS |
| hadoop fs -put <source> <destination> | Copy from remote location to HDFS |
| hadoop fs -get <source> <destination> | Copy from HDFS to remote directory |
| hadoop distcp hdfs://192.168.0.8:8020/input hdfs://192.168.0.8:8020/output | Copy data from one cluster to another using the cluster URL |
| hadoop fs -mv file:///data/datafile /user/hduser/data | Move data file from the local directory to HDFS |
| hadoop fs -setrep -w 3 file1 | Set the replication factor for file1 to 3 |
| hadoop fs -getmerge mydir bigfile | Merge files in mydir directory and download it as one big file |