
Hadoop fs Commands

Hadoop fs commands directly interact with the Hadoop Distributed File System (HDFS). These commands also work with the other file systems that Hadoop supports, such as the Local FS, HFTP FS, S3 FS, and others.

MoveToLocal
Usage: hadoop fs -moveToLocal [-crc] <src> <dst>
Intended to move a file from HDFS to the local file system.
Displays a “Not implemented yet” message.

MoveFromLocal
Usage: hadoop fs -moveFromLocal <localsrc> <dst>
Example:
hadoop fs -moveFromLocal products /user/hadoop/hadoopdemo
--The hadoop moveFromLocal command moves a file from the local file system to an HDFS directory.
Similar to the put command, except that the source localsrc (the original source file) is deleted after it's copied.

Mv
Usage: hadoop fs -mv URI [URI ...] <dest>
Example:
hadoop fs -mv /user/hadoop/SrcFile /user/hadoop/TgtFile
hadoop fs -mv /user/hadoop/file1 /user/hadoop/file2
hadoop fs -mv /user/hadoop/file1 /user/hadoop/file2 hdfs://namenodehost/user/hadoop/TgtDirectory
hadoop fs -mv hdfs://nn.example.com/file1 hdfs://nn.example.com/file2 hdfs://nn.example.com/file3 hdfs://nn.example.com/dir1
--Moves one or more files from source to destination. If you specify multiple sources, the destination must be a directory. Moving files across file systems is not permitted.

Put
Usage: hadoop fs -put <localsrc> ... <dst>
Example:
hadoop fs -put localfile /user/hadoop/hadoopfile
hadoop fs -put localfile1 localfile2 /user/hadoop/hadoopdir
hadoop fs -put localfile hdfs://nn.example.com/hadoop/hadoopfile
hadoop fs -put - hdfs://nn.example.com/hadoop/hadoopfile (Reads the input from stdin.)
Syntax 1: Copy a single file to HDFS
hadoop fs -put localfile /user/hadoop/hadoopdemo
Syntax 2: Copy multiple files to HDFS
hadoop fs -put localfile1 localfile2 /user/hadoop/hadoopdemo
Syntax 3: Read the input from stdin
hadoop fs -put - hdfs://namenodehost/user/hadoop/hadoopdemo
Copy single src, or multiple srcs from local file system to the destination file system. Also reads input from stdin and writes to destination file system.
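The stdin form is handy in shell pipelines. A minimal sketch (the target path is illustrative):
echo "hello hadoop" | hadoop fs -put - /user/hadoop/hadoopdemo/hello.txt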

rm
Usage: hadoop fs -rm [-f] [-r |-R] [-skipTrash] URI [URI ...]
Example:
hadoop fs -rm /user/hadoop/file
hadoop fs -rm hdfs://nn.example.com/file /user/hadoop/emptydir
--Deletes the specified files. A directory and its contents can only be deleted with the -r option.
Options:
The -f option will not display a diagnostic message or modify the exit status to reflect an error if the file does not exist.
The -R option deletes the directory and any content under it recursively.
The -r option is equivalent to -R.
The -skipTrash option will bypass trash, if enabled, and delete the specified file(s) immediately. This can be useful when it is necessary to delete files from an over-quota directory.
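For example, to delete a directory tree immediately, bypassing the trash (the path is illustrative):
hadoop fs -rm -r -skipTrash /user/hadoop/olddata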

rmr
Usage: hadoop fs -rmr [-skipTrash] URI [URI ...]
Example:
hadoop fs -rmr /user/hadoop/dir
--Recursively deletes files and subdirectories.
Note: This command is deprecated. Instead use hadoop fs -rm -r

rmdir
Usage: hadoop fs -rmdir [--ignore-fail-on-non-empty] URI [URI ...]
Example:
hadoop fs -rmdir /user/hadoop/dir
--Deletes the directory.
Options:
--ignore-fail-on-non-empty: When using wildcards, do not fail if a directory still contains files.

setfattr
Usage: hadoop fs -setfattr -n name [-v value] | -x name <path>
Examples:
hadoop fs -setfattr -n user.myAttr -v myValue /file
hadoop fs -setfattr -n user.noValue /file
hadoop fs -setfattr -x user.myAttr /file
--Sets an extended attribute name and value for a file or directory.
Options:
-n name: The extended attribute name.
-v value: The extended attribute value. There are three different encoding methods for the value. If the argument is enclosed in double quotes, then the value is the string inside the quotes. If the argument is prefixed with 0x or 0X, then it is taken as a hexadecimal number. If the argument begins with 0s or 0S, then it is taken as a base64 encoding.
-x name: Remove the extended attribute.
path: The file or directory.
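For example, the same value can be supplied in all three encodings; the attribute name user.myAttr and the value myValue below are illustrative:
hadoop fs -setfattr -n user.myAttr -v "myValue" /file
hadoop fs -setfattr -n user.myAttr -v 0x6d7956616c7565 /file
hadoop fs -setfattr -n user.myAttr -v 0sbXlWYWx1ZQ== /file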

setfacl
Usage: hadoop fs -setfacl [-R] [-b |-k -m |-x <acl_spec> <path>] |[--set <acl_spec> <path>]
Examples:
hadoop fs -setfacl -m user:hadoop:rw- /file
hadoop fs -setfacl -x user:hadoop /file
hadoop fs -setfacl -b /file
hadoop fs -setfacl -k /dir
hadoop fs -setfacl --set user::rw-,user:hadoop:rw-,group::r--,other::r-- /file
hadoop fs -setfacl -R -m user:hadoop:r-x /dir
hadoop fs -setfacl -m default:user:hadoop:r-x /dir
--Sets Access Control Lists (ACLs) of files and directories.
Options:
-b: Remove all but the base ACL entries. The entries for user, group and others are retained for compatibility with permission bits.
-k: Remove the default ACL.
-R: Apply operations to all files and directories recursively.
-m: Modify ACL. New entries are added to the ACL, and existing entries are retained.
-x: Remove specified ACL entries. Other ACL entries are retained.
--set: Fully replace the ACL, discarding all existing entries. The acl_spec must include entries for user, group, and others for compatibility with permission bits.
acl_spec: Comma separated list of ACL entries.
path: File or directory to modify.

setrep
Usage: hadoop fs -setrep [-R] [-w] <numReplicas> <path>
Example:
hadoop fs -setrep -w 3 /user/hadoop/dir1
--Changes the replication factor of a file. If path is a directory then the command recursively changes the replication factor of all files under the directory tree rooted at path.
Options:
The -w flag requests that the command wait for the replication to complete, which can potentially take a very long time.
The -R flag is accepted for backwards compatibility. It has no effect.

stat
Usage: hadoop fs -stat [format] <path> ...
Example:
hadoop fs -stat "%F %u:%g %b %y %n" /file
--Print statistics about the file/directory at <path> in the specified format.
Format accepts the following:
%b: file size in blocks
%F: type
%g: group name of owner
%n: name
%o: block size
%r: replication
%u: user name of owner
%y, %Y: modification date
%y shows UTC date as “yyyy-MM-dd HH:mm:ss” and %Y shows milliseconds since January 1, 1970 UTC. If the format is not specified, %y is used by default.

tail
Usage: hadoop fs -tail [-f] URI
Example:
hadoop fs -tail pathname
--Displays last kilobyte of the file to stdout.
Options:
The -f option will output appended data as the file grows, as in Unix.

test
Usage: hadoop fs -test -[defsz] URI
Example:
hadoop fs -test -e filename
Options:
-d: if the path is a directory, return 0.
-e: if the path exists, return 0.
-f: if the path is a file, return 0.
-s: if the path is not empty, return 0.
-z: if the file is zero length, return 0.
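Since test only sets the exit status, it is usually combined with shell logic; a minimal sketch (the path is illustrative):
hadoop fs -test -d /user/hadoop/dir1 && echo "dir1 is a directory"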

text
Usage: hadoop fs -text <src>
Takes a source file and outputs the file in text format. The allowed formats are zip and TextRecordInputStream.
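Example (assuming a SequenceFile at the path shown):
hadoop fs -text /user/hadoop/data.seq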

touchz
Usage: hadoop fs -touchz URI [URI ...]
Example:
hadoop fs -touchz pathname
--Create a file of zero length.

usage
Usage: hadoop fs -usage command
--Return the help for an individual command.
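Example (printing the usage line for the rm command):
hadoop fs -usage rm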

getmerge
Usage: hadoop fs -getmerge <src> <localdst> [addnl]
Example:
hadoop fs -getmerge <dir_of_input_files> <mergedsinglefile>
--Takes a source directory and a destination file as input and concatenates files in src into the destination local file. Optionally addnl can be set to enable adding a newline character at the end of each file.
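A concrete sketch, merging everything under a directory into a single local file with a newline after each part (paths are illustrative):
hadoop fs -getmerge /user/hadoop/output merged.txt addnl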

help
Usage: hadoop fs -help
--Return usage output.
hadoop fs -help [cmd]
(Displays help for the given command, or for all commands if none is specified.)

mkdir
Usage: hadoop fs -mkdir [-p] <paths>
Example:
 hadoop fs -mkdir /user/hadoop/dir1 /user/hadoop/dir2
 hadoop fs -mkdir hdfs://nn1.example.com/user/hadoop/dir
--Takes path uri’s as argument and creates directories.
Options:
The -p option behavior is much like Unix mkdir -p, creating parent directories along the path.
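Example (creating a nested path in one step; the path is illustrative):
hadoop fs -mkdir -p /user/hadoop/dir1/subdir1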

truncate
Usage: hadoop fs -truncate [-w] <length> <paths>
Example:
hadoop fs -truncate 55 /user/hadoop/file1 /user/hadoop/file2
hadoop fs -truncate -w 127 hdfs://nn1.example.com/user/hadoop/file1
--Truncate all files that match the specified file pattern to the specified length.
Options:
The -w flag requests that the command wait for block recovery to complete, if necessary. Without the -w flag, the file may remain unclosed for some time while the recovery is in progress; during this time the file cannot be reopened for append.

lsr
Usage: hadoop fs -lsr <args>
Example:
hadoop fs -lsr /user/hadoop/dir
Found 2 items
drwxr-xr-x   - hadoop hadoop  0 2015-12-10 09:47 /user/hadoop/dir/products
-rw-r--r--   2 hadoop hadoop    1971684 2015-12-10 09:47 /user/hadoop/dir/products/products.dat
--Recursive version of ls.
--The hadoop lsr command recursively displays the directories, subdirectories and files in the specified directory, as shown in the example above.
Note: This command is deprecated. Instead use hadoop fs -ls -R

ls
Usage: hadoop fs -ls [-d] [-h] [-R] [-t] [-S] [-r] [-u] <args>
Example: 
hadoop fs -ls /user/hadoop/dir1 /user/hadoop/dir2
hadoop fs -ls /user/hadoop/dir1/filename.txt
hadoop fs -ls hdfs://<hostname>:9000/user/hadoop/dir1/
hadoop fs -ls /user/hadoop/file1
Options:
-d: Directories are listed as plain files.
-h: Format file sizes in a human-readable fashion (e.g. 64.0m instead of 67108864).
-R: Recursively list subdirectories encountered.
-t: Sort output by modification time (most recent first).
-S: Sort output by file size.
-r: Reverse the sort order.
-u: Use access time rather than modification time for display and sorting.
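For instance, a recursive, human-readable listing sorted by modification time (the path is illustrative):
hadoop fs -ls -R -h -t /user/hadoop/dir1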

getfacl
Usage: hadoop fs -getfacl [-R] <path>
Examples:
hadoop fs -getfacl /file
hadoop fs -getfacl -R /dir
--Displays the Access Control Lists (ACLs) of files and directories. If a directory has a default ACL, then getfacl also displays the default ACL.
Options:
-R: List the ACLs of all files and directories recursively.
path: File or directory to list.

get
Usage: hadoop fs -get [-ignorecrc] [-crc] <src> <localdst>
Example:
hadoop fs -get /user/hadoop/file localfile
hadoop fs -get hdfs://nn.example.com/user/hadoop/file localfile
--Copy files to the local file system. Files that fail the CRC check may be copied with the -ignorecrc option. Files and CRCs may be copied using the -crc option.

find
Usage: hadoop fs -find <path> ... <expression> ...
Example:
hadoop fs -find / -name test -print
--Finds all files that match the specified expression and applies selected actions to them. If no path is specified then defaults to the current working directory. If no expression is specified then defaults to -print.

getfattr
Usage: hadoop fs -getfattr [-R] -n name | -d [-e en] <path>
Examples:
hadoop fs -getfattr -d /file
hadoop fs -getfattr -R -n user.myAttr /dir
--Displays the extended attribute names and values (if any) for a file or directory.
Options:
-d: Dump all extended attribute values associated with pathname.
-R: Recursively list the attributes for all files and directories.
-n name: Dump the named extended attribute value.
-e encoding: Encode values after retrieving them.
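For example, to dump all attributes with hex-encoded values (valid encodings include text, hex and base64; the path is illustrative):
hadoop fs -getfattr -d -e hex /file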

expunge
Usage: hadoop fs -expunge
--Empty the Trash

dus
Usage: hadoop fs -dus <args>
--Displays a summary of file lengths.
The dus command shows the amount of space, in bytes, used by the files that match the specified file pattern. It is equivalent to the Unix command "du -sb". The output is in the form: name(full path) size (in bytes).
Note: This command is deprecated. Instead use hadoop fs -du -s.

du
Usage: hadoop fs -du [-s] [-h] URI [URI ...]
Example:
hadoop fs -du /user/hadoop/dir1 /user/hadoop/file1 hdfs://nn.example.com/user/hadoop/dir1
--Displays the sizes of files and directories contained in the given directory, or the length of a file in case it's just a file.
The du command shows the amount of space, in bytes, used by the files that match the specified file pattern. It is equivalent to the Unix command "du -sb <path>/*" in case of a directory, and to "du -b <path>" in case of a file. The output is in the form: name(full path) size (in bytes).
Options:
The -s option will result in an aggregate summary of file lengths being displayed, rather than the individual files.
The -h option will format file sizes in a “human-readable” fashion (e.g. 64.0m instead of 67108864).
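For example, an aggregate, human-readable summary for a directory (the path is illustrative):
hadoop fs -du -s -h /user/hadoop/dir1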

df
Usage: hadoop fs -df [-h] URI [URI ...]
Example:
hadoop fs -df /user/hadoop/dir1
--Displays free space.
This command shows the capacity, free and used space of the filesystem. If the filesystem has multiple partitions, and no path to a particular partition is specified, then the status of the root partitions will be shown.
Options:
The -h option will format file sizes in a “human-readable” fashion (e.g. 64.0m instead of 67108864).

cp
Usage: hadoop fs -cp [-f] [-p | -p[topax]] URI [URI ...] <dest>
Example:
hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2
hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2 /user/hadoop/dir
--Copy files from source to destination. This command allows multiple sources as well in which case the destination must be a directory.

‘raw.*’ namespace extended attributes are preserved if (1) the source and destination filesystems support them (HDFS only), and (2) all source and destination pathnames are in the /.reserved/raw hierarchy. Determination of whether raw.* namespace xattrs are preserved is independent of the -p (preserve) flag.
Options:
The -f option will overwrite the destination if it already exists.
The -p option will preserve file attributes [topx] (timestamps, ownership, permission, ACL, XAttr). If -p is specified with no arg, then preserves timestamps, ownership, permission. If -pa is specified, then preserves permission also because ACL is a super-set of permission. Determination of whether raw namespace extended attributes are preserved is independent of the -p flag.
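For instance, to copy while preserving timestamps, ownership and permission, overwriting the destination if it exists (paths are illustrative):
hadoop fs -cp -f -p /user/hadoop/file1 /user/hadoop/file2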

count
Usage: hadoop fs -count [-q] [-h] [-v] <paths>
Example:
hadoop fs -count hdfs:/
hadoop fs -count hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2
hadoop fs -count -q hdfs://nn1.example.com/file1
hadoop fs -count -q -h hdfs://nn1.example.com/file1  (The -h option shows sizes in human readable format)
hdfs dfs -count -q -h -v hdfs://nn1.example.com/file1  (The -v option displays a header line)
--Count the number of directories, files and bytes under the paths that match the specified file pattern. The output columns with -count are: DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME
--The output columns with -count -q are: QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME

chown
Usage: hadoop fs -chown [-R] [OWNER][:[GROUP]] URI [URI ...]
Example:
sudo -u hdfs hadoop fs -chown root:root hadoop/chown.txt
--Changes the owner of files. The user must be a super-user. Use -chown to change the owner name and group name simultaneously.
Options:
The -R option will make the change recursively through the directory structure.

chmod
Usage: hadoop fs -chmod [-R] <MODE[,MODE]... | OCTALMODE> URI [URI ...]
--Changes the permissions of files. Only the owner of a file or the super-user is permitted to change the mode of a file.
Options:
The -R option will make the change recursively through the directory structure.
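Example (the modes and paths are illustrative):
hadoop fs -chmod 644 /user/hadoop/file1
hadoop fs -chmod -R 755 /user/hadoop/dir1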

CopyFromLocal
Usage: hadoop fs -copyFromLocal <localsrc> URI
Example:
hadoop fs -copyFromLocal <local_FS_filename> <target_on_HDFS>
hadoop fs -copyFromLocal /home/delu/ml-100k/u.data my.data

--Similar to the put command, except that the source is restricted to a local file reference.
Options:
The -f option will overwrite the destination if it already exists.

CopyToLocal
Usage: hadoop fs -copyToLocal [-ignorecrc] [-crc] URI <localdst>
Example:
hadoop fs -copyToLocal /user/hadoop/hadoopdemo/u.data u.data

--Similar to the get command, except that the destination is restricted to a local file reference.

chgrp

Usage: hadoop fs -chgrp [-R] GROUP URI [URI ...]
--Change group association of files. The user must be the owner of files, or else a super-user. Additional information is in the Permissions Guide.
Options
The -R option will make the change recursively through the directory structure.
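Example (the group name and path are illustrative):
hadoop fs -chgrp -R hadoop /user/hadoop/dir1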

checksum
Usage: hadoop fs -checksum URI
Example:
hadoop fs -checksum hdfs://nn1.example.com/file1
hadoop fs -checksum file:///etc/hosts
--Returns the checksum information of a file.

cat
Usage: hadoop fs -cat URI [URI ...]
Example:
hadoop fs -cat hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2
hadoop fs -cat file:///file3 /user/hadoop/file4
hadoop fs -cat /hdfs_dir/* >> /local_dir/localfile.txt
--The hadoop fs -cat command copies source paths to stdout, displaying the content of HDFS files on your console (command prompt).

AppendToFile
Usage: hadoop fs -appendToFile <localsrc> ... <dst>
Example:
hadoop fs -appendToFile localfile /user/hadoop/hadoopfile
hadoop fs -appendToFile localfile1 localfile2 /user/hadoop/hadoopfile
hadoop fs -appendToFile localfile hdfs://nn.example.com/hadoop/hadoopfile
hadoop fs -appendToFile - hdfs://nn.example.com/hadoop/hadoopfile (Reads the input from stdin.)
--The hadoop fs -appendToFile command appends a single source, or multiple sources, from the local file system to the destination file system. It also reads input from stdin (standard input) and appends it to the destination file system.
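The stdin form works in a pipeline too; a minimal sketch (the path is illustrative, and the target file system must support append):
echo "another line" | hadoop fs -appendToFile - /user/hadoop/hadoopfile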
