Hadoop – hdfs command to list files in HDFS directory as per timestamp

hadoop, hdfs

Is there an hdfs command to list the files in an HDFS directory sorted by timestamp, ascending or descending? By default, the hdfs dfs -ls command gives an unsorted list of files.

When I searched for answers, all I found was a workaround, i.e. hdfs dfs -ls /tmp | sort -k6,7. Is there a better way that is built into the hdfs dfs command line?

Best Answer

No, before Hadoop 2.7 there is no built-in option to sort the files by date and time.
If you are using a Hadoop version older than 2.7, you will have to pipe the listing through sort -k6,7, as you are doing:

hdfs dfs -ls /tmp | sort -k6,7
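
The workaround works because the date (YYYY-MM-DD) and time (HH:MM) are the sixth and seventh whitespace-separated columns of the listing, and both sort correctly as plain text. Below is a minimal sketch of the same idea wrapped in a function, assuming that column layout. The name sort_hdfs_listing is a hypothetical helper, not part of Hadoop, and the sample listing is made up for illustration. The function also drops the "Found N items" summary line that hdfs dfs -ls prints for a directory, so it does not end up mixed into the sorted output.

```shell
# Hypothetical helper: sort "hdfs dfs -ls" output by date and time.
# Assumes column 6 is the date (YYYY-MM-DD) and column 7 the time (HH:MM).
# Any extra arguments (for example -r for newest first) are passed to sort.
sort_hdfs_listing() {
  grep -v '^Found [0-9][0-9]* items$' | LC_ALL=C sort -k6,7 "$@"
}

# Intended use against a real cluster (not run here):
#   hdfs dfs -ls /tmp | sort_hdfs_listing        # oldest first
#   hdfs dfs -ls /tmp | sort_hdfs_listing -r     # newest first

# Demo on a made-up listing, so the sketch runs without a cluster.
sample_listing='Found 3 items
drwxr-xr-x   - hdfs supergroup          0 2016-03-02 10:15 /tmp/b
-rw-r--r--   3 hdfs supergroup       1024 2016-01-15 08:30 /tmp/a
-rw-r--r--   3 hdfs supergroup        512 2016-03-02 09:00 /tmp/c'

printf '%s\n' "$sample_listing" | sort_hdfs_listing
```

On the sample data this lists /tmp/a, then /tmp/c, then /tmp/b, i.e. oldest first. Passing -r reverses that order.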

For Hadoop 2.7.x, the ls command has the following options available:

Usage: hadoop fs -ls [-d] [-h] [-R] [-t] [-S] [-r] [-u] <args>

Options:
-d: Directories are listed as plain files.
-h: Format file sizes in a human-readable fashion (eg 64.0m instead of 67108864).
-R: Recursively list subdirectories encountered.
-t: Sort output by modification time (most recent first).
-S: Sort output by file size.
-r: Reverse the sort order.
-u: Use access time rather than modification time for display and sorting.

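So you can easily sort the files by modification time. The first command lists the newest files first; adding the optional -r reverses the order:

hdfs dfs -ls -t -R /tmp
hdfs dfs -ls -t -r -R /tmp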