Hadoop sort example fails with ‘not a SequenceFile’. How do I provide a SequenceFile as input?

hadoop

I'm trying to run bin/hadoop jar hadoop-examples-1.0.4.jar sort input output

but I get the error "java.io.IOException: hdfs://master:9000/usr/ubuntu/input/file1 not a SequenceFile"

If I run bin/hadoop jar hadoop-examples-1.0.4.jar wordcount input output, it works.

I can't figure out how to fix this.

Best Answer

The error message here is exactly right: the sort example expects a sequence file as input, that is, a flat file of binary key/value pairs, the kind that is often generated as output by MapReduce jobs.
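To see why a plain text file is rejected, it helps to know that a SequenceFile begins with the three magic bytes "SEQ" followed by a version byte. The following is a minimal sketch in plain Python, not Hadoop's own code; the function names are hypothetical and it only approximates the kind of header check a reader would perform.

```python
SEQ_MAGIC = b"SEQ"  # first three bytes of a Hadoop SequenceFile header


def has_sequence_file_magic(header: bytes) -> bool:
    """Return True if the given leading bytes start with the SequenceFile magic."""
    return header.startswith(SEQ_MAGIC)


def looks_like_sequence_file(path: str) -> bool:
    """Check a local file (hypothetical helper; does not read from HDFS)."""
    with open(path, "rb") as f:
        return has_sequence_file_magic(f.read(len(SEQ_MAGIC)))


# A text file such as the wordcount input starts with ordinary characters,
# so a check like this fails and the reader reports "not a SequenceFile".
print(has_sequence_file_magic(b"SEQ\x06"))         # illustrative header bytes
print(has_sequence_file_magic(b"hello world\n"))   # plain text
```

To try it on a file stored in HDFS, you would first need to copy the file to the local filesystem, since this sketch only opens local paths.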

The wordcount example, by contrast, does not expect a sequence file as input, merely a text file. It is read in with the key being the byte offset of each line within the file (not the line number) and the value being the line content.
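As a rough illustration of the records a text input yields, here is a small sketch in plain Python (not Hadoop code; `text_records` is a hypothetical name). It assumes UTF-8 text and pairs each line with the byte position at which it starts.

```python
def text_records(data: bytes):
    """Yield (byte_offset, line) pairs, one per line of the input."""
    offset = 0
    for raw_line in data.splitlines(keepends=True):
        # The key is where the line starts; the value is the line without its newline.
        yield offset, raw_line.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw_line)


for key, value in text_records(b"hello world\nfoo bar\n"):
    print(key, value)
# prints:
# 0 hello world
# 12 foo bar
```

The second key is 12 rather than 1 because the first line occupies 11 bytes plus one newline byte.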

Since your input files are plain text rather than sequence files, sort cannot run on them as they are.
