Problem description:

I have created a HAR file containing multiple small input files. For running a MapReduce job with a single input file, the command would be:

hadoop jar <jarname> <packagename.classname> <input> <output>

But if the above <input> is a HAR file, what should the command be so that all the contents of the HAR file are considered as input?

Answer:

If the input is a HAR file, then in place of the input path the following has to be given:

har:///<hdfs-path-to-har-file>

Since Hadoop archives are exposed as a filesystem, MapReduce is able to use all the files inside a Hadoop archive as input.
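As a sketch, assuming a hypothetical jar `wc.jar`, driver class `org.example.WordCount`, and HDFS paths under `/user/alice` (none of these names are from the original question), the end-to-end flow looks like:

```shell
# Create the archive from a directory of small input files
# (hypothetical paths; -p sets the parent directory of the sources)
hadoop archive -archiveName inputs.har -p /user/alice/small-files /user/alice

# Run the job, passing the archive via the har:/// filesystem scheme;
# MapReduce then treats every file inside the archive as job input
hadoop jar wc.jar org.example.WordCount har:///user/alice/inputs.har /user/alice/wc-out
```

The `har:///` URI with three slashes resolves the archive against the cluster's default filesystem; a fully qualified form such as `har://hdfs-namenode:port/...` is also accepted.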
