12.5.15


  • InputFormat
    • SequenceFileInputFormat
    • TextInputFormat
    • ObjectPositionInputFormat
    • FileInputFormat
  • ToolRunner
  • LocalJobRunner
  • job.setNumReduceTasks(8); job.setPartitionerClass(MyPartitioner.class);
  • Write a custom FileInputFormat subclass and override its isSplitable() method to always return false
  • MRUnit 
  • hadoop fs -setrep 4 f1
    • hadoop fs -Ddfs.replication=4 -cp f1 f1.tmp; hadoop fs -rm f1; hadoop fs -mv f1.tmp f1
  • How are keys and values presented and passed to the reduce() method during a standard shuffle and sort phase of MapReduce?
  • Does the MapReduce programming model provide a way for reduce tasks to communicate with each other?
  • Speculative Execution
  • WritableComparable
  • Workflow of Oozie
  • A single map task processes <>?
  • Hive table field delimiters
  • When testing a Reducer using MRUnit, you should pass the Reducer only a single key and its list of values. In this case, the withInput() method is called twice, but only the second call is actually used; it overrides the first. If you want to test the Reducer with two inputs, you have to write two tests.
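
The notes on setNumReduceTasks/setPartitionerClass and on how keys and values reach reduce() can be illustrated with a small plain-Java simulation. This is only a sketch under stated assumptions: ShuffleSortSketch, partitionFor and shuffle are hypothetical names, not Hadoop classes, and the sample keys are made up. The partition arithmetic mirrors what Hadoop's default HashPartitioner does; a custom partitioner such as MyPartitioner would replace that one method. Each reducer sees its keys in sorted order, with all values for a key grouped together. Real Hadoop does not guarantee the order of the values within a key, so the insertion order seen here should not be relied on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; none of these names are Hadoop classes.
class ShuffleSortSketch {

    // Same arithmetic as Hadoop's default HashPartitioner:
    // clear the sign bit, then take the remainder by the reducer count.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Routes each (key, value) pair to one partition and groups values by key.
    // TreeMap keeps the keys of each partition in sorted order.
    static List<TreeMap<String, List<Integer>>> shuffle(
            String[] keys, int[] values, int numReduceTasks) {
        List<TreeMap<String, List<Integer>>> partitions = new ArrayList<>();
        for (int i = 0; i < numReduceTasks; i++) {
            partitions.add(new TreeMap<>());
        }
        for (int i = 0; i < keys.length; i++) {
            int p = partitionFor(keys[i], numReduceTasks);
            partitions.get(p)
                      .computeIfAbsent(keys[i], k -> new ArrayList<>())
                      .add(values[i]);
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Hypothetical map output: (word, 1) pairs in emit order.
        String[] keys = {"pear", "apple", "pear", "fig"};
        int[] values = {1, 1, 1, 1};

        List<TreeMap<String, List<Integer>>> partitions = shuffle(keys, values, 2);
        for (int p = 0; p < partitions.size(); p++) {
            // Each map entry stands for one reduce(key, values) call.
            for (Map.Entry<String, List<Integer>> e : partitions.get(p).entrySet()) {
                System.out.println("reducer " + p + ": reduce("
                        + e.getKey() + ", " + e.getValue() + ")");
            }
        }
    }
}
```

Because every pair with the same key lands in the same partition, each reducer can work on its keys without needing anything from the other reducers.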
