12.5.15

Implicit Behaviors in Hadoop Jobs

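1. If no mapper is set, IdentityMapper is used by default; newer versions use the base Mapper class instead, whose default map() passes each record through unchanged (since version <version?>).
2. If the identity mapper is used and the job does not set the mapper output classes, the classes configured for the reducer output are reused for the mapper output. When the input key or value types differ from those classes, the map task fails with an IOException ("Type mismatch in key from map" or "Type mismatch in value from map") rather than a ClassCastException, because the collector compares the classes explicitly: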

MapTask.class (2.6.0)


    public synchronized void collect(K key, V value, final int partition
                                     ) throws IOException {
      reporter.progress();
      if (key.getClass() != keyClass) {
        throw new IOException("Type mismatch in key from map: expected "
                              + keyClass.getName() + ", received "
                              + key.getClass().getName());
      }
      if (value.getClass() != valClass) {
        throw new IOException("Type mismatch in value from map: expected "
                              + valClass.getName() + ", received "
                              + value.getClass().getName());
      }
      if (partition < 0 || partition >= partitions) {
        throw new IOException("Illegal partition for " + key + " (" +
            partition + ")");
      }
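
The fallback and the check above can be modelled without Hadoop on the classpath. This is only a rough sketch: the class MapOutputTypeCheckSketch and its methods are hypothetical names invented for illustration, and Long and String stand in for Writable types such as LongWritable and Text. In a real job, the usual remedy is to call setMapOutputKeyClass and setMapOutputValueClass on the job configuration whenever the mapper output types differ from the reducer output types.

```java
// Hadoop-free model of the behaviour described above. All names here are
// hypothetical; only the exact-class comparison mirrors the collect() excerpt.
final class MapOutputTypeCheckSketch {
    private final Class<?> outputKeyClass;   // class configured for the reducer (job) output key
    private Class<?> mapOutputKeyClass;      // stays null until set explicitly

    MapOutputTypeCheckSketch(Class<?> outputKeyClass) {
        this.outputKeyClass = outputKeyClass;
    }

    void setMapOutputKeyClass(Class<?> keyClass) {
        this.mapOutputKeyClass = keyClass;
    }

    // With no explicit mapper output class, the job output class is reused.
    Class<?> effectiveMapOutputKeyClass() {
        return mapOutputKeyClass != null ? mapOutputKeyClass : outputKeyClass;
    }

    // Exact-class comparison, as in the excerpt: a mismatch surfaces as an
    // IOException, not as a ClassCastException.
    void collect(Object key) throws java.io.IOException {
        Class<?> keyClass = effectiveMapOutputKeyClass();
        if (key.getClass() != keyClass) {
            throw new java.io.IOException("Type mismatch in key from map: expected "
                    + keyClass.getName() + ", received "
                    + key.getClass().getName());
        }
    }

    public static void main(String[] args) throws java.io.IOException {
        // The reducer emits String keys; the identity mapper passes Long keys through.
        MapOutputTypeCheckSketch job = new MapOutputTypeCheckSketch(String.class);
        try {
            job.collect(Long.valueOf(42L));
        } catch (java.io.IOException e) {
            System.out.println("Without map output class: " + e.getMessage());
        }

        // Declaring the mapper output class explicitly removes the mismatch.
        job.setMapOutputKeyClass(Long.class);
        job.collect(Long.valueOf(42L));
        System.out.println("With map output class set: key accepted");
    }
}
```

The same reasoning applies to the value class, which the excerpt checks in the same way.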



3. Default location of Hive databases and tables (under the default hive.metastore.warehouse.dir):

/user/hive/warehouse/<db>.db/<table>
