edu.yale.cs.hadoopdb.dataloader
Class GlobalHasher

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by edu.yale.cs.hadoopdb.exec.HDFSJobBase
          extended by edu.yale.cs.hadoopdb.dataloader.GlobalHasher
All Implemented Interfaces:
org.apache.hadoop.conf.Configurable, org.apache.hadoop.util.Tool

public class GlobalHasher
extends HDFSJobBase

Hash-partitions files stored in HDFS into a specified number of partitions. Each line of text is treated as one record whose fields are separated by a delimiter character supplied as a parameter. The field on which a record is hashed is identified by its zero-based index (0 = the first field in the record).
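
The following is a minimal, standalone sketch of the partitioning idea described above. It is not the GlobalHasher implementation: the class name HashPartitionSketch and the method partitionFor are hypothetical, and String.hashCode() stands in for the private hash(String) method, whose algorithm is not documented here. It runs without Hadoop.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of edu.yale.cs.hadoopdb.
public class HashPartitionSketch {

    /**
     * Returns the partition number (0 .. numPartitions - 1) for one record.
     *
     * @param line          one line of text, treated as a single record
     * @param delimiter     field delimiter, matched literally
     * @param hashFieldPos  zero-based index of the field to hash
     * @param numPartitions total number of partitions
     */
    static int partitionFor(String line, String delimiter,
                            int hashFieldPos, int numPartitions) {
        // Pattern.quote keeps characters such as '|' from being read as regex;
        // the -1 limit preserves trailing empty fields.
        String[] fields = line.split(Pattern.quote(delimiter), -1);
        if (hashFieldPos < 0 || hashFieldPos >= fields.length) {
            throw new IllegalArgumentException(
                "Record has no field at index " + hashFieldPos + ": " + line);
        }
        // Assumption: String.hashCode() replaces the undocumented hash function.
        // floorMod keeps the result non-negative for negative hash codes.
        return Math.floorMod(fields[hashFieldPos].hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        String[] records = { "42|alice|NY", "7|bob|CA", "42|carol|TX" };
        for (String record : records) {
            // Hash on field 0 and spread the records over 4 partitions.
            System.out.println(partitionFor(record, "|", 0, 4) + "\t" + record);
        }
    }
}
```

Records that share the same value in the hashed field always land in the same partition, which is the property hash partitioning relies on.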


Nested Class Summary
(package private) static class GlobalHasher.Map
           
static class GlobalHasher.Reduce
           
private static class GlobalHasher.UnsortableInt
           
 
Field Summary
static java.lang.String DELIMITER_PARAM
           
static java.lang.String HASH_FIELD_POS_PARAM
           
 
Constructor Summary
GlobalHasher()
           
 
Method Summary
protected  org.apache.hadoop.mapred.JobConf configureJob(java.lang.String... args)
          Override this method to set job-specific options
private static int hash(java.lang.String s)
           
static void main(java.lang.String[] args)
           
protected  int printUsage()
          Provide job-specific command-line help
 
Methods inherited from class edu.yale.cs.hadoopdb.exec.HDFSJobBase
printHDFSUsage, run
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
 

Field Detail

DELIMITER_PARAM

public static final java.lang.String DELIMITER_PARAM
See Also:
Constant Field Values

HASH_FIELD_POS_PARAM

public static final java.lang.String HASH_FIELD_POS_PARAM
See Also:
Constant Field Values

Constructor Detail

GlobalHasher

public GlobalHasher()

Method Detail

configureJob

protected org.apache.hadoop.mapred.JobConf configureJob(java.lang.String... args)
                                                 throws java.lang.Exception
Description copied from class: HDFSJobBase
Override this method to set job-specific options

Specified by:
configureJob in class HDFSJobBase
Throws:
java.lang.Exception

hash

private static int hash(java.lang.String s)

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Throws:
java.lang.Exception

printUsage

protected int printUsage()
Description copied from class: HDFSJobBase
Provide job-specific command-line help

Specified by:
printUsage in class HDFSJobBase