edu.yale.cs.hadoopdb.sms.connector
Class SMSInputFormat

java.lang.Object
  extended by org.apache.hadoop.mapred.FileInputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
      extended by edu.yale.cs.hadoopdb.sms.connector.SMSInputFormat
All Implemented Interfaces:
org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>, org.apache.hadoop.mapred.JobConfigurable

public class SMSInputFormat
extends org.apache.hadoop.mapred.FileInputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
implements org.apache.hadoop.mapred.JobConfigurable

SMSInputFormat extends FileInputFormat to expose the path information that Hive sets and that is needed to recreate Map operators.


Field Summary
protected  org.apache.hadoop.mapred.JobConf conf
           
static java.lang.String DB_QUERY_SCHEMA_PREFIX
           
static java.lang.String DB_SQL_QUERY_PREFIX
           
static org.apache.commons.logging.Log LOG
           
 java.util.Map<java.lang.String,SMSConfiguration> rel_DBConf
           
 java.lang.String relation
           
 
Constructor Summary
SMSInputFormat()
           
 
Method Summary
 void configure(org.apache.hadoop.mapred.JobConf conf)
           
 org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text> getRecordReader(org.apache.hadoop.mapred.InputSplit split, org.apache.hadoop.mapred.JobConf conf, org.apache.hadoop.mapred.Reporter reporter)
           
 org.apache.hadoop.mapred.InputSplit[] getSplits(org.apache.hadoop.mapred.JobConf job, int numSplits)
          Retrieves path information from the FileInputFormat superclass, then derives the relation from the path using the job configuration.
 
Methods inherited from class org.apache.hadoop.mapred.FileInputFormat
addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, isSplitable, listStatus, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

conf

protected org.apache.hadoop.mapred.JobConf conf

DB_QUERY_SCHEMA_PREFIX

public static final java.lang.String DB_QUERY_SCHEMA_PREFIX
See Also:
Constant Field Values

DB_SQL_QUERY_PREFIX

public static final java.lang.String DB_SQL_QUERY_PREFIX
See Also:
Constant Field Values

LOG

public static final org.apache.commons.logging.Log LOG

rel_DBConf

public java.util.Map<java.lang.String,SMSConfiguration> rel_DBConf

relation

public java.lang.String relation

Constructor Detail

SMSInputFormat

public SMSInputFormat()

Method Detail

configure

public void configure(org.apache.hadoop.mapred.JobConf conf)
Specified by:
configure in interface org.apache.hadoop.mapred.JobConfigurable

getRecordReader

public org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text> getRecordReader(org.apache.hadoop.mapred.InputSplit split,
                                                                                                                          org.apache.hadoop.mapred.JobConf conf,
                                                                                                                          org.apache.hadoop.mapred.Reporter reporter)
                                                                                                                   throws java.io.IOException
Specified by:
getRecordReader in interface org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
Specified by:
getRecordReader in class org.apache.hadoop.mapred.FileInputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
Throws:
java.io.IOException

getSplits

public org.apache.hadoop.mapred.InputSplit[] getSplits(org.apache.hadoop.mapred.JobConf job,
                                                       int numSplits)
                                                throws java.io.IOException
Retrieves path information from the FileInputFormat superclass, then derives the relation from the path using the job configuration. Obtains the chunk locations for that relation from the HadoopDB catalog and creates one split per chunk. Each split carries its path, chunk, and relation.
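
The flow described for getSplits can be illustrated with a minimal, Hadoop-free sketch. It is not the SMSInputFormat implementation: the class SplitSketch, the ChunkSplit type, the methods relationForPath and createSplits, and the two maps standing in for the job configuration and the HadoopDB catalog are all hypothetical names chosen for illustration. How the real class derives a relation from a path is not documented here, so a plain map lookup is assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SplitSketch {

    /** Hypothetical stand-in for a split: carries path, chunk, and relation. */
    static final class ChunkSplit {
        final String path;
        final String chunkId;
        final String relation;

        ChunkSplit(String path, String chunkId, String relation) {
            this.path = path;
            this.chunkId = chunkId;
            this.relation = relation;
        }

        @Override
        public String toString() {
            return relation + "[" + chunkId + "] @ " + path;
        }
    }

    /**
     * Assumption: the job settings map an input path directly to a relation name.
     * The real lookup in SMSInputFormat may work differently.
     */
    static String relationForPath(Map<String, String> jobSettings, String path) {
        String relation = jobSettings.get(path);
        if (relation == null) {
            throw new IllegalArgumentException("No relation registered for path " + path);
        }
        return relation;
    }

    /** Creates one split per catalog chunk of the relation behind each input path. */
    static List<ChunkSplit> createSplits(List<String> inputPaths,
                                         Map<String, String> jobSettings,
                                         Map<String, List<String>> catalog) {
        List<ChunkSplit> splits = new ArrayList<>();
        for (String path : inputPaths) {
            String relation = relationForPath(jobSettings, path);
            // Hypothetical catalog: relation name -> chunk identifiers.
            List<String> chunks = catalog.getOrDefault(relation, Collections.<String>emptyList());
            for (String chunk : chunks) {
                splits.add(new ChunkSplit(path, chunk, relation));
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        // Illustrative values only; the path and chunk names are made up.
        Map<String, String> jobSettings = new HashMap<>();
        jobSettings.put("/warehouse/orders", "orders");

        Map<String, List<String>> catalog = new HashMap<>();
        catalog.put("orders", Arrays.asList("chunk0", "chunk1"));

        for (ChunkSplit split : createSplits(
                Arrays.asList("/warehouse/orders"), jobSettings, catalog)) {
            System.out.println(split);
        }
    }
}
```

In this sketch the number of splits equals the number of chunks the catalog lists for the relation, which mirrors the "one split per chunk" behavior described above and is independent of the numSplits hint.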

Specified by:
getSplits in interface org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
Overrides:
getSplits in class org.apache.hadoop.mapred.FileInputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
Throws:
java.io.IOException