edu.yale.cs.hadoopdb.connector
Class DBInputSplit

java.lang.Object
  extended by edu.yale.cs.hadoopdb.connector.DBInputSplit
All Implemented Interfaces:
org.apache.hadoop.io.Writable, org.apache.hadoop.mapred.InputSplit
Direct Known Subclasses:
SMSInputSplit

public class DBInputSplit
extends java.lang.Object
implements org.apache.hadoop.mapred.InputSplit

DBInputSplit links each Map task to a DB chunk. Each split serializes its chunk's DB connection information so that, when the split is instantiated at a Map node, it can connect to its DB chunk.


Field Summary
protected  DBChunk chunk
           
protected  java.lang.String[] locations
           
static org.apache.commons.logging.Log LOG
           
protected  java.lang.String relation
           
 
Constructor Summary
DBInputSplit()
           
 
Method Summary
private  DBChunk deserializeChunk(java.io.DataInput in)
          Deserializes a DBChunk.
 DBChunk getChunk()
           
 long getLength()
          Currently always returns 1.
 java.lang.String[] getLocations()
          Returns locations (host addresses) of the chunk's hosts.
 java.lang.String getRelation()
           
 void readFields(java.io.DataInput in)
          Deserializes relation and DBChunk object.
private  void serializeChunk(DBChunk chunk, java.io.DataOutput out)
          Serializes a DBChunk.
 void setChunk(DBChunk chunk)
          Sets a DBChunk and updates split locations
private  void setLocations()
          Called by readFields or setChunk when a split is instantiated or created.
 void setRelation(java.lang.String relation)
           
 void write(java.io.DataOutput out)
          Serializes the relation and Chunk object.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

chunk

protected DBChunk chunk

locations

protected java.lang.String[] locations

LOG

public static final org.apache.commons.logging.Log LOG

relation

protected java.lang.String relation
Constructor Detail

DBInputSplit

public DBInputSplit()
Method Detail

deserializeChunk

private DBChunk deserializeChunk(java.io.DataInput in)
                          throws java.io.IOException
Deserializes a DBChunk.

Throws:
java.io.IOException

getChunk

public DBChunk getChunk()

getLength

public long getLength()
               throws java.io.IOException
Currently always returns 1. In the future it could return information about the size of the chunk (e.g. the number of rows).

Specified by:
getLength in interface org.apache.hadoop.mapred.InputSplit
Throws:
java.io.IOException

getLocations

public java.lang.String[] getLocations()
                                throws java.io.IOException
Returns locations (host addresses) of the chunk's hosts.

Specified by:
getLocations in interface org.apache.hadoop.mapred.InputSplit
Throws:
java.io.IOException

getRelation

public java.lang.String getRelation()

readFields

public void readFields(java.io.DataInput in)
                throws java.io.IOException
Deserializes the relation and the DBChunk object, then builds the list of locations from the DBChunk object.

Specified by:
readFields in interface org.apache.hadoop.io.Writable
Throws:
java.io.IOException

serializeChunk

private void serializeChunk(DBChunk chunk,
                            java.io.DataOutput out)
                     throws java.io.IOException
Serializes a DBChunk.

Throws:
java.io.IOException

setChunk

public void setChunk(DBChunk chunk)
Sets a DBChunk and updates split locations


setLocations

private void setLocations()
Called by readFields or setChunk when a split is instantiated or created. A chunk can be stored at one or more locations, so the locations array is populated with the different host locations of the split's chunk.
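
The population of the locations array might look roughly like the sketch below. It is not the HadoopDB implementation: LocationsSketch and locationsOf are hypothetical names, the chunk's hosts are assumed to be available as a list of address strings, and reading "different host locations" as "duplicates removed" is an interpretation.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch; not part of edu.yale.cs.hadoopdb.connector. */
class LocationsSketch {

    /**
     * Collects the distinct host addresses of a chunk, keeping first-seen order.
     * The List<String> parameter is an assumed stand-in for a DBChunk's hosts.
     */
    static String[] locationsOf(List<String> chunkHosts) {
        // LinkedHashSet drops repeated addresses while preserving insertion order.
        Set<String> distinct = new LinkedHashSet<>(chunkHosts);
        return distinct.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // The same host listed twice should appear once in the result.
        String[] locations = locationsOf(Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.1"));
        System.out.println(Arrays.toString(locations));
    }
}
```

Hadoop uses the array returned by getLocations as a scheduling hint, so listing every host that stores the chunk gives the scheduler more choices for placing the Map task near the data.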


setRelation

public void setRelation(java.lang.String relation)

write

public void write(java.io.DataOutput out)
           throws java.io.IOException
Serializes the relation and Chunk object.

Specified by:
write in interface org.apache.hadoop.io.Writable
Throws:
java.io.IOException
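
The write/readFields pair follows the usual Writable round trip: fields are written to a DataOutput in a fixed order and read back from a DataInput in the same order. The sketch below illustrates that pattern with java.io only. SplitSketch is a hypothetical stand-in for DBInputSplit, and the chunk contents (an id plus a list of host addresses) are assumptions, since the fields of DBChunk are not documented on this page.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for DBInputSplit; not the HadoopDB class. */
class SplitSketch {
    String relation;
    // Assumed chunk contents: an id plus the hosts that store the chunk.
    String chunkId;
    List<String> hosts = new ArrayList<>();

    /** Mirrors write(DataOutput): the relation first, then the chunk. */
    void write(DataOutput out) throws IOException {
        out.writeUTF(relation);
        out.writeUTF(chunkId);
        out.writeInt(hosts.size());
        for (String host : hosts) {
            out.writeUTF(host);
        }
    }

    /** Mirrors readFields(DataInput): reads the fields in the order written. */
    void readFields(DataInput in) throws IOException {
        relation = in.readUTF();
        chunkId = in.readUTF();
        int count = in.readInt();
        hosts = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            hosts.add(in.readUTF());
        }
    }

    /** Serializes a split to bytes and rebuilds it from those bytes. */
    static SplitSketch roundTrip(SplitSketch original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            original.write(new DataOutputStream(buffer));
            SplitSketch copy = new SplitSketch();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            // Not expected for in-memory streams; rethrown unchecked for brevity.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SplitSketch split = new SplitSketch();
        split.relation = "lineitem";
        split.chunkId = "chunk0";
        split.hosts.add("node1");
        split.hosts.add("node2");
        SplitSketch copy = roundTrip(split);
        System.out.println(copy.relation + " " + copy.chunkId + " " + copy.hosts);
    }
}
```

Because Hadoop creates splits through the no-argument constructor and then calls readFields, every field needed at the Map node has to travel through write; anything derived, such as the locations array, can be rebuilt after deserialization.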