org.apache.nutch.mapReduce
Class InputFormatBase

java.lang.Object
  extended by org.apache.nutch.mapReduce.InputFormatBase
All Implemented Interfaces:
InputFormat
Direct Known Subclasses:
SequenceFileInputFormat, TextInputFormat

public abstract class InputFormatBase
extends Object
implements InputFormat

An abstract base class for InputFormat implementations that read their input from files.


Constructor Summary
InputFormatBase()
           
 
Method Summary
abstract  String getName()
          The name of this input format.
abstract  RecordReader getRecordReader(NutchFileSystem fs, FileSplit split, JobConf job)
          Construct a RecordReader for a FileSplit.
 FileSplit[] getSplits(NutchFileSystem fs, JobConf job, int numSplits)
          Splits files returned by listFiles(NutchFileSystem, JobConf) when they're too big.
protected  File[] listFiles(NutchFileSystem fs, JobConf job)
          Subclasses may override to, e.g., select only files matching a regular expression.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

InputFormatBase

public InputFormatBase()
Method Detail

getName

public abstract String getName()
Description copied from interface: InputFormat
The name of this input format.

Specified by:
getName in interface InputFormat
See Also:
InputFormats

getRecordReader

public abstract RecordReader getRecordReader(NutchFileSystem fs,
                                             FileSplit split,
                                             JobConf job)
                                      throws IOException
Description copied from interface: InputFormat
Construct a RecordReader for a FileSplit.

Specified by:
getRecordReader in interface InputFormat
Parameters:
fs - the NutchFileSystem
split - the FileSplit
job - the job that this split belongs to
Returns:
a RecordReader
Throws:
IOException

listFiles

protected File[] listFiles(NutchFileSystem fs,
                           JobConf job)
                    throws IOException
Subclasses may override to, e.g., select only files matching a regular expression.

Throws:
IOException
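
The regular-expression filtering mentioned above can be sketched without any Nutch classes. The example below is only an illustration: RegexFileSelector and select are hypothetical names, not part of the Nutch API, and it assumes the match is made against the file name rather than the full path. A subclass overriding listFiles could apply a filter like this to the array returned by the superclass.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Nutch: keeps files whose names match a pattern. */
public class RegexFileSelector {

    /**
     * Returns the candidates whose file name (not full path) matches
     * the pattern in its entirety, preserving the input order.
     */
    public static File[] select(File[] candidates, Pattern namePattern) {
        List<File> kept = new ArrayList<>();
        for (File f : candidates) {
            // matches() requires the whole name to match, not just a substring.
            if (namePattern.matcher(f.getName()).matches()) {
                kept.add(f);
            }
        }
        return kept.toArray(new File[0]);
    }
}
```

For instance, calling select with the pattern part-\d+ would keep a file named part-00000 and drop one named _logs.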

getSplits

public FileSplit[] getSplits(NutchFileSystem fs,
                             JobConf job,
                             int numSplits)
                      throws IOException
Splits files returned by listFiles(NutchFileSystem, JobConf) when they're too big.

Specified by:
getSplits in interface InputFormat
Parameters:
fs - the filesystem containing the files to be split
job - the job whose input files are to be split
numSplits - the desired number of splits
Returns:
the splits
Throws:
IOException
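
This page does not say what counts as "too big" or how the cut points are chosen. The sketch below shows one plausible policy, under the assumption that the target split size is the total input size divided by numSplits, rounded up. SplitSketch and Range are hypothetical stand-ins (Range plays the role of FileSplit), and files are represented only by their lengths in bytes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not the Nutch implementation. */
public class SplitSketch {

    /** A byte range within one input file; a stand-in for FileSplit. */
    public static final class Range {
        public final int fileIndex;
        public final long start;
        public final long length;

        Range(int fileIndex, long start, long length) {
            this.fileIndex = fileIndex;
            this.start = start;
            this.length = length;
        }
    }

    /** Cuts each file into consecutive ranges no longer than the assumed target size. */
    public static List<Range> split(long[] fileLengths, int numSplits) {
        if (numSplits < 1) {
            throw new IllegalArgumentException("numSplits must be >= 1");
        }
        long total = 0;
        for (long len : fileLengths) {
            total += len;
        }
        // Assumed target: total bytes spread evenly over numSplits, rounded up, at least 1.
        long goal = Math.max(1, (total + numSplits - 1) / numSplits);

        List<Range> ranges = new ArrayList<>();
        for (int i = 0; i < fileLengths.length; i++) {
            long start = 0;
            long remaining = fileLengths[i];
            // Empty files contribute no ranges.
            while (remaining > 0) {
                long len = Math.min(goal, remaining);
                ranges.add(new Range(i, start, len));
                start += len;
                remaining -= len;
            }
        }
        return ranges;
    }
}
```

Because a range never spans two files in this sketch, the result can contain more than numSplits ranges, which is consistent with numSplits being described as the desired number rather than an exact count.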


Copyright © 2006 The Apache Software Foundation