An input data format. Input files are stored in a NutchFileSystem.
The processing of an input file may be split across multiple machines.
Files are processed as sequences of records, which are read through a RecordReader. Files must therefore be split on record boundaries.
Method Summary

String getName()
    The name of this input format.

RecordReader getRecordReader(NutchFileSystem fs, FileSplit split, JobConf job)
    Construct a RecordReader for a FileSplit.

FileSplit[] getSplits(NutchFileSystem fs, JobConf job, int numSplits)
    Splits a set of input files.
Method Detail
public String getName()
    The name of this input format.
    See Also:
        InputFormats
public FileSplit[] getSplits(NutchFileSystem fs,
JobConf job,
int numSplits)
throws IOException
    Parameters:
        fs - the filesystem containing the files to be split
        job - the job whose input files are to be split
        numSplits - the desired number of splits
    Throws:
        IOException
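
To make the role of numSplits concrete, the following is a minimal sketch of one way a file could be divided into contiguous byte ranges. It does not use the Nutch classes: SplitSketch, Range, and split are hypothetical names, with Range standing in for a FileSplit, and the even-chunk strategy is an assumption rather than the behavior of any particular InputFormat. The ranges here ignore record boundaries; a reader would have to align them when it consumes a split.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {
    /** Hypothetical stand-in for a FileSplit: a byte range within one file. */
    static final class Range {
        final String file;
        final long start;
        final long length;

        Range(String file, long start, long length) {
            this.file = file;
            this.start = start;
            this.length = length;
        }
    }

    /**
     * Divide a file of fileLength bytes into at most numSplits contiguous
     * ranges of roughly equal size. numSplits is treated as a hint.
     */
    static List<Range> split(String file, long fileLength, int numSplits) {
        List<Range> ranges = new ArrayList<>();
        if (fileLength <= 0 || numSplits <= 0) {
            return ranges; // nothing to split
        }
        long chunk = (fileLength + numSplits - 1) / numSplits; // ceiling division
        for (long start = 0; start < fileLength; start += chunk) {
            // The last range may be shorter than the others.
            ranges.add(new Range(file, start, Math.min(chunk, fileLength - start)));
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (Range r : split("part-00000", 1000, 3)) {
            System.out.println(r.file + " [" + r.start + ", " + (r.start + r.length) + ")");
        }
    }
}
```

Because the chunk size is rounded up, the sketch can return fewer ranges than requested for very small files, which is consistent with numSplits being described as the desired number of splits.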
public RecordReader getRecordReader(NutchFileSystem fs,
FileSplit split,
JobConf job)
throws IOException
    Construct a RecordReader for a FileSplit.
    Parameters:
        fs - the NutchFileSystem
        split - the FileSplit
        job - the job that this split belongs to
    Returns:
        a RecordReader
    Throws:
        IOException
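
The requirement that files be split on record boundaries can be illustrated with a small sketch of how a reader might consume one split of a newline-delimited file. This is an assumption-laden illustration, not the Nutch RecordReader: LineSplitReaderSketch and readRecords are hypothetical names, the data is held in memory instead of a NutchFileSystem, and the ownership rule (a record belongs to the split containing its first byte) is one common convention.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class LineSplitReaderSketch {
    /**
     * Return the newline-delimited records owned by the byte range
     * [start, start + length) of data. A record is owned by the split
     * in which its first byte falls, so every record is read exactly once
     * even when a split boundary lands in the middle of a record.
     */
    static List<String> readRecords(byte[] data, int start, int length) {
        int end = Math.min(data.length, start + length);
        int pos = start;
        if (start > 0) {
            // Scan from the byte before the split for the next newline;
            // the first record we own begins right after it.
            pos = start - 1;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++;
        }
        List<String> records = new ArrayList<>();
        while (pos < end) {
            // Read to the end of the record, even past the split end.
            int recEnd = pos;
            while (recEnd < data.length && data[recEnd] != '\n') {
                recEnd++;
            }
            records.add(new String(data, pos, recEnd - pos, StandardCharsets.UTF_8));
            pos = recEnd + 1;
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "aa\nbbbb\ncc\n".getBytes(StandardCharsets.UTF_8);
        // Three byte ranges that do not line up with record boundaries.
        System.out.println(readRecords(data, 0, 4));
        System.out.println(readRecords(data, 4, 4));
        System.out.println(readRecords(data, 8, 3));
    }
}
```

In this sketch a split whose range falls entirely inside one record yields no records, since that record is owned by the preceding split.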