Package org.apache.nutch.mapReduce

A system for scalable, fault-tolerant, distributed computation over large data collections.


Interface Summary
Configurable Something that may be configured.
InputFormat An input data format.
InterTrackerProtocol Protocol that a TaskTracker and the central JobTracker use to communicate.
JobSubmissionProtocol Protocol that a JobClient and the central JobTracker use to communicate.
MapOutputProtocol Protocol that a reduce task uses to retrieve output data from a map task's tracker.
Mapper Maps input key/value pairs to a set of intermediate key/value pairs.
MRConstants Some handy constants.
OutputCollector Passed to Mapper and Reducer implementations to collect output data.
OutputFormat An output data format.
Partitioner Partitions the key space.
RecordReader Reads key/value pairs from a FileSplit of an input file.
RecordWriter Writes key/value pairs to an output file.
Reducer Reduces a set of intermediate values which share a key to a smaller set of values.
RunningJob Includes details on a running MapReduce job.
TaskUmbilicalProtocol Protocol that a task child process uses to contact its parent process.

Class Summary
FileSplit A section of an input file.
InputFormatBase A base class for InputFormat.
InputFormats Repository of named InputFormats.
JobClient JobClient interacts with the JobTracker network interface.
JobConf A map/reduce job configuration.
JobProfile A JobProfile is a MapReduce primitive.
JobStatus Describes the current status of a job.
JobTracker JobTracker is the central location for submitting and tracking MR jobs in a network environment.
JobTrackerInfoServer JobTrackerInfoServer provides stats about the JobTracker via HTTP.
JobTrackerInfoServer.RedirectHandler  
MapOutputFile A local file to be transferred via the MapOutputProtocol.
MapOutputLocation The location of a map output file, as passed to a reduce task via the InterTrackerProtocol.
MapTask A Map task.
OutputFormats Repository of named OutputFormats.
ReduceTask A Reduce task.
SequenceFileInputFormat An InputFormat for sequence files.
SequenceFileOutputFormat  
Task Base class for tasks.
TaskStatus Describes the current status of a task.
TaskTracker TaskTracker is a process that starts and tracks MR Tasks in a networked environment.
TaskTracker.Child The main() for child processes.
TaskTrackerStatus A TaskTrackerStatus is a MapReduce primitive.
TextInputFormat An InputFormat for plain text files.
TextOutputFormat  

Package org.apache.nutch.mapReduce Description

A system for scalable, fault-tolerant, distributed computation over large data collections.

Applications implement the Mapper and Reducer interfaces. The implementations are submitted as a MapReduceJob and are applied to data stored in a NutchFileSystem.
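
The division of work between a Mapper and a Reducer can be illustrated with a minimal word-count sketch. The interfaces below are hypothetical, simplified stand-ins declared inside the example itself; they follow the one-line descriptions in the summary above but are not the actual signatures of this package. The run method is likewise an assumption: it groups intermediate values by key in memory, standing in for the work a JobTracker and its TaskTrackers would distribute across machines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordCountSketch {

    // Hypothetical stand-ins for the package's interfaces (assumed shapes,
    // not the real org.apache.nutch.mapReduce signatures).
    interface OutputCollector<K, V> {
        void collect(K key, V value);
    }

    interface Mapper<K1, V1, K2, V2> {
        void map(K1 key, V1 value, OutputCollector<K2, V2> output);
    }

    interface Reducer<K2, V2, K3, V3> {
        void reduce(K2 key, List<V2> values, OutputCollector<K3, V3> output);
    }

    // Maps one line of text to a (word, 1) pair per word.
    static class TokenMapper implements Mapper<Long, String, String, Integer> {
        public void map(Long offset, String line, OutputCollector<String, Integer> output) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    output.collect(word, 1);
                }
            }
        }
    }

    // Reduces all counts that share a word to a single total.
    static class SumReducer implements Reducer<String, Integer, String, Integer> {
        public void reduce(String word, List<Integer> counts, OutputCollector<String, Integer> output) {
            int sum = 0;
            for (int count : counts) {
                sum += count;
            }
            output.collect(word, sum);
        }
    }

    // Single-process driver: map every line, group values by key, then reduce.
    static Map<String, Integer> run(List<String> lines) {
        Mapper<Long, String, String, Integer> mapper = new TokenMapper();
        Reducer<String, Integer, String, Integer> reducer = new SumReducer();

        // Intermediate pairs, grouped by key and kept in sorted key order.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        long offset = 0;
        for (String line : lines) {
            mapper.map(offset, line,
                (k, v) -> grouped.computeIfAbsent(k, unused -> new ArrayList<>()).add(v));
            offset += line.length() + 1;
        }

        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            reducer.reduce(entry.getKey(), entry.getValue(), result::put);
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {a=2, b=2, c=1}
        System.out.println(run(Arrays.asList("a b a", "b c")));
    }
}
```

In a real job the grouping step would be spread over many machines, with a Partitioner deciding which reduce task receives each key; the sketch keeps everything in one process so that the map and reduce contracts are easy to see.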

See Google's original Map/Reduce paper for background information.



Copyright © 2006 The Apache Software Foundation