org.apache.nutch.segment
Class SegmentWriter

java.lang.Object
  extended by org.apache.nutch.segment.SegmentWriter

public class SegmentWriter
extends Object

This class holds together all data writers for a new segment. It also provides convenience methods for appending to the segment.

Author:
Andrzej Bialecki

Field Summary
 ArrayFile.Writer contentWriter
           
 ArrayFile.Writer fetcherWriter
           
static Logger LOG
           
 ArrayFile.Writer parseDataWriter
           
 ArrayFile.Writer parseTextWriter
           
 File segmentDir
           
 long size
           
 
Constructor Summary
SegmentWriter(File dir, boolean force)
           
SegmentWriter(File dir, boolean force, boolean isParsed)
           
SegmentWriter(NutchFileSystem nfs, File dir, boolean force)
           
SegmentWriter(NutchFileSystem nfs, File dir, boolean force, boolean isParsed)
           
SegmentWriter(NutchFileSystem nfs, File dir, boolean force, boolean isParsed, boolean withContent, boolean withParseText, boolean withParseData)
          Open a segment for writing.
 
Method Summary
 void append(FetcherOutput fo, Content co, ParseText pt, ParseData pd)
          Append new values to the output segment.
 void close()
          Close all writers.
static String getNewSegmentName()
          Create a new segment name.
static void main(String[] args)
           
 void setIndexInterval(int interval)
          Sets the index interval for all segment writers.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final Logger LOG

fetcherWriter

public ArrayFile.Writer fetcherWriter

contentWriter

public ArrayFile.Writer contentWriter

parseTextWriter

public ArrayFile.Writer parseTextWriter

parseDataWriter

public ArrayFile.Writer parseDataWriter

size

public long size

segmentDir

public File segmentDir

Constructor Detail

SegmentWriter

public SegmentWriter(File dir,
                     boolean force)
              throws Exception

SegmentWriter

public SegmentWriter(NutchFileSystem nfs,
                     File dir,
                     boolean force)
              throws Exception

SegmentWriter

public SegmentWriter(File dir,
                     boolean force,
                     boolean isParsed)
              throws Exception

SegmentWriter

public SegmentWriter(NutchFileSystem nfs,
                     File dir,
                     boolean force,
                     boolean isParsed)
              throws Exception

SegmentWriter

public SegmentWriter(NutchFileSystem nfs,
                     File dir,
                     boolean force,
                     boolean isParsed,
                     boolean withContent,
                     boolean withParseText,
                     boolean withParseData)
              throws Exception
Open a segment for writing. When a segment is opened, its data files are created.

Parameters:
nfs - NutchFileSystem to use
dir - directory to contain the segment data
force - if true and the segment directory already exists with content that is in the way, silently overwrite that content as needed. If false and that condition arises, throw an Exception. Note: with force=false, no Exception is thrown if the target directory already exists but contains only data that does not conflict with the segment data.
isParsed - if true, create a segment with parseData and parseText; otherwise create a segment without them, with the fetcher output located in the FetcherOutput.DIR_NAME_NP directory.
withContent - if true, write Content, otherwise ignore it
withParseText - if true, write ParseText; otherwise ignore it. NOTE: if isParsed is false, this is automatically set to false as well.
withParseData - if true, write ParseData; otherwise ignore it. NOTE: if isParsed is false, this is automatically set to false as well.
Throws:
Exception
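
The parameter rules above can be illustrated with a small, self-contained sketch. It does not use the Nutch classes: SegmentOpenSketch, plannedParts, conflicts and every part name in it are hypothetical stand-ins, and IOException is used here in place of the generic Exception declared by the constructor. It only models two documented behaviours: parse output is dropped when isParsed is false, and existing conflicting data is an error unless force is true.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the documented constructor rules; not Nutch code.
final class SegmentOpenSketch {

    // Part names are placeholders, not the real Nutch directory names.
    static List<String> plannedParts(boolean isParsed, boolean withContent,
                                     boolean withParseText, boolean withParseData) {
        List<String> parts = new ArrayList<>();
        parts.add(isParsed ? "fetcher-part" : "fetcher-part-unparsed");
        if (withContent) parts.add("content-part");
        // Documented rule: parse output is dropped when isParsed is false.
        if (isParsed && withParseText) parts.add("parse-text-part");
        if (isParsed && withParseData) parts.add("parse-data-part");
        return parts;
    }

    // Returns the planned parts that already exist under dir.
    // Throws when force is false and at least one part is in the way.
    static List<String> conflicts(Path dir, List<String> parts, boolean force)
            throws IOException {
        List<String> inTheWay = new ArrayList<>();
        for (String part : parts) {
            if (Files.exists(dir.resolve(part))) inTheWay.add(part);
        }
        if (!inTheWay.isEmpty() && !force) {
            throw new IOException("segment data already exists: " + inTheWay);
        }
        return inTheWay; // with force=true the caller would overwrite these
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("segment-sketch");
        Files.createFile(dir.resolve("unrelated.txt")); // present, but not in the way
        List<String> parts = plannedParts(false, true, true, true);
        System.out.println(parts);                        // [fetcher-part-unparsed, content-part]
        System.out.println(conflicts(dir, parts, false)); // []
    }
}
```

In this sketch an unrelated file in the target directory does not trigger an error with force=false, which mirrors the note on the force parameter.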

Method Detail

getNewSegmentName

public static String getNewSegmentName()
Create a new segment name.
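
This page does not say how the name is formed. As an illustration only, the sketch below assumes a timestamp-based name; the yyyyMMddHHmmss pattern, the class SegmentNameSketch and the method newSegmentName are assumptions made for the example, not documented behaviour of getNewSegmentName.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical sketch: derives a segment name from a timestamp.
// The pattern is an assumption, not taken from this page.
final class SegmentNameSketch {

    static String newSegmentName(Date now) {
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        return new SimpleDateFormat("yyyyMMddHHmmss", Locale.ROOT).format(now);
    }

    public static void main(String[] args) {
        System.out.println(newSegmentName(new Date()));
    }
}
```

A name built this way sorts chronologically as a plain string, which is one reason such a scheme might be chosen.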


setIndexInterval

public void setIndexInterval(int interval)
                      throws IOException
Sets the index interval for all segment writers.

Throws:
IOException

append

public void append(FetcherOutput fo,
                   Content co,
                   ParseText pt,
                   ParseData pd)
            throws IOException
Append new values to the output segment.

NOTE: if this segment writer has a data file open but the corresponding argument is null, an empty value is written instead.

Parameters:
fo - fetcher output, must not be null
co - content, may be null (but see the note above)
pt - parseText, may be null (but see the note above)
pd - parseData, may be null (but see the note above)
Throws:
IOException
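
The null-handling contract described above can be sketched without the Nutch classes. In the example below, AppendSketch is a hypothetical stand-in: strings replace FetcherOutput, Content, ParseText and ParseData, lists replace the ArrayFile.Writer fields, and a null list models a data file that was not opened.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for the documented append contract; not Nutch code.
final class AppendSketch {
    final List<String> fetcher = new ArrayList<>();
    final List<String> content;   // null models "data file not open"
    final List<String> parseText;
    final List<String> parseData;

    AppendSketch(boolean withContent, boolean withParseText, boolean withParseData) {
        content = withContent ? new ArrayList<>() : null;
        parseText = withParseText ? new ArrayList<>() : null;
        parseData = withParseData ? new ArrayList<>() : null;
    }

    void append(String fo, String co, String pt, String pd) {
        // Documented rule: the fetcher output must not be null.
        Objects.requireNonNull(fo, "fetcher output must not be null");
        fetcher.add(fo);
        writeOrEmpty(content, co);
        writeOrEmpty(parseText, pt);
        writeOrEmpty(parseData, pd);
    }

    private static void writeOrEmpty(List<String> writer, String value) {
        if (writer == null) return;              // part not open: argument is ignored
        writer.add(value == null ? "" : value);  // open part, null argument: empty value
    }

    public static void main(String[] args) {
        AppendSketch w = new AppendSketch(true, true, false);
        w.append("fo-1", "page body", null, "ignored");
        System.out.println(w.content.size() + " "
                + w.parseText.get(0).isEmpty() + " "
                + (w.parseData == null));
    }
}
```

Writing an empty value rather than skipping the entry keeps every open part at the same length as the fetcher output, so in this sketch an entry can be looked up by the same position in each part.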

close

public void close()
Close all writers.


main

public static void main(String[] args)


Copyright © 2006 The Apache Software Foundation