org.apache.nutch.tools
Class FetchListTool

java.lang.Object
  extended by org.apache.nutch.tools.FetchListTool

public class FetchListTool
extends Object

This class takes an IWebDBReader, computes a relevant subset, and then emits the subset.

Author:
Mike Cafarella

Nested Class Summary
static class FetchListTool.SortableScore
          SortableScore is just a WritableComparable Float!
 
Field Summary
static Logger LOG
           
 
Constructor Summary
FetchListTool(NutchFileSystem nfs, File dbDir, boolean refetchOnly, float cutoffScore, int seed)
          FetchListTool takes a page db, and emits a RECNO-based subset of it.
 
Method Summary
 void emitFetchList(File segmentDir, long topN, long curTime)
          Spit out the fetchlist to a BDB at the indicated filename.
 void emitMultipleLists(File dir, int numLists, long topN, long curTime)
          Spit out several fetchlists, so that we can fetch across several machines.
static void main(String[] argv)
          Generate a fetchlist from the pagedb and linkdb.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final Logger LOG

Constructor Detail

FetchListTool

public FetchListTool(NutchFileSystem nfs,
                     File dbDir,
                     boolean refetchOnly,
                     float cutoffScore,
                     int seed)
              throws IOException,
                     FileNotFoundException
FetchListTool takes a page db, and emits a RECNO-based subset of it.

Method Detail

emitMultipleLists

public void emitMultipleLists(File dir,
                              int numLists,
                              long topN,
                              long curTime)
                       throws IOException
Spit out several fetchlists, so that we can fetch across several machines.

Throws:
IOException
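
The idea of splitting one fetchlist into several, so that each machine fetches its own share, can be illustrated with a minimal sketch. This is not the Nutch implementation: the MultiListSketch class, the partition method, and the round-robin assignment are assumptions made for the example, and the lists are kept in memory rather than written under dir.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of org.apache.nutch.tools.
class MultiListSketch {

    // Distributes URLs over numLists lists in round-robin order.
    // Round-robin is an assumption; the real tool may partition differently.
    static List<List<String>> partition(List<String> urls, int numLists) {
        if (numLists <= 0) {
            throw new IllegalArgumentException("numLists must be positive");
        }
        List<List<String>> lists = new ArrayList<>();
        for (int i = 0; i < numLists; i++) {
            lists.add(new ArrayList<>());
        }
        for (int i = 0; i < urls.size(); i++) {
            // Entry i goes to list (i mod numLists).
            lists.get(i % numLists).add(urls.get(i));
        }
        return lists;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList(
                "http://a.example/", "http://b.example/", "http://c.example/",
                "http://d.example/", "http://e.example/");
        List<List<String>> lists = partition(urls, 2);
        for (int i = 0; i < lists.size(); i++) {
            System.out.println("list " + i + ": " + lists.get(i));
        }
    }
}
```

Round-robin keeps the list sizes within one entry of each other, which is the simplest way to balance work across machines. A real crawler might prefer to group URLs by host instead.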

emitFetchList

public void emitFetchList(File segmentDir,
                          long topN,
                          long curTime)
                   throws IOException
Spit out the fetchlist to a BDB at the indicated filename.

Throws:
IOException
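
As a rough illustration of what computing a fetchlist subset can involve, the following sketch filters hypothetical page records by a score cutoff and a due time, then keeps the topN highest-scoring ones. It is not the Nutch implementation: the FetchListSketch and Page classes, the select method, and the interpretation of cutoffScore, topN, and curTime are assumptions made for the example. It returns a list in memory rather than writing to segmentDir.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not part of org.apache.nutch.tools.
class FetchListSketch {

    // Hypothetical stand-in for a page db entry.
    static final class Page {
        final String url;
        final float score;
        final long nextFetchTime;

        Page(String url, float score, long nextFetchTime) {
            this.url = url;
            this.score = score;
            this.nextFetchTime = nextFetchTime;
        }
    }

    // Keeps pages whose score reaches the cutoff and whose fetch time is due,
    // then returns at most topN of them, highest score first.
    // These selection rules are assumptions, not documented Nutch behavior.
    static List<Page> select(List<Page> pages, float cutoffScore,
                             long topN, long curTime) {
        List<Page> candidates = new ArrayList<>();
        for (Page p : pages) {
            if (p.score >= cutoffScore && p.nextFetchTime <= curTime) {
                candidates.add(p);
            }
        }
        candidates.sort(Comparator.comparingDouble((Page p) -> p.score).reversed());
        int limit = (int) Math.max(0L, Math.min(topN, (long) candidates.size()));
        return new ArrayList<>(candidates.subList(0, limit));
    }

    public static void main(String[] args) {
        List<Page> pages = new ArrayList<>();
        pages.add(new Page("http://a.example/", 0.9f, 100L));
        pages.add(new Page("http://b.example/", 0.2f, 100L));
        pages.add(new Page("http://c.example/", 0.7f, 900L));
        for (Page p : select(pages, 0.3f, 10L, 500L)) {
            System.out.println(p.url + " " + p.score);
        }
    }
}
```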

main

public static void main(String[] argv)
                 throws IOException,
                        FileNotFoundException
Generate a fetchlist from the pagedb and linkdb.

Throws:
IOException
FileNotFoundException


Copyright © 2006 The Apache Software Foundation