java.lang.Object
  org.apache.nutch.tools.UpdateDatabaseTool
This class takes the output of the fetcher and updates the page and link DBs accordingly. Eventually, as the database scales, this will be broken into several phases, each consuming and emitting batch files, but, for now, we're doing it all here.
Field Summary
static boolean | IGNORE_INTERNAL_LINKS
static Logger | LOG
static int | MAX_OUTLINKS_PER_PAGE
static float | NEW_EXTERNAL_LINK_FACTOR
static float | NEW_INTERNAL_LINK_FACTOR
Constructor Summary
UpdateDatabaseTool(IWebDBWriter webdb, boolean additionsAllowed, int maxCount) | Take in the WebDBWriter, instantiated elsewhere.
Method Summary
void | close() | Shut everything down.
static void | main(String[] args) | Create the UpdateDatabaseTool, and pass in a WebDBWriter.
void | updateForSegment(NutchFileSystem nfs, String directory) | Iterate through items in the FetcherOutput.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Field Detail
public static final float NEW_INTERNAL_LINK_FACTOR
public static final float NEW_EXTERNAL_LINK_FACTOR
public static final int MAX_OUTLINKS_PER_PAGE
public static final boolean IGNORE_INTERNAL_LINKS
public static final Logger LOG
Constructor Detail
public UpdateDatabaseTool(IWebDBWriter webdb, boolean additionsAllowed, int maxCount)
Method Detail
public void updateForSegment(NutchFileSystem nfs, String directory) throws IOException
    Throws: IOException
public void close() throws IOException
    Throws: IOException
public static void main(String[] args) throws Exception
    Throws: Exception
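The method list above suggests a simple lifecycle: construct the tool around a writer, call updateForSegment once per fetched segment, then call close. The following is a minimal sketch of that calling pattern only, not the tool's actual implementation. It does not depend on Nutch: SegmentUpdater, RecordingUpdater, updateAll and the segment paths are hypothetical stand-ins introduced for illustration, and the stand-in updateForSegment omits the NutchFileSystem argument that the real method takes. The point it illustrates is that close() is placed in a finally block, since both documented instance methods can throw IOException.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UpdateLifecycleSketch {

    /** Hypothetical stand-in for the two instance methods documented above. */
    interface SegmentUpdater {
        void updateForSegment(String directory) throws IOException;
        void close() throws IOException;
    }

    /** Hypothetical implementation that only records the order of calls. */
    static class RecordingUpdater implements SegmentUpdater {
        final List<String> calls = new ArrayList<>();

        @Override
        public void updateForSegment(String directory) throws IOException {
            if (directory == null || directory.isEmpty()) {
                throw new IOException("no segment directory given");
            }
            calls.add("update:" + directory);
        }

        @Override
        public void close() {
            calls.add("close");
        }
    }

    /** Applies every segment in order and always closes the updater afterwards. */
    static void updateAll(SegmentUpdater updater, List<String> segmentDirs) throws IOException {
        try {
            for (String dir : segmentDirs) {
                updater.updateForSegment(dir);
            }
        } finally {
            // Runs even if a segment update throws, so the writer is not left open.
            updater.close();
        }
    }

    public static void main(String[] args) throws IOException {
        RecordingUpdater updater = new RecordingUpdater();
        // Hypothetical segment directory names.
        updateAll(updater, Arrays.asList("segments/a", "segments/b"));
        System.out.println(updater.calls);
        // prints [update:segments/a, update:segments/b, close]
    }
}
```

With the real class, the same shape would apply to an UpdateDatabaseTool built from an IWebDBWriter; how that writer and the NutchFileSystem are obtained is not covered on this page.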