java.lang.Object
  org.apache.nutch.db.WebDBInjector
This class takes a flat file of URLs and adds them as entries into a pagedb. Useful for bootstrapping the system.
Field Summary
static Logger | LOG
Constructor Summary
WebDBInjector(IWebDBWriter dbWriter) | WebDBInjector takes a reference to a WebDBWriter that it should add to.
Method Summary
boolean | addPage(String url) | Add one page to WebDB.
void | close() | Close dbWriter and save changes.
void | injectDmozFile(File dmozFile, int subsetDenom, boolean includeAdult, boolean includeDmozDesc, int skew, Pattern topicPattern) | Iterate through all the items in this structured DMOZ file.
void | injectURLFile(File urlList) | Iterate through all the items in this flat text file and add them to the db.
static void | main(String[] argv) | Command-line access.
void | printStatus() | Utility to present performance stats.
void | printStatusBar(int small, int big) | Utility to present a small status bar.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Field Detail
public static final Logger LOG
Constructor Detail
public WebDBInjector(IWebDBWriter dbWriter)
Method Detail
public void close() throws IOException
Throws: IOException
public void printStatusBar(int small, int big)
public void printStatus()
public void injectURLFile(File urlList) throws IOException
Throws: IOException
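The summary for injectURLFile says it iterates through the items in a flat text file and adds each to the db. A minimal sketch of that loop follows. It is not the Nutch implementation: FlatFileInjectSketch and injectAll are hypothetical names, the Predicate stands in for addPage(String), and both the skipping of blank lines and the reading of a true result as "accepted" are assumptions.

```java
import java.util.function.Predicate;

/** Illustrative only; not part of Nutch. */
final class FlatFileInjectSketch {

    /**
     * Offers every non-blank line to addPage and counts how many it accepts.
     * addPage is a hypothetical stand-in for WebDBInjector.addPage(String).
     */
    static int injectAll(Iterable<String> lines, Predicate<String> addPage) {
        int accepted = 0;
        for (String line : lines) {
            String url = line.trim();
            if (url.isEmpty()) {
                continue; // assumption: blank lines carry no URL
            }
            if (addPage.test(url)) {
                accepted++; // assumption: true means the URL was accepted
            }
        }
        return accepted;
    }
}
```

The lines could come from java.nio.file.Files.readAllLines(path) for a small seed list. A caller working with the real class would still need to invoke close() afterwards.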
public void injectDmozFile(File dmozFile, int subsetDenom, boolean includeAdult, boolean includeDmozDesc, int skew, Pattern topicPattern) throws IOException, SAXException, ParserConfigurationException
Throws: IOException, SAXException, ParserConfigurationException
public boolean addPage(String url) throws IOException
Add one page to WebDB. Changes are saved when close() is invoked. URLs are checked with the URLFilter, and only those that pass are added.
Parameters: url - URL to be added
Throws: IOException
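The addPage description combines two behaviours: a URL is added only if it passes the URLFilter, and changes are tied to close(). The sketch below illustrates that pattern with stand-in types. BufferedInjectorSketch is a hypothetical class, the Predicate stands in for URLFilter (whose real interface may differ), the List stands in for the db, and the meaning of the boolean result (true when the URL passed the filter) is an assumption, since the page does not document it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Illustrative only; not part of Nutch. */
final class BufferedInjectorSketch {
    private final Predicate<String> urlFilter;          // stand-in for URLFilter
    private final List<String> pending = new ArrayList<>();
    private final List<String> store;                   // stand-in for the db

    BufferedInjectorSketch(Predicate<String> urlFilter, List<String> store) {
        this.urlFilter = urlFilter;
        this.store = store;
    }

    /** Queues the URL if it passes the filter; returns whether it was queued. */
    boolean addPage(String url) {
        if (!urlFilter.test(url)) {
            return false; // rejected URLs never reach the store
        }
        pending.add(url);
        return true;
    }

    /** Writes all queued URLs to the store; nothing is visible before this. */
    void close() {
        store.addAll(pending);
        pending.clear();
    }
}
```

With a filter such as `u -> u.startsWith("http://")`, a call to addPage("ftp://example.org/") would be rejected, and an accepted URL would appear in the store only after close().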
public static void main(String[] argv) throws Exception
Throws: Exception