java.lang.Object
  org.apache.nutch.db.WebDBReader
WebDBReader implements all the read-only parts of accessing the web database. The write operations can be found in WebDBWriter.
Constructor Summary
WebDBReader(NutchFileSystem nfs, File dbDir) | Open a web db reader for the named directory.
Method Summary
void | close() | Shut down the reader.
Link[] | getLinks(MD5Hash md5) | Get all the links from the given MD5 hash.
Link[] | getLinks(UTF8 url) | Get all the hyperlinks that link TO the indicated URL.
Page | getPage(String url) | Get the Page from the pagedb with the given URL.
Page[] | getPages(MD5Hash md5) | Get Pages from the pagedb according to their content hash.
Enumeration | links() | Return all the links, by target URL.
static void | main(String[] argv) | WebDBReader.main() provides some handy utility methods for looking through the contents of the webdb.
long | numLinks() | Return the number of links in our db.
long | numPages() | Return the number of pages in our db.
boolean | pageExists(MD5Hash md5) | Test whether a certain piece of content is in the database, without returning the Page(s) themselves.
Enumeration | pages() | Iterate through all the Pages, sorted by URL.
Enumeration | pagesByMD5() | Iterate through all the Pages, sorted by MD5.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Constructor Detail
public WebDBReader(NutchFileSystem nfs, File dbDir) throws IOException, FileNotFoundException
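The constructor and close() both declare IOException, which suggests an open, read, close lifecycle with the close placed in a finally block. The following is a minimal sketch of that pattern, not Nutch code: ReadOnlyDb and StubDb are hypothetical stand-ins that mirror three documented method signatures so the example runs without the library. In a real program the stub would presumably be replaced by new WebDBReader(nfs, dbDir).

```java
import java.io.IOException;

// Sketch only: ReadOnlyDb and StubDb are hypothetical stand-ins, not Nutch classes.
class ReaderLifecycleSketch {

    /** Hypothetical subset of the documented read-only API. */
    interface ReadOnlyDb {
        long numPages();
        long numLinks();
        void close() throws IOException;
    }

    /** In-memory stub with fixed counts; records whether close() was called. */
    static final class StubDb implements ReadOnlyDb {
        boolean closed = false;
        public long numPages() { return 3L; }
        public long numLinks() { return 5L; }
        public void close() { closed = true; }
    }

    /** Reads both counters and always closes the reader, even if a read fails. */
    static String summarize(ReadOnlyDb db) throws IOException {
        try {
            return db.numPages() + " pages, " + db.numLinks() + " links";
        } finally {
            db.close();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(summarize(new StubDb())); // prints "3 pages, 5 links"
    }
}
```

Closing in finally keeps the underlying files from staying open when a read throws partway through.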
Method Detail
public void close() throws IOException
Specified by: close in interface IWebDBReader
Throws: IOException
public Page getPage(String url) throws IOException
Specified by: getPage in interface IWebDBReader
Throws: IOException
public Page[] getPages(MD5Hash md5) throws IOException
Specified by: getPages in interface IWebDBReader
Throws: IOException
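getPages(MD5Hash md5) and pageExists(MD5Hash md5) look pages up by content hash. As a rough illustration of what such a key is, the sketch below computes a standard MD5 digest of some content bytes with the JDK's MessageDigest and renders it as hex. It assumes MD5Hash wraps an ordinary 16-byte MD5 digest; how Nutch actually builds an MD5Hash is not shown on this page, and md5Hex is a hypothetical helper.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch only: md5Hex is a hypothetical helper, not part of the Nutch API.
class ContentHashSketch {

    /** Returns the MD5 digest of the given bytes as 32 lowercase hex characters. */
    static String md5Hex(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(content);
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // Mask to 0..255 so negative bytes format as two hex digits.
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on standard Java platforms.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] content = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(md5Hex(content)); // prints 5d41402abc4b2a76b9719d911017c592
    }
}
```

Because the key is derived from content rather than from the URL, several pages can share one hash, which is consistent with getPages returning an array.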
public boolean pageExists(MD5Hash md5) throws IOException
Specified by: pageExists in interface IWebDBReader
Throws: IOException
public Enumeration pages() throws IOException
Specified by: pages in interface IWebDBReader
Throws: IOException
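pages(), pagesByMD5() and links() return a raw java.util.Enumeration, so callers walk the results with hasMoreElements() and nextElement() and cast each element themselves. The sketch below shows that loop using only JDK classes; the strings stand in for Page objects, and drain is a hypothetical helper, not a Nutch method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Sketch only: drain is a hypothetical helper; strings stand in for Page objects.
class EnumerationWalkSketch {

    /** Copies every remaining element of an Enumeration into a list, in order. */
    static List<Object> drain(Enumeration<?> e) {
        List<Object> out = new ArrayList<>();
        while (e.hasMoreElements()) {
            out.add(e.nextElement());
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for the Enumeration a reader would return.
        Enumeration<String> urls =
            Collections.enumeration(Arrays.asList("http://a/", "http://b/"));
        for (Object o : drain(urls)) {
            String url = (String) o; // with the real API, the cast would be to Page
            System.out.println(url);
        }
    }
}
```

Draining into a list is only sensible for small result sets; for a large db it is likely better to process each element inside the while loop.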
public Enumeration pagesByMD5() throws IOException
Specified by: pagesByMD5 in interface IWebDBReader
Throws: IOException
public long numPages()
Specified by: numPages in interface IWebDBReader
public Link[] getLinks(UTF8 url) throws IOException
Specified by: getLinks in interface IWebDBReader
Throws: IOException
public Link[] getLinks(MD5Hash md5) throws IOException
Specified by: getLinks in interface IWebDBReader
Throws: IOException
public Enumeration links()
Specified by: links in interface IWebDBReader
public long numLinks()
Specified by: numLinks in interface IWebDBReader
public static void main(String[] argv) throws FileNotFoundException, IOException
Throws: FileNotFoundException, IOException