org.apache.nutch.db
Class DistributedWebDBWriter

java.lang.Object
  extended by org.apache.nutch.db.DistributedWebDBWriter
All Implemented Interfaces:
IWebDBWriter

public class DistributedWebDBWriter
extends Object
implements IWebDBWriter

This wrapper class lets us reorder write operations to the linkdb and pagedb. It is useful only for objects such as UpdateDatabaseTool, which only perform writes. The WebDBWriter is a traditional single-pass database writer. It does not cache any instructions to disk (though it does cache them in memory, possibly re-sorting them), and it does nothing in a distributed fashion. Other implementors of IWebDBWriter handle disk caching and distributed operation.
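
As a rough illustration of what "reordering write operations" can mean, the sketch below buffers write instructions in memory and sorts them by URL before handing them on in a single pass. It is not taken from the Nutch source: the class ReorderingWriterSketch, its Op enum, and its simplified PageInstruction are hypothetical stand-ins, and the real class may order and store instructions differently.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: none of these names or behaviors come from the Nutch API.
final class ReorderingWriterSketch {

    /** The kind of write being requested (assumed for illustration). */
    enum Op { ADD, DELETE }

    /** One buffered write instruction over a page, keyed by URL. */
    static final class PageInstruction {
        final Op op;
        final String url;

        PageInstruction(Op op, String url) {
            this.op = op;
            this.url = url;
        }

        @Override
        public String toString() {
            return op + " " + url;
        }
    }

    // Instructions are only recorded here; nothing is applied until close().
    private final List<PageInstruction> buffer = new ArrayList<>();

    void addPage(String url) {
        buffer.add(new PageInstruction(Op.ADD, url));
    }

    void deletePage(String url) {
        buffer.add(new PageInstruction(Op.DELETE, url));
    }

    /**
     * Returns the buffered instructions sorted by URL and empties the buffer.
     * List.sort is stable, so instructions for the same URL keep arrival order.
     */
    List<PageInstruction> close() {
        List<PageInstruction> ordered = new ArrayList<>(buffer);
        ordered.sort(Comparator.comparing((PageInstruction i) -> i.url));
        buffer.clear();
        return ordered;
    }

    public static void main(String[] args) {
        ReorderingWriterSketch writer = new ReorderingWriterSketch();
        writer.addPage("http://b.example/");
        writer.addPage("http://a.example/");
        writer.deletePage("http://b.example/");
        for (PageInstruction instruction : writer.close()) {
            System.out.println(instruction);
        }
    }
}
```

Sorting by key before applying lets a writer merge its instructions against a sorted database file in one sequential pass, which is one plausible reason a write-only tool would want reordering.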

Author:
Mike Cafarella

Nested Class Summary
static class DistributedWebDBWriter.LinkInstruction
          Holds an instruction over a Link.
static class DistributedWebDBWriter.LinkInstructionWriter
          LinkInstructionWriter very efficiently writes a LinkInstruction to an EditSectionGroupWriter.
static class DistributedWebDBWriter.PageInstruction
          PageInstruction holds an operation over a Page.
static class DistributedWebDBWriter.PageInstructionWriter
          PageInstructionWriter very efficiently writes a PageInstruction to an EditSectionGroupWriter.
 
Constructor Summary
DistributedWebDBWriter(NutchFileSystem nfs, File root, int machineNum)
          Open the db files.
 
Method Summary
 void addLink(Link lr)
          Add a link to the link database.
 void addPage(Page page)
          Add a page to the page database.
 void addPageIfNotPresent(Page page)
          Add the page only if the database does not already contain it; an existing entry is not replaced.
 void addPageIfNotPresent(Page page, Link link)
          Add the page only if the database does not already contain it; an existing entry is not replaced.
 void addPageWithScore(Page page)
          Add a page to the page database, with a brand-new score.
 void close()
          Shut down the writer.
static void createDB(NutchFileSystem nfs, File root, int totalMachines)
          Useful the first time a distributed db project is created.
 void deletePage(String url)
          Remove a page from the page database.
static void main(String[] argv)
          Provides some handy methods for testing the WebDB.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DistributedWebDBWriter

public DistributedWebDBWriter(NutchFileSystem nfs,
                              File root,
                              int machineNum)
                       throws IOException
Open the db files.

Method Detail

createDB

public static void createDB(NutchFileSystem nfs,
                            File root,
                            int totalMachines)
                     throws IOException
Useful the first time a distributed db project is created. It writes down the number of directories to expect.

Throws:
IOException
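
The on-disk layout that createDB produces is not described on this page. Purely as a sketch of the idea of "writing down" a count at creation time, the example below stores an integer in a small file under the root directory and reads it back. The class CreateDbSketch, the file name machinecount, and the use of java.nio.file instead of NutchFileSystem are all assumptions made for illustration.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: the file name and format are assumptions, not Nutch's layout.
final class CreateDbSketch {

    /** Assumed name of the file that records the machine count. */
    static final String COUNT_FILE = "machinecount";

    /** Creates the root directory if needed and records how many machines to expect. */
    static void createDB(Path root, int totalMachines) throws IOException {
        if (totalMachines < 1) {
            throw new IllegalArgumentException("totalMachines must be at least 1");
        }
        Files.createDirectories(root);
        try (DataOutputStream out =
                 new DataOutputStream(Files.newOutputStream(root.resolve(COUNT_FILE)))) {
            out.writeInt(totalMachines);
        }
    }

    /** Reads back the count written by createDB. */
    static int readMachineCount(Path root) throws IOException {
        try (DataInputStream in =
                 new DataInputStream(Files.newInputStream(root.resolve(COUNT_FILE)))) {
            return in.readInt();
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("webdb-sketch");
        createDB(root, 4);
        System.out.println("machines recorded: " + readMachineCount(root));
    }
}
```

Recording the count once, up front, lets every later writer or reader learn how many partitions exist without being told on each invocation.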

close

public void close()
           throws IOException
Shut down the writer.

Specified by:
close in interface IWebDBWriter
Throws:
IOException

addPage

public void addPage(Page page)
             throws IOException
Add a page to the page database.

Specified by:
addPage in interface IWebDBWriter
Throws:
IOException

addPageWithScore

public void addPageWithScore(Page page)
                      throws IOException
Add a page to the page database, with a brand-new score.

Specified by:
addPageWithScore in interface IWebDBWriter
Throws:
IOException

addPageIfNotPresent

public void addPageIfNotPresent(Page page)
                         throws IOException
Add the page only if the database does not already contain it; an existing entry is not replaced.

Specified by:
addPageIfNotPresent in interface IWebDBWriter
Throws:
IOException

addPageIfNotPresent

public void addPageIfNotPresent(Page page,
                                Link link)
                         throws IOException
Add the page only if the database does not already contain it; an existing entry is not replaced. If the new Page is inserted, the given Link object is inserted as well.

Specified by:
addPageIfNotPresent in interface IWebDBWriter
Throws:
IOException
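
The conditional behavior described for the two-argument addPageIfNotPresent can be sketched with an in-memory map. This is only an illustration under assumptions: AddIfNotPresentSketch is a hypothetical class, pages and links are reduced to strings, and the method returns a boolean for demonstration, whereas the documented method returns void and works on Page and Link objects.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in; none of these names are the Nutch API.
final class AddIfNotPresentSketch {
    private final Map<String, String> pages = new HashMap<>(); // URL -> page content
    private final List<String> links = new ArrayList<>();      // "fromUrl -> toUrl"

    /**
     * Stores the page only when its URL is not yet known. The link is stored
     * only when the page was actually inserted. Returns true on insertion.
     */
    boolean addPageIfNotPresent(String url, String content, String link) {
        if (pages.containsKey(url)) {
            return false; // keep the existing page and drop the link
        }
        pages.put(url, content);
        if (link != null) {
            links.add(link);
        }
        return true;
    }

    String page(String url) {
        return pages.get(url);
    }

    int linkCount() {
        return links.size();
    }

    public static void main(String[] args) {
        AddIfNotPresentSketch db = new AddIfNotPresentSketch();
        db.addPageIfNotPresent("http://a.example/", "first",
                "http://x.example/ -> http://a.example/");
        db.addPageIfNotPresent("http://a.example/", "second",
                "http://y.example/ -> http://a.example/");
        System.out.println(db.page("http://a.example/") + ", links=" + db.linkCount());
    }
}
```

In this sketch the second call leaves both the stored page and the link list unchanged, so main prints `first, links=1`.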

deletePage

public void deletePage(String url)
                throws IOException
Remove a page from the page database.

Specified by:
deletePage in interface IWebDBWriter
Throws:
IOException

addLink

public void addLink(Link lr)
             throws IOException
Add a link to the link database.

Specified by:
addLink in interface IWebDBWriter
Throws:
IOException

main

public static void main(String[] argv)
                 throws FileNotFoundException,
                        IOException
Provides some handy methods for testing the WebDB.

Throws:
FileNotFoundException
IOException


Copyright © 2006 The Apache Software Foundation