org.apache.nutch.db
Class Link

java.lang.Object
  extended by org.apache.nutch.db.Link
All Implemented Interfaces:
Comparable, Writable, WritableComparable

public class Link
extends Object
implements WritableComparable

A Link is a single row in the Link Database.

 Each row is a Link:
   type     name                 description
 ---------------------------------------------------------------------------
   byte     VERSION              A byte indicating the version of this entry.
   128bit   FROM_ID              The MD5 hash of the source of the link.
   64bit    DOMAIN_ID            The 8-byte MD5Hash of the source's domain.
   string   TO_URL               The URL destination of the link.
   string   ANCHOR               The anchor text of the link.
   boolean  TARGET_HAS_OUTLINK   Whether the target of the link has outlinks.
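
As a rough illustration of the row layout, the sketch below writes the six fields in table order using java.io.DataOutputStream. The class name LinkRowSketch, the version value 1, and the use of writeUTF for the two strings are assumptions made for this example; the exact bytes produced by Link.write are not specified on this page.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stand-in that mirrors the documented row layout; not the Nutch class. */
final class LinkRowSketch {
    // Assumed version number; the real value is not given in the documentation.
    static final byte VERSION = 1;

    /** Encodes one row in the order VERSION, FROM_ID, DOMAIN_ID, TO_URL, ANCHOR, TARGET_HAS_OUTLINK. */
    static byte[] encode(byte[] fromId, long domainId, String toUrl,
                         String anchor, boolean targetHasOutlink) {
        if (fromId.length != 16) {
            throw new IllegalArgumentException("FROM_ID must be a 128-bit (16-byte) MD5 hash");
        }
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeByte(VERSION);            // byte    VERSION
            out.write(fromId);                 // 128bit  FROM_ID
            out.writeLong(domainId);           // 64bit   DOMAIN_ID
            out.writeUTF(toUrl);               // string  TO_URL (2-byte length prefix, assumed)
            out.writeUTF(anchor);              // string  ANCHOR (2-byte length prefix, assumed)
            out.writeBoolean(targetHasOutlink); // boolean TARGET_HAS_OUTLINK
        } catch (IOException e) {
            // Writing to an in-memory buffer should not fail; rethrow unchecked if it does.
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```

Under these assumptions the fixed part of a row takes 26 bytes (1 + 16 + 8 + 1), and each string adds its encoded length plus a 2-byte prefix.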
 

Author:
Mike Cafarella

Nested Class Summary
static class Link.MD5Comparator
          MD5Comparator is the opposite of UrlComparator: the MD5 comes first.
static class Link.UrlComparator
          UrlComparator uses the standard ordering, where the URL comes first.
 
Field Summary
static int MAX_ANCHOR_LENGTH
           
 
Constructor Summary
Link()
          Create the Link with no data
Link(MD5Hash fromID, long domainID, String urlString, String anchorText)
          Create a Link record from the source ID, domain ID, destination URL, and anchor text.
 
Method Summary
 int compareTo(Object o)
           
 UTF8 getAnchorText()
           
 long getDomainID()
           
 MD5Hash getFromID()
           
 UTF8 getURL()
           
 int md5Compare(Object o)
          Compare MD5s, then compare URLs.
static Link read(DataInput in)
           
 void readFields(DataInput in)
          Read the fields from a byte stream.
 void set(Link that)
           
 void setTargetHasOutlink(boolean targetHasOutlink)
           
 boolean targetHasOutlink()
           
 String toString()
          Print out the record
 String toTabbedString()
          Get a tab-delimited version of the text data.
 int urlCompare(Object o)
          Compare URLs, then compare MD5s.
 void write(DataOutput out)
          Write the fields out to a byte stream.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

MAX_ANCHOR_LENGTH

public static final int MAX_ANCHOR_LENGTH
Constructor Detail

Link

public Link()
Create the Link with no data


Link

public Link(MD5Hash fromID,
            long domainID,
            String urlString,
            String anchorText)
     throws MalformedURLException
Create a Link record from the source ID, domain ID, destination URL, and anchor text.

Method Detail

readFields

public void readFields(DataInput in)
                throws IOException
Read the fields from a byte stream.

Specified by:
readFields in interface Writable
Throws:
IOException

set

public void set(Link that)

write

public void write(DataOutput out)
           throws IOException
Write the fields out to a byte stream.

Specified by:
write in interface Writable
Throws:
IOException

read

public static Link read(DataInput in)
                 throws IOException
Throws:
IOException
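
The write, readFields, and static read methods follow the usual Writable pattern. The sketch below shows that pattern on a hypothetical two-field stand-in, MiniLink, rather than on Link itself. It assumes that read creates an empty instance and delegates to readFields; that is a common convention but is not confirmed by this page.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class WritableRoundTripSketch {
    /** Hypothetical stand-in carrying only two of the Link fields. */
    static final class MiniLink {
        long domainId;
        String url = "";

        /** Writes the fields in a fixed order. */
        void write(DataOutput out) throws IOException {
            out.writeLong(domainId);
            out.writeUTF(url);
        }

        /** Reads the fields back in the same order they were written. */
        void readFields(DataInput in) throws IOException {
            domainId = in.readLong();
            url = in.readUTF();
        }

        /** Factory: create an empty record, then fill it from the stream (assumed behaviour). */
        static MiniLink read(DataInput in) throws IOException {
            MiniLink link = new MiniLink();
            link.readFields(in);
            return link;
        }
    }

    /** Serializes a record to memory and reads it back. */
    static MiniLink roundTrip(MiniLink original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            original.write(new DataOutputStream(buffer));
            DataInput in = new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()));
            return MiniLink.read(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The point of the pattern is that readFields must consume the fields in the same order that write produced them, which lets an existing instance be reused across many records.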

getFromID

public MD5Hash getFromID()

getURL

public UTF8 getURL()

getDomainID

public long getDomainID()

getAnchorText

public UTF8 getAnchorText()

targetHasOutlink

public boolean targetHasOutlink()

setTargetHasOutlink

public void setTargetHasOutlink(boolean targetHasOutlink)

toString

public String toString()
Print out the record


toTabbedString

public String toTabbedString()
Get a tab-delimited version of the text data.


compareTo

public int compareTo(Object o)
Specified by:
compareTo in interface Comparable

urlCompare

public int urlCompare(Object o)
Compare URLs, then compare MD5s.


md5Compare

public int md5Compare(Object o)
Compare MD5s, then compare URLs.
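
The two orderings, URL first with MD5 as tie-breaker and MD5 first with URL as tie-breaker, can be sketched with plain comparators. The Row type and both comparator names are hypothetical, and comparing hash bytes as unsigned values is an assumption; how Link and MD5Hash actually compare is not described on this page.

```java
import java.util.Comparator;

final class LinkOrderingSketch {
    /** Hypothetical stand-in holding the two fields that take part in the ordering. */
    static final class Row {
        final byte[] fromId; // MD5 hash of the link source
        final String url;    // destination URL

        Row(byte[] fromId, String url) {
            this.fromId = fromId;
            this.url = url;
        }
    }

    /** Lexicographic comparison treating each byte as unsigned (assumed convention). */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** URL first; the MD5 only breaks ties between equal URLs. */
    static final Comparator<Row> URL_THEN_MD5 = (x, y) -> {
        int c = x.url.compareTo(y.url);
        return c != 0 ? c : compareBytes(x.fromId, y.fromId);
    };

    /** MD5 first; the URL only breaks ties between equal hashes. */
    static final Comparator<Row> MD5_THEN_URL = (x, y) -> {
        int c = compareBytes(x.fromId, y.fromId);
        return c != 0 ? c : x.url.compareTo(y.url);
    };
}
```

Sorting by MD5 first groups together all links from the same source, while sorting by URL first groups together all links to the same destination.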



Copyright © 2006 The Apache Software Foundation