org.apache.nutch.searcher
Class NutchBean

java.lang.Object
  extended by org.apache.nutch.searcher.NutchBean
All Implemented Interfaces:
DistributedSearch.Protocol, HitContent, HitDetailer, HitSummarizer, Searcher

public class NutchBean
extends Object
implements Searcher, HitDetailer, HitSummarizer, HitContent, DistributedSearch.Protocol

One-stop shopping for search-related functionality.

Version:
$Id: NutchBean.java,v 1.19 2005/02/07 19:10:08 cutting Exp $

Field Summary
static Logger LOG
           
 
Constructor Summary
NutchBean()
          Construct reading from connected directory.
NutchBean(File dir)
          Construct in a named directory.
 
Method Summary
static NutchBean get(javax.servlet.ServletContext app)
          Cache in servlet context.
 String[] getAnchors(HitDetails hit)
          Returns the anchors of a hit document.
 byte[] getContent(HitDetails hit)
          Returns the content of a hit document.
 HitDetails getDetails(Hit hit)
          Returns the details for a hit document.
 HitDetails[] getDetails(Hit[] hits)
          Returns the details for a set of hits.
 String getExplanation(Query query, Hit hit)
          Return an HTML-formatted explanation of how a query scored.
 long getFetchDate(HitDetails hit)
          Returns the fetch date of a hit document.
 ParseData getParseData(HitDetails hit)
          Returns the ParseData of a hit document.
 ParseText getParseText(HitDetails hit)
          Returns the ParseText of a hit document.
 String[] getSegmentNames()
          The names of the segments searched by this node.
 String[] getSummary(HitDetails[] hits, Query query)
          Returns summaries for a set of details.
 String getSummary(HitDetails hit, Query query)
          Returns a summary for the given hit details.
static void main(String[] args)
          For debugging.
 Hits search(Query query, int numHits)
           
 Hits search(Query query, int numHits, int maxHitsPerDup)
          Search for pages matching a query, eliminating excessive hits from the same site.
 Hits search(Query query, int numHits, int maxHitsPerDup, String dedupField)
          Search for pages matching a query, eliminating excessive hits with matching values for a named field.
 Hits search(Query query, int numHits, int maxHitsPerDup, String dedupField, String sortField, boolean reverse)
          Search for pages matching a query, eliminating excessive hits with matching values for a named field.
 Hits search(Query query, int numHits, String dedupField, String sortField, boolean reverse)
          Return the top-scoring hits for a query.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final Logger LOG
Constructor Detail

NutchBean

public NutchBean()
          throws IOException
Construct reading from connected directory.


NutchBean

public NutchBean(File dir)
          throws IOException
Construct in a named directory.

Method Detail

get

public static NutchBean get(javax.servlet.ServletContext app)
                     throws IOException
Cache in servlet context.

Throws:
IOException
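
The caching described for get can be pictured as a get-or-create lookup keyed on a context attribute. The sketch below is an illustration only, not the Nutch implementation: a plain Map stands in for the attributes of javax.servlet.ServletContext, and the attribute name "searchBean" and the ExpensiveBean class are hypothetical names introduced for the example.

```java
import java.util.HashMap;
import java.util.Map;

final class ContextCacheSketch {
    // Hypothetical attribute name; the key NutchBean really uses is not documented here.
    static final String KEY = "searchBean";

    /** Stand-in for an object that is costly to construct, such as a search bean. */
    static final class ExpensiveBean {
    }

    /** Returns the cached bean, constructing and storing it on first use. */
    static synchronized ExpensiveBean get(Map<String, Object> attributes) {
        ExpensiveBean bean = (ExpensiveBean) attributes.get(KEY);
        if (bean == null) {
            bean = new ExpensiveBean();
            attributes.put(KEY, bean);
        }
        return bean;
    }

    public static void main(String[] args) {
        Map<String, Object> context = new HashMap<>();
        ExpensiveBean first = get(context);
        ExpensiveBean second = get(context);
        System.out.println(first == second); // true: the second call reuses the cached instance
    }
}
```

Under this pattern every caller that shares the same context shares one bean, so the cost of opening the index is paid once per web application rather than once per request.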

getSegmentNames

public String[] getSegmentNames()
Description copied from interface: DistributedSearch.Protocol
The names of the segments searched by this node.

Specified by:
getSegmentNames in interface DistributedSearch.Protocol

search

public Hits search(Query query,
                   int numHits)
            throws IOException
Throws:
IOException

search

public Hits search(Query query,
                   int numHits,
                   String dedupField,
                   String sortField,
                   boolean reverse)
            throws IOException
Description copied from interface: Searcher
Return the top-scoring hits for a query.

Specified by:
search in interface Searcher
Throws:
IOException

search

public Hits search(Query query,
                   int numHits,
                   int maxHitsPerDup)
            throws IOException
Search for pages matching a query, eliminating excessive hits from the same site. Hits after the first maxHitsPerDup from the same site are removed from results. The remaining hits have Hit.moreFromDupExcluded() set.

If maxHitsPerDup is zero then all hits are returned.

Parameters:
query - query
numHits - number of requested hits
maxHitsPerDup - the maximum hits returned with matching values, or zero
Returns:
the matching hits
Throws:
IOException
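
The effect of maxHitsPerDup described above can be illustrated with a small self-contained sketch. It is one reading of the description, not the NutchBean source: SketchHit and dedup are hypothetical names, the input list is assumed to be already ranked, and the sketch flags every retained hit for a value once a later hit with that value is dropped, whereas the real implementation may flag fewer. The numHits limit is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DedupSketch {
    /** Hypothetical stand-in for a search hit; not the Nutch Hit class. */
    static final class SketchHit {
        final String url;
        final String dedupValue;      // for example, the site the page belongs to
        boolean moreFromDupExcluded;  // set when later hits with the same value were dropped

        SketchHit(String url, String dedupValue) {
            this.url = url;
            this.dedupValue = dedupValue;
        }
    }

    /** Keeps at most maxHitsPerDup hits per dedup value, preserving ranked order. */
    static List<SketchHit> dedup(List<SketchHit> ranked, int maxHitsPerDup) {
        // Zero (and, in this sketch, any non-positive value) disables deduplication.
        if (maxHitsPerDup <= 0) {
            return new ArrayList<>(ranked);
        }
        Map<String, List<SketchHit>> keptByValue = new HashMap<>();
        List<SketchHit> result = new ArrayList<>();
        for (SketchHit hit : ranked) {
            List<SketchHit> kept =
                keptByValue.computeIfAbsent(hit.dedupValue, key -> new ArrayList<>());
            if (kept.size() < maxHitsPerDup) {
                kept.add(hit);
                result.add(hit);
            } else {
                // The hit is dropped; flag the retained hits that share its value.
                for (SketchHit keptHit : kept) {
                    keptHit.moreFromDupExcluded = true;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<SketchHit> ranked = Arrays.asList(
            new SketchHit("http://a.example/1", "a.example"),
            new SketchHit("http://a.example/2", "a.example"),
            new SketchHit("http://b.example/1", "b.example"),
            new SketchHit("http://a.example/3", "a.example"));
        for (SketchHit hit : dedup(ranked, 2)) {
            System.out.println(hit.url + " moreFromDupExcluded=" + hit.moreFromDupExcluded);
        }
    }
}
```

With a limit of 2, the third hit from a.example is dropped and the two retained a.example hits are flagged, which is the signal a results page could use to offer a "more from this site" link.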

search

public Hits search(Query query,
                   int numHits,
                   int maxHitsPerDup,
                   String dedupField)
            throws IOException
Search for pages matching a query, eliminating excessive hits with matching values for a named field. Hits after the first maxHitsPerDup are removed from results. The remaining hits have Hit.moreFromDupExcluded() set.

If maxHitsPerDup is zero then all hits are returned.

Parameters:
query - query
numHits - number of requested hits
maxHitsPerDup - the maximum hits returned with matching values, or zero
dedupField - field name to check for duplicates
Returns:
the matching hits
Throws:
IOException

search

public Hits search(Query query,
                   int numHits,
                   int maxHitsPerDup,
                   String dedupField,
                   String sortField,
                   boolean reverse)
            throws IOException
Search for pages matching a query, eliminating excessive hits with matching values for a named field. Hits after the first maxHitsPerDup are removed from results. The remaining hits have Hit.moreFromDupExcluded() set.

If maxHitsPerDup is zero then all hits are returned.

Parameters:
query - query
numHits - number of requested hits
maxHitsPerDup - the maximum hits returned with matching values, or zero
dedupField - field name to check for duplicates
sortField - field to sort on, or null for no sorting
reverse - true to sort by sortField in reverse order
Returns:
the matching hits
Throws:
IOException
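
The sortField and reverse parameters can be illustrated in the same spirit. The sketch below is an assumption about the intended semantics, not code from Nutch: each hit is modelled as a hypothetical Map of field names to string values, a null sortField leaves the ranked order untouched (reverse is ignored in that case), and every hit is assumed to carry a non-null value for the sort field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SortSketch {
    /**
     * Returns a copy of the ranked hits ordered by the named field.
     * A null sortField leaves the ranked order unchanged.
     */
    static List<Map<String, String>> sort(List<Map<String, String>> ranked,
                                          String sortField, boolean reverse) {
        List<Map<String, String>> sorted = new ArrayList<>(ranked);
        if (sortField == null) {
            return sorted;
        }
        // Assumes every hit carries a non-null value for sortField.
        Comparator<Map<String, String>> byField =
            Comparator.comparing(hit -> hit.get(sortField));
        sorted.sort(reverse ? byField.reversed() : byField);
        return sorted;
    }

    /** Builds a one-field hit for the demo. */
    static Map<String, String> hit(String field, String value) {
        Map<String, String> fields = new HashMap<>();
        fields.put(field, value);
        return fields;
    }

    public static void main(String[] args) {
        List<Map<String, String>> ranked = Arrays.asList(
            hit("title", "beta"), hit("title", "alpha"), hit("title", "gamma"));
        for (Map<String, String> h : sort(ranked, "title", true)) {
            System.out.println(h.get("title"));
        }
    }
}
```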

getExplanation

public String getExplanation(Query query,
                             Hit hit)
                      throws IOException
Description copied from interface: Searcher
Return an HTML-formatted explanation of how a query scored.

Specified by:
getExplanation in interface Searcher
Throws:
IOException

getDetails

public HitDetails getDetails(Hit hit)
                      throws IOException
Description copied from interface: HitDetailer
Returns the details for a hit document.

Specified by:
getDetails in interface HitDetailer
Throws:
IOException

getDetails

public HitDetails[] getDetails(Hit[] hits)
                        throws IOException
Description copied from interface: HitDetailer
Returns the details for a set of hits. Hook for parallel IPC calls.

Specified by:
getDetails in interface HitDetailer
Throws:
IOException

getSummary

public String getSummary(HitDetails hit,
                         Query query)
                  throws IOException
Description copied from interface: HitSummarizer
Returns a summary for the given hit details.

Specified by:
getSummary in interface HitSummarizer
Parameters:
hit - the details of the hit to be summarized
query - indicates what should be highlighted in the summary text
Throws:
IOException

getSummary

public String[] getSummary(HitDetails[] hits,
                           Query query)
                    throws IOException
Description copied from interface: HitSummarizer
Returns summaries for a set of details. Hook for parallel IPC calls.

Specified by:
getSummary in interface HitSummarizer
Parameters:
hits - the details of hits to be summarized
query - indicates what should be highlighted in the summary text
Throws:
IOException

getContent

public byte[] getContent(HitDetails hit)
                  throws IOException
Description copied from interface: HitContent
Returns the content of a hit document.

Specified by:
getContent in interface HitContent
Throws:
IOException

getParseData

public ParseData getParseData(HitDetails hit)
                       throws IOException
Description copied from interface: HitContent
Returns the ParseData of a hit document.

Specified by:
getParseData in interface HitContent
Throws:
IOException

getParseText

public ParseText getParseText(HitDetails hit)
                       throws IOException
Description copied from interface: HitContent
Returns the ParseText of a hit document.

Specified by:
getParseText in interface HitContent
Throws:
IOException

getAnchors

public String[] getAnchors(HitDetails hit)
                    throws IOException
Description copied from interface: HitContent
Returns the anchors of a hit document.

Specified by:
getAnchors in interface HitContent
Throws:
IOException

getFetchDate

public long getFetchDate(HitDetails hit)
                  throws IOException
Description copied from interface: HitContent
Returns the fetch date of a hit document.

Specified by:
getFetchDate in interface HitContent
Throws:
IOException

main

public static void main(String[] args)
                 throws Exception
For debugging.

Throws:
Exception


Copyright © 2006 The Apache Software Foundation