Xapian::Enquire Class Reference

This class provides an interface to the information retrieval system for the purpose of searching.

#include <enquire.h>

Public Types

enum  docid_order { ASCENDING = 1, DESCENDING = 0, DONT_CARE = 2 }

Public Member Functions

 Enquire (const Enquire &other)
 Copying is allowed (and is cheap).
void operator= (const Enquire &other)
 Assignment is allowed (and is cheap).
 Enquire (const Database &database, ErrorHandler *errorhandler_=0)
 Create a Xapian::Enquire object.
 ~Enquire ()
 Close the Xapian::Enquire object.
void set_query (const Xapian::Query &query, Xapian::termcount qlen=0)
 Set the query to run.
const Xapian::Query & get_query () const
 Get the query which has been set.
void set_weighting_scheme (const Weight &weight_)
 Set the weighting scheme to use for queries.
void set_collapse_key (Xapian::valueno collapse_key)
 Set the collapse key to use for queries.
void set_docid_order (docid_order order)
 Set the direction in which documents are ordered by document id in the returned MSet.
void set_cutoff (Xapian::percent percent_cutoff, Xapian::weight weight_cutoff=0)
 Set the percentage and/or weight cutoffs.
void set_sort_by_relevance ()
 Set the sorting to be by relevance only.
void set_sort_by_value (Xapian::valueno sort_key, bool ascending=true)
 Set the sorting to be by value only.
void set_sort_by_key (Xapian::Sorter *sorter, bool ascending=true)
 Set the sorting to be by key generated from values only.
void set_sort_by_value_then_relevance (Xapian::valueno sort_key, bool ascending=true)
 Set the sorting to be by value, then by relevance for documents with the same value.
void set_sort_by_key_then_relevance (Xapian::Sorter *sorter, bool ascending=true)
 Set the sorting to be by keys generated from values, then by relevance for documents with identical keys.
void set_sort_by_relevance_then_value (Xapian::valueno sort_key, bool ascending=true)
 Set the sorting to be by relevance then value.
void set_sort_by_relevance_then_key (Xapian::Sorter *sorter, bool ascending=true)
 Set the sorting to be by relevance, then by keys generated from values.
MSet get_mset (Xapian::doccount first, Xapian::doccount maxitems, Xapian::doccount checkatleast=0, const RSet *omrset=0, const MatchDecider *mdecider=0) const
 Get (a portion of) the match set for the current query.
MSet get_mset (Xapian::doccount first, Xapian::doccount maxitems, Xapian::doccount checkatleast, const RSet *omrset, const MatchDecider *mdecider, const MatchDecider *matchspy) const
MSet get_mset (Xapian::doccount first, Xapian::doccount maxitems, const RSet *omrset, const MatchDecider *mdecider=0) const
 XAPIAN_DEPRECATED (static const int include_query_terms)
 Deprecated in Xapian 1.0.0, use INCLUDE_QUERY_TERMS instead.
 XAPIAN_DEPRECATED (static const int use_exact_termfreq)
 Deprecated in Xapian 1.0.0, use USE_EXACT_TERMFREQ instead.
ESet get_eset (Xapian::termcount maxitems, const RSet &omrset, int flags=0, double k=1.0, const Xapian::ExpandDecider *edecider=0) const
 Get the expand set for the given rset.
ESet get_eset (Xapian::termcount maxitems, const RSet &omrset, const Xapian::ExpandDecider *edecider) const
 Get the expand set for the given rset.
TermIterator get_matching_terms_begin (Xapian::docid did) const
 Get terms which match a given document, by document id.
TermIterator get_matching_terms_end (Xapian::docid) const
 End iterator corresponding to get_matching_terms_begin().
TermIterator get_matching_terms_begin (const MSetIterator &it) const
 Get terms which match a given document, by match set item.
TermIterator get_matching_terms_end (const MSetIterator &) const
 End iterator corresponding to get_matching_terms_begin().
 XAPIAN_DEPRECATED (void register_match_decider(const std::string &name, const MatchDecider *mdecider=NULL))
 Register a MatchDecider.
std::string get_description () const
 Return a string describing this object.

Public Attributes

Xapian::Internal::RefCntPtr< Internal > internal

Static Public Attributes

static const int INCLUDE_QUERY_TERMS = 1
static const int USE_EXACT_TERMFREQ = 2


Detailed Description

This class provides an interface to the information retrieval system for the purpose of searching.

Databases are usually opened lazily, so exceptions may not be thrown where you would expect them to be. You should catch Xapian::Error exceptions when calling any method in Xapian::Enquire.

Exceptions:
Xapian::InvalidArgumentError will be thrown if an invalid argument is supplied, for example, an unknown database type.


Constructor & Destructor Documentation

Xapian::Enquire::Enquire ( const Enquire & other  )

Copying is allowed (and is cheap).

Xapian::Enquire::Enquire ( const Database & database,
ErrorHandler * errorhandler_ = 0
) [explicit]

Create a Xapian::Enquire object.

This specification cannot be changed once the Xapian::Enquire is opened: you must create a new Xapian::Enquire object to access a different database, or set of databases.

The database supplied must have been initialised (i.e. it must not be the result of calling the Database::Database() constructor). If you need to handle gracefully a situation where you have no index, you can pass a database created with InMemory::open() here, which represents a completely empty database.

Parameters:
database Specification of the database or databases to use.
errorhandler_ A pointer to the error handler to use. Ownership of the object pointed to is not assumed by the Xapian::Enquire object - the user should delete the Xapian::ErrorHandler object after the Xapian::Enquire object is deleted. To use no error handler, this parameter should be 0.
Exceptions:
Xapian::InvalidArgumentError will be thrown if an uninitialised Database object is supplied.

Xapian::Enquire::~Enquire (  ) 

Close the Xapian::Enquire object.


Member Function Documentation

void Xapian::Enquire::operator= ( const Enquire & other  )

Assignment is allowed (and is cheap).

void Xapian::Enquire::set_query ( const Xapian::Query & query,
Xapian::termcount  qlen = 0 
)

Set the query to run.

Parameters:
query the new query to run.
qlen the query length to use in weight calculations - by default the sum of the wqf of all terms is used.
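
The default for qlen can be pictured as a plain sum. The following is a minimal sketch rather than Xapian code: `default_query_length` is a hypothetical helper, and the caller is assumed to supply the wqf (within-query frequency) of each term in the query.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Sum the within-query frequency (wqf) of every term in the query.
// This mirrors the documented default used when qlen is left at 0.
unsigned default_query_length(const std::vector<unsigned>& wqfs) {
    return std::accumulate(wqfs.begin(), wqfs.end(), 0u);
}
```

For a query whose three terms have wqf 1, 2 and 1, the helper returns 4.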

const Xapian::Query& Xapian::Enquire::get_query (  )  const

Get the query which has been set.

This is only valid after set_query() has been called.

Exceptions:
Xapian::InvalidArgumentError will be thrown if query has not yet been set.

void Xapian::Enquire::set_weighting_scheme ( const Weight & weight_  )

Set the weighting scheme to use for queries.

Parameters:
weight_ the new weighting scheme. If no weighting scheme is specified, the default is BM25 with the default parameters.

void Xapian::Enquire::set_collapse_key ( Xapian::valueno  collapse_key  ) 

Set the collapse key to use for queries.

Parameters:
collapse_key value number to collapse on - at most one MSet entry with each particular value will be returned.
The entry returned will be the best entry with that particular value (highest weight or highest sorting key).

An example use might be to create a value for each document containing an MD5 hash of the document contents. Then duplicate documents from different sources can be eliminated at search time (it's better to eliminate duplicates at index time, but this may not always be possible - for example the search may be over more than one Xapian database).

Another use is to group matches in a particular category (e.g. you might collapse a mailing list search on the Subject: so that there's only one result per discussion thread). In this case you can use get_collapse_count() to give the user some idea how many other results there are. And if you index the Subject: as a boolean term as well as putting it in a value, you can offer a link to a non-collapsed search restricted to that thread using a boolean filter.

(default is Xapian::BAD_VALUENO which means no collapsing).
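
The effect of collapsing can be illustrated without Xapian. The following is a minimal sketch, not the library's implementation: `Match` and `collapse_by_key` are hypothetical names, every entry is assumed to carry a non-empty key, and the input is assumed to be ordered best-first already. It keeps the best entry for each key and counts how many lower-ranked entries shared that key, which is the sort of figure a collapse count reports.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one match: a document id, its weight, and the
// string stored in the value slot that is used as the collapse key.
struct Match {
    unsigned docid;
    double weight;
    std::string key;
    unsigned collapse_count;  // lower-ranked matches that shared this key
};

// Keep only the first (best) match for each key. Later matches whose key
// has been seen before are dropped and counted against the kept match.
std::vector<Match> collapse_by_key(const std::vector<Match>& ranked) {
    // Precondition: the input is ordered best-first (weights never increase).
    assert(std::is_sorted(ranked.begin(), ranked.end(),
                          [](const Match& a, const Match& b) { return a.weight > b.weight; }));
    std::vector<Match> kept;
    std::map<std::string, std::size_t> index_of_key;  // key -> position in kept
    for (const Match& m : ranked) {
        std::map<std::string, std::size_t>::iterator it = index_of_key.find(m.key);
        if (it == index_of_key.end()) {
            index_of_key[m.key] = kept.size();
            kept.push_back(m);
            kept.back().collapse_count = 0;
        } else {
            ++kept[it->second].collapse_count;
        }
    }
    return kept;
}
```

With the mailing list example, the key would be the Subject: value, and the count on each kept entry tells the user roughly how many other messages belong to the same thread.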

void Xapian::Enquire::set_docid_order ( docid_order  order  ) 

Set the direction in which documents are ordered by document id in the returned MSet.

This order only has an effect on documents which would otherwise have equal rank. For a weighted probabilistic match with no sort value, this means documents with equal weight. For a boolean match, with no sort value, this means all documents. And if a sort value is used, this means documents with equal sort value (and also equal weight if ordering on relevance after the sort).

Parameters:
order This can be:
  • Xapian::Enquire::ASCENDING docids sort in ascending order (default)
  • Xapian::Enquire::DESCENDING docids sort in descending order
  • Xapian::Enquire::DONT_CARE docids sort in whatever order is most efficient for the backend
Note: If you add documents in strict date order, then a boolean search - i.e. set_weighting_scheme(Xapian::BoolWeight()) - with set_docid_order(Xapian::Enquire::DESCENDING) is a very efficient way to perform "sort by date, newest first".
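
The tie-breaking role of the docid order can be sketched with a plain comparator. This is an illustration of the ordering rule only and makes no claim about how the matcher is implemented: `Ranked`, `DocidOrder` and `rank_matches` are hypothetical, and DONT_CARE is treated like ASCENDING purely to keep the sketch deterministic.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Local stand-in mirroring the enumerator values listed for
// Xapian::Enquire::docid_order; this sketch does not use Xapian itself.
enum DocidOrder { DESCENDING = 0, ASCENDING = 1, DONT_CARE = 2 };

struct Ranked {
    unsigned docid;
    double weight;
};

// Order by weight, best first; consult the docid only when weights tie.
void rank_matches(std::vector<Ranked>& items, DocidOrder order) {
    std::sort(items.begin(), items.end(),
              [order](const Ranked& a, const Ranked& b) -> bool {
                  if (a.weight != b.weight) return a.weight > b.weight;
                  if (order == DESCENDING) return a.docid > b.docid;
                  return a.docid < b.docid;  // ASCENDING, and DONT_CARE here
              });
}
```

If every weight is equal, as in a boolean match, the docid decides the whole ordering, so DESCENDING returns the most recently added documents first.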

void Xapian::Enquire::set_cutoff ( Xapian::percent  percent_cutoff,
Xapian::weight  weight_cutoff = 0 
)

Set the percentage and/or weight cutoffs.

Parameters:
percent_cutoff Minimum percentage score for returned documents. If a document has a lower percentage score than this, it will not appear in the MSet. If your intention is to return only matches which contain all the terms in the query, then it's more efficient to use Xapian::Query::OP_AND instead of Xapian::Query::OP_OR in the query than to use set_cutoff(100). (default 0 => no percentage cut-off).
weight_cutoff Minimum weight for a document to be returned. If a document has a lower score than this, it will not appear in the MSet. It is usually only possible to choose an appropriate weight for cutoff based on the results of a previous run of the same query; this is thus mainly useful for alerting operations. The other potential use is with a user-specified weighting scheme. (default 0 => no weight cut-off).
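
The weight cutoff rule amounts to a simple filter. The following minimal sketch shows only that rule; `Scored` and `apply_weight_cutoff` are hypothetical names, weights are assumed to be non-negative, and the percentage cutoff is deliberately left out because the reference does not spell out how percentages are derived.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

struct Scored {
    unsigned docid;
    double weight;
};

// Keep only the entries whose weight is not lower than weight_cutoff.
// A cutoff of 0 keeps everything, assuming weights are never negative.
std::vector<Scored> apply_weight_cutoff(const std::vector<Scored>& matches,
                                        double weight_cutoff) {
    assert(weight_cutoff >= 0.0);
    std::vector<Scored> kept;
    std::copy_if(matches.begin(), matches.end(), std::back_inserter(kept),
                 [weight_cutoff](const Scored& m) { return m.weight >= weight_cutoff; });
    return kept;
}
```

For an alerting operation, the cutoff would typically be a weight recorded from an earlier run of the same query.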

void Xapian::Enquire::set_sort_by_relevance (  ) 

Set the sorting to be by relevance only.

This is the default.

void Xapian::Enquire::set_sort_by_value ( Xapian::valueno  sort_key,
bool  ascending = true 
)

Set the sorting to be by value only.

NB sorting of values uses a string comparison, so you'll need to store numbers padded with leading zeros or spaces, or with the number of digits prepended.

Parameters:
sort_key value number to sort on.
ascending If true, document values which sort higher by string compare are better. If false, the sort order is reversed. (default true)
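
Because values are compared as strings, "9" sorts after "10" unless numbers are padded. The following is a minimal sketch of the zero-padding approach mentioned above; `pad_for_sort` is a hypothetical helper, and the width is an assumption that you must choose large enough for the biggest number you will ever store.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Render a non-negative number as a fixed-width, zero-padded decimal string
// so that plain string comparison orders the values numerically.
std::string pad_for_sort(unsigned long n, std::size_t width) {
    std::string digits = std::to_string(n);
    assert(digits.size() <= width);  // a wider number would break the ordering
    return std::string(width - digits.size(), '0') + digits;
}
```

The padded string is what would be stored in the document value at index time, so that the value number passed as sort_key sorts as intended.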

void Xapian::Enquire::set_sort_by_key ( Xapian::Sorter * sorter,
bool  ascending = true 
)

Set the sorting to be by key generated from values only.

Parameters:
sorter The functor to use for generating keys.
ascending If true, keys which sort higher by string compare are better. If false, the sort order is reversed. (default true)

void Xapian::Enquire::set_sort_by_value_then_relevance ( Xapian::valueno  sort_key,
bool  ascending = true 
)

Set the sorting to be by value, then by relevance for documents with the same value.

NB sorting of values uses a string comparison, so you'll need to store numbers padded with leading zeros or spaces, or with the number of digits prepended.

Parameters:
sort_key value number to sort on.
ascending If true, document values which sort higher by string compare are better. If false, the sort order is reversed. (default true)
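
The two-level ordering can be sketched as a comparator. This is an illustration only, reading the description of `ascending` above literally (higher values are better and so rank first when it is true); `Entry` and `sort_by_value_then_relevance` are hypothetical names and nothing here is taken from the library's source.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct Entry {
    unsigned docid;
    double weight;
    std::string value;  // contents of the value slot named by sort_key
};

// Primary order: string comparison of the value. Secondary order: weight,
// best first, consulted only for entries whose values are identical.
void sort_by_value_then_relevance(std::vector<Entry>& items, bool ascending) {
    std::sort(items.begin(), items.end(),
              [ascending](const Entry& a, const Entry& b) -> bool {
                  if (a.value != b.value) {
                      // ascending == true: higher values are treated as better.
                      return ascending ? a.value > b.value : a.value < b.value;
                  }
                  return a.weight > b.weight;
              });
}
```

Entries that tie on both value and weight are left in an unspecified order by this sketch.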

void Xapian::Enquire::set_sort_by_key_then_relevance ( Xapian::Sorter * sorter,
bool  ascending = true 
)

Set the sorting to be by keys generated from values, then by relevance for documents with identical keys.

Parameters:
sorter The functor to use for generating keys.
ascending If true, keys which sort higher by string compare are better. If false, the sort order is reversed. (default true)

void Xapian::Enquire::set_sort_by_relevance_then_value ( Xapian::valueno  sort_key,
bool  ascending = true 
)

Set the sorting to be by relevance then value.

NB sorting of values uses a string comparison, so you'll need to store numbers padded with leading zeros or spaces, or with the number of digits prepended.

Note that with the default BM25 weighting scheme parameters, non-identical documents will rarely have the same weight, so this setting will give very similar results to set_sort_by_relevance(). It becomes more useful with particular BM25 parameter settings (e.g. BM25Weight(1,0,1,0,0)) or custom weighting schemes.

Parameters:
sort_key value number to sort on.
ascending If true, document values which sort higher by string compare are better. If false, the sort order is reversed. (default true)

void Xapian::Enquire::set_sort_by_relevance_then_key ( Xapian::Sorter * sorter,
bool  ascending = true 
)

Set the sorting to be by relevance, then by keys generated from values.

Note that with the default BM25 weighting scheme parameters, non-identical documents will rarely have the same weight, so this setting will give very similar results to set_sort_by_relevance(). It becomes more useful with particular BM25 parameter settings (e.g. BM25Weight(1,0,1,0,0)) or custom weighting schemes.

Parameters:
sorter The functor to use for generating keys.
ascending If true, keys which sort higher by string compare are better. If false, the sort order is reversed. (default true)

MSet Xapian::Enquire::get_mset ( Xapian::doccount  first,
Xapian::doccount  maxitems,
Xapian::doccount  checkatleast = 0,
const RSet * omrset = 0,
const MatchDecider * mdecider = 0
) const

Get (a portion of) the match set for the current query.

Parameters:
first the first item in the result set to return. A value of zero corresponds to the first item returned being that with the highest score. A value of 10 corresponds to the first 10 items being ignored, and the returned items starting at the eleventh.
maxitems the maximum number of items to return.
checkatleast the minimum number of items to check. Because the matcher optimises, it won't consider every document which might match, so the total number of matches is estimated. Setting checkatleast forces it to consider at least this many matches and so allows for reliable paging links.
omrset the relevance set to use when performing the query.
mdecider a decision functor to use to decide whether a given document should be put in the MSet.
matchspy a decision functor to use to decide whether a given document should be put in the MSet. The matchspy is applied to every document which is a potential candidate for the MSet, so if there are checkatleast or more such documents, the matchspy will see at least checkatleast. The mdecider is assumed to be a relatively expensive test so may be applied in a lazier fashion.
Returns:
A Xapian::MSet object containing the results of the query.
Exceptions:
Xapian::InvalidArgumentError See class documentation.
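
For paged results, the first three arguments follow from the page number and page size. The following is a minimal sketch under stated assumptions: `PageRequest` and `page_request` are hypothetical, pages are numbered from zero, and asking the matcher to check one match beyond the end of the page is just one possible convention for deciding whether to show a "next page" link, not something this reference prescribes.

```cpp
#include <cassert>

// Hypothetical bundle of the three leading get_mset() arguments.
struct PageRequest {
    unsigned first;         // number of top-ranked items to skip
    unsigned maxitems;      // number of items to return
    unsigned checkatleast;  // minimum number of matches to consider
};

// Compute the arguments for a zero-based page of page_size results.
PageRequest page_request(unsigned page, unsigned page_size) {
    assert(page_size > 0);
    PageRequest r;
    r.first = page * page_size;
    r.maxitems = page_size;
    // One past the end of this page: enough to tell whether more exist.
    r.checkatleast = r.first + page_size + 1;
    return r;
}
```

For the second page of ten results this gives first = 10, so the returned items start at the eleventh, as described for the first parameter.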

Xapian::Enquire::XAPIAN_DEPRECATED ( static const int  include_query_terms  ) 

Deprecated in Xapian 1.0.0, use INCLUDE_QUERY_TERMS instead.

Xapian::Enquire::XAPIAN_DEPRECATED ( static const int  use_exact_termfreq  ) 

Deprecated in Xapian 1.0.0, use USE_EXACT_TERMFREQ instead.

ESet Xapian::Enquire::get_eset ( Xapian::termcount  maxitems,
const RSet & omrset,
int  flags = 0,
double  k = 1.0,
const Xapian::ExpandDecider * edecider = 0
) const

Get the expand set for the given rset.

Parameters:
maxitems the maximum number of items to return.
omrset the relevance set to use when performing the expand operation.
flags zero or more of these values combined with bitwise OR:
  • Xapian::Enquire::INCLUDE_QUERY_TERMS query terms may be returned from expand
  • Xapian::Enquire::USE_EXACT_TERMFREQ for multi dbs, calculate the exact termfreq; otherwise an approximation is used which can greatly improve efficiency, but still returns good results.
k the parameter k in the query expansion algorithm (default is 1.0)
edecider a decision functor to use to decide whether a given term should be put in the ESet
Returns:
An ESet object containing the results of the expand.
Exceptions:
Xapian::InvalidArgumentError See class documentation.
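
Combining the flags is ordinary bit manipulation. The following minimal sketch uses local copies of the two constants, with the values listed under Static Public Attributes, so that it compiles without Xapian; real code would use Xapian::Enquire::INCLUDE_QUERY_TERMS and Xapian::Enquire::USE_EXACT_TERMFREQ directly. `expand_flags` and `has_flag` are hypothetical helpers.

```cpp
#include <cassert>

// Local copies of the documented values, for illustration only.
const int INCLUDE_QUERY_TERMS = 1;
const int USE_EXACT_TERMFREQ = 2;

// Build the flags argument from two independent choices.
int expand_flags(bool include_query_terms, bool exact_termfreq) {
    int flags = 0;
    if (include_query_terms) flags |= INCLUDE_QUERY_TERMS;
    if (exact_termfreq) flags |= USE_EXACT_TERMFREQ;
    return flags;
}

// Test whether a particular flag bit is set.
bool has_flag(int flags, int flag) {
    return (flags & flag) != 0;
}
```

Passing neither choice yields 0, which is the documented default for flags.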

ESet Xapian::Enquire::get_eset ( Xapian::termcount  maxitems,
const RSet & omrset,
const Xapian::ExpandDecider * edecider
) const [inline]

Get the expand set for the given rset.

Parameters:
maxitems the maximum number of items to return.
omrset the relevance set to use when performing the expand operation.
edecider a decision functor to use to decide whether a given term should be put in the ESet
Returns:
An ESet object containing the results of the expand.
Exceptions:
Xapian::InvalidArgumentError See class documentation.

TermIterator Xapian::Enquire::get_matching_terms_begin ( Xapian::docid  did  )  const

Get terms which match a given document, by document id.

This method returns the terms in the current query which match the given document.

It is possible for the document to have been removed from the database between the time it is returned in an MSet, and the time that this call is made. If possible, you should specify an MSetIterator instead of a Xapian::docid, since this will enable database backends with suitable support to prevent this occurring.

Note that a query does not need to have been run in order to make this call.

Parameters:
did The document id for which to retrieve the matching terms.
Returns:
An iterator returning the terms which match the document. The terms will be returned (as far as this makes any sense) in the same order as the terms in the query. Terms will not occur more than once, even if they do in the query.
Exceptions:
Xapian::InvalidArgumentError See class documentation.
Xapian::DocNotFoundError The document specified could not be found in the database.
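
The ordering and uniqueness rules for the returned terms can be sketched as follows. This is an illustration of the described result only, not the library's algorithm: `matching_terms` is a hypothetical function, and the query and document are assumed to be given as plain containers of term strings.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Return the query terms that also index the document, in the order in
// which they first appear in the query, each at most once.
std::vector<std::string> matching_terms(const std::vector<std::string>& query_terms,
                                        const std::set<std::string>& doc_terms) {
    std::vector<std::string> out;
    std::set<std::string> seen;
    for (const std::string& t : query_terms) {
        // insert().second is false when the term was already emitted.
        if (doc_terms.count(t) && seen.insert(t).second) out.push_back(t);
    }
    return out;
}
```

A term repeated in the query, such as "cat" in the query cat dog cat fish, therefore appears only once in the result.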

TermIterator Xapian::Enquire::get_matching_terms_end ( Xapian::docid   )  const [inline]

End iterator corresponding to get_matching_terms_begin().

TermIterator Xapian::Enquire::get_matching_terms_begin ( const MSetIterator & it  )  const

Get terms which match a given document, by match set item.

This method returns the terms in the current query which match the given document.

If the underlying database has suitable support, using this call (rather than passing a Xapian::docid) will enable the system to ensure that the correct data is returned, and that the document has not been deleted or changed since the query was performed.

Parameters:
it The iterator for which to retrieve the matching terms.
Returns:
An iterator returning the terms which match the document. The terms will be returned (as far as this makes any sense) in the same order as the terms in the query. Terms will not occur more than once, even if they do in the query.
Exceptions:
Xapian::InvalidArgumentError See class documentation.
Xapian::DocNotFoundError The document specified could not be found in the database.

TermIterator Xapian::Enquire::get_matching_terms_end ( const MSetIterator &  )  const [inline]

End iterator corresponding to get_matching_terms_begin().

Xapian::Enquire::XAPIAN_DEPRECATED ( void   register_match_decider(const std::string &name, const MatchDecider *mdecider=NULL)  ) 

Register a MatchDecider.

This is used to associate a name with a matchdecider.

Deprecated:
This method is deprecated. It was added long ago with the intention that it would allow the remote backend to support use of MatchDecider objects, but there's a better approach.
Parameters:
name The name to register this matchdecider as.
mdecider The matchdecider. If omitted, then remove any matchdecider registered with this name.

std::string Xapian::Enquire::get_description (  )  const

Return a string describing this object.


The documentation for this class was generated from the following file:
Documentation for Xapian (version 1.0.10).
Generated on 23 Dec 2008 by Doxygen 1.5.2.