Xapian::BM25Weight Class Reference

BM25 weighting scheme.

#include <enquire.h>

Public Member Functions

 BM25Weight (double k1_, double k2_, double k3_, double b_, double min_normlen_)
 Construct a BM25 weight.
 BM25Weight ()
BM25Weight * clone () const
 Return a new weight object of this type.
 ~BM25Weight ()
std::string name () const
 Name of the weighting scheme.
std::string serialise () const
 Serialise object parameters into a string.
BM25Weight * unserialise (const std::string &s) const
 Create object given string serialisation returned by serialise().
Xapian::weight get_sumpart (Xapian::termcount wdf, Xapian::doclength len) const
 Get a weight which is part of the sum over terms being performed.
Xapian::weight get_maxpart () const
 Gets the maximum value that get_sumpart() may return.
Xapian::weight get_sumextra (Xapian::doclength len) const
 Get an extra weight for a document to add to the sum calculated over the query terms.
Xapian::weight get_maxextra () const
 Gets the maximum value that get_sumextra() may return.
bool get_sumpart_needs_doclength () const
 Return false if the weight object doesn't need the document length.

Private Member Functions

void calc_termweight () const

Private Attributes

Xapian::weight termweight
Xapian::doclength lenpart
double k1
double k2
double k3
double b
Xapian::doclength min_normlen
bool weight_calculated

Detailed Description

BM25 weighting scheme.

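BM25 weighting options: the BM25 formula is

\[ \frac{k_{2} \cdot n_{q}}{1+L_{d}}+\sum_{t}\frac{(k_{3}+1)q_{t}}{k_{3}+q_{t}} \cdot \frac{(k_{1}+1)f_{t,d}}{k_{1}((1-b)+bL_{d})+f_{t,d}} \cdot w_{t} \]

where

w_{t} is the term weight of term t
f_{t,d} is the within document frequency (wdf) of term t in document d
q_{t} is the within query frequency (wqf) of term t
L_{d} is the normalised length of document d
n_{q} is the size of the query
k_{1}, k_{2}, k_{3} and b are user-specified parameters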

Definition at line 1177 of file enquire.h.
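
As a rough illustration of the formula above, the following standalone sketch evaluates it directly. It does not use the Xapian API: TermStats, bm25_term_part(), bm25_extra_part() and bm25_score() are hypothetical names introduced here, and the normalised length L_d is taken as an input rather than computed.

```cpp
#include <cassert>
#include <vector>

// Per-term inputs for one (term, document) pair. Illustrative only;
// this struct is not part of Xapian.
struct TermStats {
    double wdf;         // f_{t,d}: within document frequency
    double wqf;         // q_t: within query frequency
    double termweight;  // w_t: term weight
};

// One term's contribution to the sum over t in the formula.
inline double bm25_term_part(const TermStats& t, double normlen,
                             double k1, double k3, double b) {
    assert(k1 >= 0 && k3 >= 0 && b >= 0 && b <= 1);
    double wqf_denom = k3 + t.wqf;
    double wdf_denom = k1 * ((1 - b) + b * normlen) + t.wdf;
    // A zero denominator means the numerator is zero as well; treat the
    // contribution as 0 rather than dividing 0 by 0.
    if (wqf_denom == 0 || wdf_denom == 0) return 0.0;
    double wqf_factor = (k3 + 1) * t.wqf / wqf_denom;
    double wdf_factor = (k1 + 1) * t.wdf / wdf_denom;
    return wqf_factor * wdf_factor * t.termweight;
}

// The document-level term k2 * n_q / (1 + L_d).
inline double bm25_extra_part(double query_size, double normlen, double k2) {
    assert(k2 >= 0 && normlen >= 0);
    return k2 * query_size / (1 + normlen);
}

// Full score: the extra part plus the sum of the per-term parts.
inline double bm25_score(const std::vector<TermStats>& terms,
                         double query_size, double normlen,
                         double k1, double k2, double k3, double b) {
    double total = bm25_extra_part(query_size, normlen, k2);
    for (const TermStats& t : terms) {
        total += bm25_term_part(t, normlen, k1, k3, b);
    }
    return total;
}
```

With k1 = 0 the wdf factor collapses to 1 for any term that occurs in the document, and with k3 = 0 the wqf factor does the same, which matches the "0 means ignore" wording in the constructor documentation.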


Constructor & Destructor Documentation

Xapian::BM25Weight::BM25Weight ( double  k1_,
double  k2_,
double  k3_,
double  b_,
double  min_normlen_ 
) [inline]

Construct a BM25 weight.

Parameters:
k1 Governs the importance of within document frequency. Must be >= 0. 0 means ignore wdf. Default is 1.
k2 Compensation factor for the high wdf values in large documents. Must be >= 0. 0 means no compensation. Default is 0.
k3 Governs the importance of within query frequency. Must be >= 0. 0 means ignore wqf. Default is 1.
b Relative importance of within document frequency and document length. Must be >= 0 and <= 1. Default is 0.5.
min_normlen Specifies a cutoff on the minimum value that can be used for a normalised document length. Smaller values will be forced up to this cutoff, which prevents very small documents from getting a huge bonus weight. Default is 0.5.

Definition at line 1208 of file enquire.h.
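
The min_normlen cutoff described above can be sketched as follows. This assumes the normalised length is the raw document length divided by the average document length, which is the usual BM25 convention but is not spelled out on this page. clamped_normlen() is a hypothetical helper, not a Xapian function.

```cpp
#include <algorithm>
#include <cassert>

// Normalise a raw document length and force small results up to the
// cutoff. ASSUMPTION: normalisation divides by the average length.
inline double clamped_normlen(double len, double average_len,
                              double min_normlen) {
    assert(average_len > 0 && min_normlen >= 0);
    return std::max(len / average_len, min_normlen);
}
```

With the default cutoff of 0.5, a document one tenth of the average length is scored as if it were half the average length, so its weight is not inflated by an extremely small L_d.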

Xapian::BM25Weight::BM25Weight (  )  [inline]

Definition at line 1218 of file enquire.h.

Referenced by clone().

Xapian::BM25Weight::~BM25Weight (  )  [inline]

Definition at line 1222 of file enquire.h.


Member Function Documentation

void Xapian::BM25Weight::calc_termweight (  )  const [private]

Definition at line 68 of file bm25weight.cc.

References Assert, Xapian::Weight::Internal::collection_size, DEBUGCALL, DEBUGLINE, Xapian::Weight::internal, k3, lenpart, Xapian::Weight::Internal::rset_size, termweight, weight_calculated, and Xapian::Weight::wqf.

Referenced by get_maxpart(), get_sumextra(), get_sumpart(), and get_sumpart_needs_doclength().

BM25Weight * Xapian::BM25Weight::clone (  )  const [virtual]

Return a new weight object of this type.

A subclass called FooWeight taking parameters param1 and param2 should implement this as:

virtual FooWeight * clone() const { return new FooWeight(param1, param2); }

Implements Xapian::Weight.

Definition at line 39 of file bm25weight.cc.

References b, BM25Weight(), k1, k2, k3, and min_normlen.

string Xapian::BM25Weight::name (  )  const [virtual]

Name of the weighting scheme.

If the subclass is called FooWeight, this should return "Foo".

Implements Xapian::Weight.

Definition at line 43 of file bm25weight.cc.

Referenced by DEFINE_TESTCASE().
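
The clone() and name() conventions above can be put together in a small self-contained sketch. WeightBase is a stand-in written for this example and is not the real Xapian::Weight interface, which has further pure virtual methods. FooWeight and its parameters follow the naming used in the text, and the accessors are added only so the copy can be inspected.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Stand-in for the abstract base class (NOT the real Xapian::Weight).
class WeightBase {
  public:
    virtual ~WeightBase() {}
    virtual WeightBase* clone() const = 0;
    virtual std::string name() const = 0;
};

class FooWeight : public WeightBase {
    double param1;
    double param2;

  public:
    FooWeight(double param1_, double param2_)
        : param1(param1_), param2(param2_) {}

    // Covariant return type: callers holding a FooWeight get a FooWeight*.
    // The caller owns the returned object.
    FooWeight* clone() const override {
        return new FooWeight(param1, param2);
    }

    // Class name without the "Weight" suffix.
    std::string name() const override { return "Foo"; }

    // Accessors exist only for this illustration.
    double get_param1() const { return param1; }
    double get_param2() const { return param2; }
};
```

Because clone() returns a raw pointer to a new object, wrapping the result in a smart pointer such as std::unique_ptr<FooWeight> is one way to avoid leaking it.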

string Xapian::BM25Weight::serialise (  )  const [virtual]

Serialise object parameters into a string.

Implements Xapian::Weight.

Definition at line 45 of file bm25weight.cc.

References b, k1, k2, k3, min_normlen, and serialise_double().

Referenced by DEFINE_TESTCASE().

BM25Weight* Xapian::BM25Weight::unserialise ( const std::string &  s  )  const [virtual]

Create object given string serialisation returned by serialise().

Implements Xapian::Weight.
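
To illustrate the round trip that serialise() and unserialise() are meant to provide, here is a minimal sketch over the five BM25 parameters. It uses a space-separated text encoding chosen for readability. This is an assumption of the example and is not Xapian's format, which is built on serialise_double(). Bm25Params, serialise_params() and unserialise_params() are hypothetical names.

```cpp
#include <cassert>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// The five constructor parameters, grouped for the example.
struct Bm25Params {
    double k1, k2, k3, b, min_normlen;
};

// Encode the parameters as space-separated decimal text, with enough
// digits that each double can be read back exactly.
inline std::string serialise_params(const Bm25Params& p) {
    std::ostringstream out;
    out << std::setprecision(std::numeric_limits<double>::max_digits10)
        << p.k1 << ' ' << p.k2 << ' ' << p.k3 << ' '
        << p.b << ' ' << p.min_normlen;
    return out.str();
}

// Decode a string produced by serialise_params(). Returns false and
// leaves p untouched if fewer than five numbers can be read.
inline bool unserialise_params(const std::string& s, Bm25Params& p) {
    std::istringstream in(s);
    Bm25Params tmp;
    if (!(in >> tmp.k1 >> tmp.k2 >> tmp.k3 >> tmp.b >> tmp.min_normlen)) {
        return false;
    }
    p = tmp;
    return true;
}
```

The property that matters is that unserialising the output of serialise yields an object with the same parameters, whatever encoding is used.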

Xapian::weight Xapian::BM25Weight::get_sumpart ( Xapian::termcount  wdf,
Xapian::doclength  len 
) const [virtual]

Get a weight which is part of the sum over terms being performed.

This returns a weight for a given term and document. These weights are summed to give a total weight for the document.

Parameters:
wdf the within document frequency of the term.
len the (unnormalised) document length.

Implements Xapian::Weight.

Definition at line 118 of file bm25weight.cc.

References b, calc_termweight(), DEBUGCALL, DEBUGLINE, k1, lenpart, min_normlen, RETURN, termweight, and weight_calculated.

Xapian::weight Xapian::BM25Weight::get_maxpart (  )  const [virtual]

Gets the maximum value that get_sumpart() may return.

This is used in optimising searches, by having the postlist tree decay appropriately when parts of it can have limited, or no, further effect.

Implements Xapian::Weight.

Definition at line 144 of file bm25weight.cc.

References calc_termweight(), DEBUGCALL, k1, RETURN, termweight, and weight_calculated.
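
One upper bound of the kind get_maxpart() needs follows directly from the BM25 formula: the wdf-dependent factor can never exceed k1 + 1. The sketch below shows that bound. wdf_factor() and wdf_factor_bound() are hypothetical helpers, and this is not claimed to be the exact expression the library returns.

```cpp
#include <cassert>

// The wdf-dependent factor of the BM25 formula:
// (k1 + 1) * f / (k1 * ((1 - b) + b * L) + f).
inline double wdf_factor(double wdf, double normlen, double k1, double b) {
    assert(k1 >= 0 && b >= 0 && b <= 1 && normlen >= 0 && wdf >= 0);
    double denom = k1 * ((1 - b) + b * normlen) + wdf;
    return denom == 0 ? 0.0 : (k1 + 1) * wdf / denom;
}

// Upper bound over all wdf >= 0 and normlen >= 0. The denominator is
// never smaller than wdf, so the factor is at most k1 + 1.
inline double wdf_factor_bound(double k1) {
    assert(k1 >= 0);
    return k1 + 1;
}
```

Multiplying this bound by the term's wqf factor and term weight gives a ceiling on the per-term contribution that does not depend on any particular document, which is what lets a matcher discard branches that cannot affect the result.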

Xapian::weight Xapian::BM25Weight::get_sumextra ( Xapian::doclength  len  )  const [virtual]

Get an extra weight for a document to add to the sum calculated over the query terms.

This returns a weight for a given document, and is used by some weighting schemes to account for influences such as document length.

Parameters:
len the (unnormalised) document length.

Implements Xapian::Weight.

Definition at line 156 of file bm25weight.cc.

References calc_termweight(), DEBUGCALL, DEBUGLINE, k2, lenpart, min_normlen, Xapian::Weight::querysize, RETURN, and weight_calculated.

Xapian::weight Xapian::BM25Weight::get_maxextra (  )  const [virtual]

Gets the maximum value that get_sumextra() may return.

This is used in optimising searches.

Implements Xapian::Weight.

Definition at line 170 of file bm25weight.cc.

References DEBUGCALL, DEBUGLINE, k2, Xapian::Weight::querysize, and RETURN.
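
As a worked bound for the extra term of the BM25 formula, assume the normalised length satisfies L_{d} >= min_normlen >= 0 after the cutoff is applied. Then

\[ 0 \le \frac{k_{2} \cdot n_{q}}{1+L_{d}} \le \frac{k_{2} \cdot n_{q}}{1+\mathit{min\_normlen}} \le k_{2} \cdot n_{q} \]

so k_{2} \cdot n_{q} is always a safe maximum, and the middle expression is a tighter one. Which of these the implementation uses is not stated on this page.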

bool Xapian::BM25Weight::get_sumpart_needs_doclength (  )  const [virtual]

Return false if the weight object doesn't need the document length.

Reimplemented from Xapian::Weight.

Definition at line 179 of file bm25weight.cc.

References b, calc_termweight(), k1, lenpart, and weight_calculated.
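
Reading the BM25 formula, the document length enters the per-term sum only through the product k1 * b * L_d, so it drops out when either k1 or b is 0. The sketch below captures that observation. sum_needs_doclength() and length_term() are hypothetical helpers, and the real method also consults lenpart, so it may test further conditions.

```cpp
#include <cassert>

// True if the per-term part of the formula depends on document length.
// ASSUMPTION: derived from the formula alone, not from the library source.
inline bool sum_needs_doclength(double k1, double b) {
    assert(k1 >= 0 && b >= 0 && b <= 1);
    return k1 != 0 && b != 0;
}

// The length-dependent part of the wdf denominator:
// k1 * ((1 - b) + b * L_d).
inline double length_term(double normlen, double k1, double b) {
    return k1 * ((1 - b) + b * normlen);
}
```

When the answer is false, a matcher can skip fetching document lengths while summing term weights, which is the optimisation this method exists to enable.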


Member Data Documentation

Xapian::weight Xapian::BM25Weight::termweight [mutable, private]

Definition at line 1179 of file enquire.h.

Referenced by calc_termweight(), get_maxpart(), and get_sumpart().

Xapian::doclength Xapian::BM25Weight::lenpart [mutable, private]

Definition at line 1180 of file enquire.h.

Referenced by calc_termweight(), get_sumextra(), get_sumpart(), and get_sumpart_needs_doclength().

double Xapian::BM25Weight::k1 [private]

Definition at line 1182 of file enquire.h.

Referenced by clone(), get_maxpart(), get_sumpart(), get_sumpart_needs_doclength(), and serialise().

double Xapian::BM25Weight::k2 [private]

Definition at line 1182 of file enquire.h.

Referenced by clone(), get_maxextra(), get_sumextra(), and serialise().

double Xapian::BM25Weight::k3 [private]

Definition at line 1182 of file enquire.h.

Referenced by calc_termweight(), clone(), and serialise().

double Xapian::BM25Weight::b [private]

Definition at line 1182 of file enquire.h.

Referenced by clone(), get_sumpart(), get_sumpart_needs_doclength(), and serialise().

Xapian::doclength Xapian::BM25Weight::min_normlen [private]

Definition at line 1183 of file enquire.h.

Referenced by clone(), get_sumextra(), get_sumpart(), and serialise().

bool Xapian::BM25Weight::weight_calculated [mutable, private]

Definition at line 1185 of file enquire.h.

Referenced by calc_termweight(), get_maxpart(), get_sumextra(), get_sumpart(), and get_sumpart_needs_doclength().


The documentation for this class was generated from the following files: enquire.h and bm25weight.cc.
Documentation for Xapian (version 1.0.10).
Generated on 24 Dec 2008 by Doxygen 1.5.2.