#include <enquire.h>
Inherits Xapian::Weight.
Public Member Functions | |
BM25Weight (double k1_, double k2_, double k3_, double b_, double min_normlen_) | |
Construct a BM25 weight. | |
BM25Weight () | |
BM25Weight * | clone () const |
Return a new weight object of this type. | |
~BM25Weight () | |
std::string | name () const |
Name of the weighting scheme. | |
std::string | serialise () const |
Serialise object parameters into a string. | |
BM25Weight * | unserialise (const std::string &s) const |
Create object given string serialisation returned by serialise(). | |
Xapian::weight | get_sumpart (Xapian::termcount wdf, Xapian::doclength len) const |
Get a weight which is part of the sum over terms being performed. | |
Xapian::weight | get_maxpart () const |
Gets the maximum value that get_sumpart() may return. | |
Xapian::weight | get_sumextra (Xapian::doclength len) const |
Get an extra weight for a document to add to the sum calculated over the query terms. | |
Xapian::weight | get_maxextra () const |
Gets the maximum value that get_sumextra() may return. | |
bool | get_sumpart_needs_doclength () const |
Returns false if the weight object doesn't need the document length. | |
Private Member Functions | |
void | calc_termweight () const |
Private Attributes | |
Xapian::weight | termweight |
Xapian::doclength | lenpart |
double | k1 |
double | k2 |
double | k3 |
double | b |
Xapian::doclength | min_normlen |
bool | weight_calculated |
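BM25 weighting options: The BM25 formula is

\[
\frac{k_{2}\, n_{q}}{1 + L_{d}} + \sum_{t} \frac{(k_{3} + 1)\, q_{t}}{k_{3} + q_{t}} \cdot \frac{(k_{1} + 1)\, f_{t,d}}{k_{1} ((1 - b) + b L_{d}) + f_{t,d}} \cdot w_{t}
\]

where
- w_t is the termweight of term t
- f_{t,d} is the within document frequency of term t in document d
- q_t is the within query frequency of term t
- L_d is the normalised length of document d
- n_q is the size of the query
- k_1, k_2, k_3 and b are user specified parameters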
Definition at line 1177 of file enquire.h.
Xapian::BM25Weight::BM25Weight (double k1_, double k2_, double k3_, double b_, double min_normlen_) [inline]
Construct a BM25 weight.
k1 | Governs the importance of within document frequency. Must be >= 0. 0 means ignore wdf. Default is 1.
k2 | Compensation factor for the high wdf values in large documents. Must be >= 0. 0 means no compensation. Default is 0.
k3 | Governs the importance of within query frequency. Must be >= 0. 0 means ignore wqf. Default is 1.
b | Relative importance of within document frequency and document length. Must be >= 0 and <= 1. Default is 0.5.
min_normlen | Specifies a cutoff on the minimum value that can be used for a normalised document length. Smaller values will be forced up to this cutoff. This prevents very small documents getting a huge bonus weight. Default is 0.5.
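The ranges and defaults listed above can be restated as a small self-contained sketch. `Bm25Params`, `params_in_range` and `clamp_normlen` are hypothetical names used only for illustration; they are not part of Xapian, and the real constructor may or may not validate its arguments.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the five constructor arguments. The default
// values are the ones given in the parameter descriptions above.
struct Bm25Params {
    double k1 = 1.0;
    double k2 = 0.0;
    double k3 = 1.0;
    double b = 0.5;
    double min_normlen = 0.5;
};

// True if every parameter lies in its documented range. No range is
// documented for min_normlen, so it is not checked here.
inline bool params_in_range(const Bm25Params& p) {
    return p.k1 >= 0.0 && p.k2 >= 0.0 && p.k3 >= 0.0 &&
           p.b >= 0.0 && p.b <= 1.0;
}

// Normalised document lengths below the cutoff are forced up to it.
inline double clamp_normlen(double normlen, const Bm25Params& p) {
    assert(params_in_range(p));
    return std::max(normlen, p.min_normlen);
}
```

With the defaults, a normalised length of 0.1 would be treated as 0.5, while a length of 2.0 is left unchanged.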
Xapian::BM25Weight::BM25Weight () [inline]
void Xapian::BM25Weight::calc_termweight () const [private]
Definition at line 68 of file bm25weight.cc.
References Assert, Xapian::Weight::Internal::collection_size, DEBUGCALL, DEBUGLINE, Xapian::Weight::internal, k3, lenpart, Xapian::Weight::Internal::rset_size, termweight, weight_calculated, and Xapian::Weight::wqf.
Referenced by get_maxpart(), get_sumextra(), get_sumpart(), and get_sumpart_needs_doclength().
BM25Weight * Xapian::BM25Weight::clone () const [virtual]
Return a new weight object of this type.
A subclass called FooWeight taking parameters param1 and param2 should implement this as:
virtual FooWeight * clone() const { return new FooWeight(param1, param2); }
Implements Xapian::Weight.
Definition at line 39 of file bm25weight.cc.
References b, BM25Weight(), k1, k2, k3, and min_normlen.
std::string Xapian::BM25Weight::name () const [virtual]
Name of the weighting scheme.
If the subclass is called FooWeight, this should return "Foo".
Implements Xapian::Weight.
Definition at line 43 of file bm25weight.cc.
Referenced by DEFINE_TESTCASE().
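The clone() and name() conventions described above can be tried out without Xapian. In this sketch the `Weight` base class is a hypothetical, stripped-down stand-in for Xapian::Weight (the real class has more pure virtual methods), and `copy_of` is an illustrative helper, not a library function.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical minimal base class standing in for Xapian::Weight.
class Weight {
  public:
    virtual ~Weight() {}
    virtual Weight* clone() const = 0;
    virtual std::string name() const = 0;
};

class FooWeight : public Weight {
    double param1, param2;

  public:
    FooWeight(double param1_, double param2_)
        : param1(param1_), param2(param2_) {}

    // Covariant return type: callers holding a FooWeight get a FooWeight back.
    virtual FooWeight* clone() const { return new FooWeight(param1, param2); }

    // Class name without the "Weight" suffix.
    virtual std::string name() const { return "Foo"; }

    double first() const { return param1; }
    double second() const { return param2; }
};

// Illustrative helper: copy through the base interface and take ownership
// of the raw pointer that clone() returns.
inline std::unique_ptr<Weight> copy_of(const Weight& w) {
    std::unique_ptr<Weight> copy(w.clone());
    assert(copy && copy->name() == w.name());
    return copy;
}
```

Because clone() returns a raw pointer allocated with new, the caller is responsible for deleting it; wrapping it in a smart pointer, as `copy_of` does, is one way to handle that.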
std::string Xapian::BM25Weight::serialise () const [virtual]
Serialise object parameters into a string.
Implements Xapian::Weight.
Definition at line 45 of file bm25weight.cc.
References b, k1, k2, k3, min_normlen, and serialise_double().
Referenced by DEFINE_TESTCASE().
BM25Weight * Xapian::BM25Weight::unserialise (const std::string &s) const [virtual]
Create object given string serialisation returned by serialise().
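The serialise()/unserialise() pair is expected to round-trip the five parameters. The sketch below shows that contract with a plain-text encoding chosen purely for illustration; it is not the format produced by Xapian's serialise_double(), and `Bm25Params`, `serialise_params` and `unserialise_params` are hypothetical names.

```cpp
#include <iomanip>
#include <limits>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical holder for the parameters that serialise() references.
struct Bm25Params {
    double k1, k2, k3, b, min_normlen;
};

// Encode the parameters as space-separated decimal text. max_digits10
// significant digits are enough to recover each double exactly.
inline std::string serialise_params(const Bm25Params& p) {
    std::ostringstream out;
    out << std::setprecision(std::numeric_limits<double>::max_digits10)
        << p.k1 << ' ' << p.k2 << ' ' << p.k3 << ' '
        << p.b << ' ' << p.min_normlen;
    return out.str();
}

// Decode a string produced by serialise_params(); throws on malformed input.
inline Bm25Params unserialise_params(const std::string& s) {
    std::istringstream in(s);
    Bm25Params p;
    in >> p.k1 >> p.k2 >> p.k3 >> p.b >> p.min_normlen;
    if (in.fail()) {
        throw std::invalid_argument("truncated or malformed parameter string");
    }
    return p;
}
```

The property worth checking in any such pair is that unserialising a serialised object yields the same parameter values.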
Xapian::weight Xapian::BM25Weight::get_sumpart (Xapian::termcount wdf, Xapian::doclength len) const [virtual]
Get a weight which is part of the sum over terms being performed.
This returns a weight for a given term and document. These weights are summed to give a total weight for the document.
wdf | The within document frequency of the term.
len | The (unnormalised) document length.
Implements Xapian::Weight.
Definition at line 118 of file bm25weight.cc.
References b, calc_termweight(), DEBUGCALL, DEBUGLINE, k1, lenpart, min_normlen, RETURN, termweight, and weight_calculated.
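As a rough illustration of what a per-term BM25 contribution looks like, the sketch below assumes the commonly used form termweight * (k1 + 1) * wdf / (k1 * ((1 - b) + b * L) + wdf), where L is the normalised document length. It is not Xapian's implementation: `bm25_sumpart` is a hypothetical function, `termweight` is assumed to be precomputed and to already include any query-side factor, and normalising by an average length passed as `avg_len` is an assumption.

```cpp
#include <algorithm>
#include <cassert>

// Per-term BM25 contribution for one document (illustrative sketch only).
//   termweight  : precomputed weight of the term (assumed given)
//   wdf         : within document frequency of the term
//   len         : unnormalised document length
//   avg_len     : average document length (hypothetical extra argument)
inline double bm25_sumpart(double termweight, unsigned wdf, double len,
                           double avg_len, double k1, double b,
                           double min_normlen) {
    assert(k1 >= 0.0 && b >= 0.0 && b <= 1.0 && avg_len > 0.0);
    if (wdf == 0) return 0.0;  // an absent term contributes nothing

    // Normalise the length, then apply the min_normlen cutoff.
    double normlen = std::max(len / avg_len, min_normlen);

    // wdf > 0 here, so the denominator is strictly positive.
    double denom = k1 * ((1.0 - b) + b * normlen) + wdf;
    return termweight * (k1 + 1.0) * wdf / denom;
}
```

With k1 = 0 the wdf cancels out and every matching document receives exactly `termweight`, which matches the description of k1 = 0 as "ignore wdf".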
Xapian::weight Xapian::BM25Weight::get_maxpart () const [virtual]
Gets the maximum value that get_sumpart() may return.
This is used in optimising searches, by having the postlist tree decay appropriately when parts of it can have limited, or no, further effect.
Implements Xapian::Weight.
Definition at line 144 of file bm25weight.cc.
References calc_termweight(), DEBUGCALL, k1, RETURN, termweight, and weight_calculated.
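One way to see why an upper bound of this kind can depend only on k1 and termweight: in the usual BM25 term formula, the fraction wdf / (k1 * ((1 - b) + b * L) + wdf) never exceeds 1 when the parameters are in range. The sketch below assumes that formula and a non-negative `termweight`; `sumpart` and `maxpart` are hypothetical free functions, not Xapian's code.

```cpp
#include <cassert>

// Term contribution as a function of wdf and the (already clamped)
// normalised document length.
inline double sumpart(double termweight, double wdf, double normlen,
                      double k1, double b) {
    assert(wdf > 0.0 && normlen >= 0.0);
    return termweight * (k1 + 1.0) * wdf /
           (k1 * ((1.0 - b) + b * normlen) + wdf);
}

// Upper bound on sumpart() over all wdf and document lengths, assuming
// termweight >= 0: the wdf fraction is at most 1, leaving (k1 + 1) * termweight.
inline double maxpart(double termweight, double k1) {
    return (k1 + 1.0) * termweight;
}
```

A matcher can compare such a bound against the weight a document still needs in order to enter the result set, and stop consulting a term once even its maximum contribution could not change the outcome.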
Xapian::weight Xapian::BM25Weight::get_sumextra (Xapian::doclength len) const [virtual]
Get an extra weight for a document to add to the sum calculated over the query terms.
This returns a weight for a given document, and is used by some weighting schemes to account for influences such as document length.
len | The (unnormalised) document length.
Implements Xapian::Weight.
Definition at line 156 of file bm25weight.cc.
References calc_termweight(), DEBUGCALL, DEBUGLINE, k2, lenpart, min_normlen, Xapian::Weight::querysize, RETURN, and weight_calculated.
Xapian::weight Xapian::BM25Weight::get_maxextra () const [virtual]
Gets the maximum value that get_sumextra() may return.
This is used in optimising searches.
Implements Xapian::Weight.
Definition at line 170 of file bm25weight.cc.
References DEBUGCALL, DEBUGLINE, k2, Xapian::Weight::querysize, and RETURN.
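The two extra-weight methods can be illustrated together. The sketch assumes the document-level term has the form k2 * n_q / (1 + L), with n_q the query size and L the normalised document length after the min_normlen cutoff; any constant factor used by the actual implementation may differ. `bm25_sumextra`, `bm25_maxextra` and the `avg_len` argument are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Document-level extra weight (illustrative sketch only).
inline double bm25_sumextra(double len, double avg_len, double querysize,
                            double k2, double min_normlen) {
    assert(avg_len > 0.0 && k2 >= 0.0);
    double normlen = std::max(len / avg_len, min_normlen);
    return k2 * querysize / (1.0 + normlen);
}

// Length-independent upper bound: 1 / (1 + L) is at most 1 for L >= 0.
inline double bm25_maxextra(double querysize, double k2) {
    return k2 * querysize;
}
```

With the default k2 = 0 both functions return 0, which corresponds to the "no compensation" case in the constructor documentation.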
bool Xapian::BM25Weight::get_sumpart_needs_doclength () const [virtual]
Returns false if the weight object doesn't need the document length.
Reimplemented from Xapian::Weight.
Definition at line 179 of file bm25weight.cc.
References b, calc_termweight(), k1, lenpart, and weight_calculated.
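Under the usual BM25 term formula, the document length enters only through the product k1 * b * L, so it drops out whenever k1 or b is zero. The sketch below shows that reasoning; it is inferred from the formula rather than taken from Xapian's source, which may test further conditions, and both function names are hypothetical.

```cpp
#include <cassert>

// Term contribution for a given wdf and normalised document length.
inline double sumpart(double termweight, double wdf, double normlen,
                      double k1, double b) {
    assert(wdf > 0.0 && normlen >= 0.0);
    return termweight * (k1 + 1.0) * wdf /
           (k1 * ((1.0 - b) + b * normlen) + wdf);
}

// The length only matters if both k1 and b are non-zero.
inline bool needs_doclength(double k1, double b) {
    return k1 != 0.0 && b != 0.0;
}
```

When this returns false, a caller can skip fetching document lengths altogether, since any value passed as `normlen` gives the same result.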
Xapian::weight Xapian::BM25Weight::termweight [mutable, private]
Definition at line 1179 of file enquire.h.
Referenced by calc_termweight(), get_maxpart(), and get_sumpart().
Xapian::doclength Xapian::BM25Weight::lenpart [mutable, private]
Definition at line 1180 of file enquire.h.
Referenced by calc_termweight(), get_sumextra(), get_sumpart(), and get_sumpart_needs_doclength().
double Xapian::BM25Weight::k1 [private]
Definition at line 1182 of file enquire.h.
Referenced by clone(), get_maxpart(), get_sumpart(), get_sumpart_needs_doclength(), and serialise().
double Xapian::BM25Weight::k2 [private]
Definition at line 1182 of file enquire.h.
Referenced by clone(), get_maxextra(), get_sumextra(), and serialise().
double Xapian::BM25Weight::k3 [private]
Definition at line 1182 of file enquire.h.
Referenced by calc_termweight(), clone(), and serialise().
double Xapian::BM25Weight::b [private]
Definition at line 1182 of file enquire.h.
Referenced by clone(), get_sumpart(), get_sumpart_needs_doclength(), and serialise().
Xapian::doclength Xapian::BM25Weight::min_normlen [private]
Definition at line 1183 of file enquire.h.
Referenced by clone(), get_sumextra(), get_sumpart(), and serialise().
bool Xapian::BM25Weight::weight_calculated [mutable, private]
Definition at line 1185 of file enquire.h.
Referenced by calc_termweight(), get_maxpart(), get_sumextra(), get_sumpart(), and get_sumpart_needs_doclength().
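The combination of mutable members, a const calc_termweight() and a weight_calculated flag suggests a compute-on-first-use cache. The sketch below shows that general pattern in isolation. `LazyWeight` is a hypothetical class, the doubling step is a placeholder for the real calculation, and the call counter exists only to make the behaviour observable.

```cpp
#include <cassert>

class LazyWeight {
    double base;                     // input to the expensive calculation
    mutable double termweight;       // cached result
    mutable bool weight_calculated;  // has the cache been filled yet?
    mutable int calc_calls;          // instrumentation for this example only

    // const, but allowed to fill the cache because the members are mutable.
    void calc_termweight() const {
        termweight = base * 2.0;     // placeholder for the real formula
        weight_calculated = true;
        ++calc_calls;
    }

  public:
    explicit LazyWeight(double base_)
        : base(base_), termweight(0.0), weight_calculated(false),
          calc_calls(0) {
        assert(base_ >= 0.0);
    }

    // Every accessor checks the flag first, so the calculation runs at most once.
    double get_maxpart() const {
        if (!weight_calculated) calc_termweight();
        return termweight;
    }

    int calls() const { return calc_calls; }
};
```

Note that a cache of this kind is not thread-safe without extra synchronisation, since const methods modify shared state.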