__init__(self, freqdist, bins=None)
(Constructor)
Creates a distribution of Witten-Bell probability estimates. This
distribution allocates uniform probability mass to as-yet-unseen events
by using the number of distinct event types observed so far. The
probability mass reserved for unseen events is equal to T / (N + T),
where T is the number of observed event types and
N is the total number of observed events. This
equates to the maximum likelihood estimate of a new event type occurring.
The remaining probability mass is discounted such that all probability
estimates sum to one, yielding:
- p = T / (Z (N + T)), if count = 0
- p = c / (N + T), otherwise
The parameters T and N are
taken from the freqdist parameter (the B() and
N() values). The normalising factor Z is
calculated using these values along with the bins
parameter.
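The estimates described here can be sketched in plain Python without NLTK. This is an illustrative sketch, not the library's implementation: the function name `witten_bell_estimates` is hypothetical, a `Counter` stands in for the `FreqDist`, and Z is assumed to be `bins - T` (the number of possible event types never observed), which is the value that makes the estimates sum to one. Returning 0.0 for unseen events when Z is zero is a choice made for this sketch only.

```python
from collections import Counter


def witten_bell_estimates(counts, bins=None):
    """Return (prob_seen, p_unseen) for a mapping of event -> count.

    Hypothetical helper: `counts` stands in for a FreqDist, and
    `bins` is the number of possible event types.
    """
    T = sum(1 for c in counts.values() if c > 0)  # observed event types, like B()
    N = sum(counts.values())                      # total observed events, like N()
    if bins is None:
        bins = T  # default: assume no event types beyond those observed
    if bins < T:
        raise ValueError("bins must be at least the number of observed types")
    if N == 0:
        raise ValueError("this sketch needs at least one observation")

    Z = bins - T  # assumption: Z counts the possible types never observed

    # Seen events: p = c / (N + T)
    prob_seen = {event: c / (N + T) for event, c in counts.items() if c > 0}
    # Unseen events: p = T / (Z (N + T)); 0.0 when nothing is unseen (sketch choice)
    p_unseen = T / (Z * (N + T)) if Z > 0 else 0.0
    return prob_seen, p_unseen


counts = Counter("abracadabra")  # a:5, b:2, r:2, c:1, d:1, so T = 5 and N = 11
prob_seen, p_unseen = witten_bell_estimates(counts, bins=26)

# Seen mass plus the mass of the 21 unseen letters should come to one.
total = sum(prob_seen.values()) + (26 - 5) * p_unseen
```

With `bins=26`, the seen letters share 11/16 of the probability mass and the 21 unseen letters share the remaining 5/16 equally. When `bins` is left as None, no unseen types exist under this sketch's assumption, so the seen estimates alone sum to N / (N + T) rather than one.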
- Parameters:
freqdist (FreqDist) - The frequency counts upon which to base the estimation.
bins (int) - The number of possible event types. This must be at least as
large as the number of bins in the freqdist. If
None, then it's assumed to be equal to that of the
freqdist.
- Overrides:
ProbDistI.__init__