Package nltk :: Module probability

Module probability

Classes
  FreqDist
A frequency distribution for the outcomes of an experiment.
  ProbDistI
A probability distribution for the outcomes of an experiment.
  UniformProbDist
A probability distribution that assigns equal probability to each sample in a given set, and zero probability to all other samples.
  DictionaryProbDist
A probability distribution whose probabilities are directly specified by a given dictionary.
  MLEProbDist
The maximum likelihood estimate for the probability distribution of the experiment used to generate a frequency distribution.
  LidstoneProbDist
The Lidstone estimate for the probability distribution of the experiment used to generate a frequency distribution.
  LaplaceProbDist
The Laplace estimate for the probability distribution of the experiment used to generate a frequency distribution.
  ELEProbDist
The expected likelihood estimate for the probability distribution of the experiment used to generate a frequency distribution.
  HeldoutProbDist
The heldout estimate for the probability distribution of the experiment used to generate two frequency distributions.
  CrossValidationProbDist
The cross-validation estimate for the probability distribution of the experiment used to generate a set of frequency distributions.
  WittenBellProbDist
The Witten-Bell estimate of a probability distribution.
  GoodTuringProbDist
The Good-Turing estimate of a probability distribution.
  MutableProbDist
A mutable probability distribution whose probabilities may be easily modified.
  ConditionalFreqDist
A collection of frequency distributions for a single experiment run under different conditions.
  ConditionalProbDistI
A collection of probability distributions for a single experiment run under different conditions.
  ConditionalProbDist
A conditional probability distribution modelling the experiments that were used to generate a conditional frequency distribution.
  DictionaryConditionalProbDist
An alternative ConditionalProbDist that simply wraps a dictionary of ProbDists rather than creating these from FreqDists.
  ProbabilisticMixIn
A mix-in class to associate probabilities with other classes (trees, rules, etc.).
  ImmutableProbabilisticMixIn
Functions
  log_likelihood(test_pdist, actual_pdist)
  entropy(pdist)
  add_logs(logx, logy)
Given two numbers logx=log(x) and logy=log(y), return log(x+y).
  sum_logs(logs)
  _create_rand_fdist(numsamples, numoutcomes)
Create a new frequency distribution, with random samples.
  _create_sum_pdist(numsamples)
Return the true probability distribution for the experiment _create_rand_fdist(numsamples, x).
  demo(numsamples=6, numoutcomes=500)
A demonstration of frequency distributions and probability distributions.
Variables
  _NINF = -1e+300
Classes for representing and processing probabilistic information.
  _ADD_LOGS_MAX_DIFF = -99.6578428466
Function Details

add_logs(logx, logy)

Given two numbers logx=log(x) and logy=log(y), return log(x+y), where log is the base-2 logarithm. Conceptually, this is the same as returning log(2**(logx)+2**(logy)), but the actual implementation avoids the overflow errors that could result from direct computation.
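The overflow-avoiding idea can be sketched as follows. This is a minimal illustration rather than the module's source: the name add_logs_sketch is hypothetical, and base-2 logarithms are assumed because the description is phrased in terms of 2**(logx).

```python
import math

def add_logs_sketch(logx, logy):
    """Return log2(x + y), given logx = log2(x) and logy = log2(y).

    Hypothetical helper for illustration; not NLTK's implementation.
    """
    # Factor out the larger term so that the exponent is never positive:
    #   log2(x + y) = hi + log2(1 + 2**(lo - hi)),  with lo - hi <= 0.
    # 2**(lo - hi) therefore lies in (0, 1] and cannot overflow.
    hi, lo = max(logx, logy), min(logx, logy)
    return hi + math.log2(1.0 + 2.0 ** (lo - hi))
```

Direct computation of 2.0 ** 2000 raises OverflowError in Python, whereas add_logs_sketch(2000.0, 2000.0) only ever evaluates 2.0 ** 0.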

_create_rand_fdist(numsamples, numoutcomes)

Create a new frequency distribution, with random samples. The samples are numbers from 1 to numsamples, and are generated by summing two numbers, each of which has a uniform distribution.
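A sketch of this sampling scheme, under stated assumptions: collections.Counter stands in for FreqDist, the function name and the rng parameter are hypothetical, and the two ranges below are just one choice that keeps every sum between 1 and numsamples.

```python
import random
from collections import Counter

def create_rand_counts(numsamples, numoutcomes, rng=None):
    """Count numoutcomes random samples, each drawn from 1..numsamples.

    Hypothetical illustration; Counter is a stand-in for FreqDist.
    """
    rng = rng or random.Random()
    counts = Counter()
    for _ in range(numoutcomes):
        # Two independent uniform draws. The ranges are an assumption,
        # chosen so that the smallest sum is 1 and the largest is
        # ceil(numsamples / 2) + floor(numsamples / 2) = numsamples.
        first = rng.randint(1, (1 + numsamples) // 2)
        second = rng.randint(0, numsamples // 2)
        counts[first + second] += 1
    return counts
```

Because each sample is a sum of two uniform draws, middle values are more likely than the extremes, which gives the estimators a non-uniform distribution to recover.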

demo(numsamples=6, numoutcomes=500)

A demonstration of frequency distributions and probability distributions. This demonstration creates three frequency distributions and uses them to sample a random process with numsamples samples. Each frequency distribution is sampled numoutcomes times. These three frequency distributions are then used to build six probability distributions. Finally, the probability estimates of these distributions are compared to the actual probability of each sample.

Parameters:
  • numsamples (int) - The number of samples to use in each demo frequency distribution.
  • numoutcomes (int) - The total number of outcomes for each demo frequency distribution. These outcomes are divided into numsamples bins.
Returns: None

Variables Details

_NINF

Classes for representing and processing probabilistic information.

The FreqDist class is used to encode frequency distributions, which count the number of times that each outcome of an experiment occurs.
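The counting behaviour can be illustrated with the standard library. In this sketch collections.Counter is only a stand-in for FreqDist, and the sample data is made up for the example.

```python
from collections import Counter

# Each element is one outcome of the "experiment" (here, observing a word).
outcomes = ["the", "cat", "sat", "on", "the", "mat"]

# Count how many times each outcome occurred.
counts = Counter(outcomes)

# Total number of outcomes recorded.
total = sum(counts.values())
```

Here counts["the"] is 2, an unseen outcome such as counts["dog"] is 0, and total is 6.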

The ProbDistI class defines a standard interface for probability distributions, which encode the probability of each outcome for an experiment. There are two types of probability distribution:

  • derived probability distributions are created from frequency distributions. They attempt to model the probability distribution that generated the frequency distribution.
  • analytic probability distributions are created directly from parameters (such as variance).
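The two kinds of distribution can be contrasted in a short sketch. The helper names are hypothetical and plain dictionaries stand in for ProbDistI implementations: the derived case uses a maximum likelihood estimate from counts, and the analytic case is a uniform distribution built directly from its parameter, the set of samples.

```python
from collections import Counter

def mle_probs(counts):
    """Derived: estimate each probability as count / total."""
    total = sum(counts.values())
    return {sample: c / total for sample, c in counts.items()}

def uniform_probs(samples):
    """Analytic: equal probability for each sample in the given set."""
    samples = set(samples)
    return {s: 1.0 / len(samples) for s in samples}

observed = Counter("aabac")        # counts: a=3, b=1, c=1
derived = mle_probs(observed)      # estimated from the observed counts
analytic = uniform_probs("abc")    # no observations involved
```

The derived distribution changes whenever the underlying counts change, while the analytic one depends only on the parameters it was built from.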

The ConditionalFreqDist class and ConditionalProbDistI interface are used to encode conditional distributions. Conditional probability distributions can be derived or analytic, but currently the only implementation of the ConditionalProbDistI interface is ConditionalProbDist, a derived distribution.
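A conditional distribution keeps one distribution per condition. The sketch below assumes a nested mapping from condition to counts as a stand-in for ConditionalFreqDist, with made-up (condition, outcome) pairs, and derives per-condition probabilities by relative frequency.

```python
from collections import Counter, defaultdict

# Hypothetical (condition, outcome) observations.
pairs = [
    ("news", "the"), ("news", "market"),
    ("fiction", "the"), ("fiction", "the"), ("fiction", "dragon"),
]

# One frequency count per condition.
cond_counts = defaultdict(Counter)
for condition, outcome in pairs:
    cond_counts[condition][outcome] += 1

# Derived conditional probabilities: normalise within each condition.
cond_probs = {
    condition: {s: c / sum(counts.values()) for s, c in counts.items()}
    for condition, counts in cond_counts.items()
}
```

Each inner dictionary sums to 1 on its own, so cond_probs["fiction"]["the"] is read as the probability of "the" given the condition "fiction".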

Value:
-1e+300