com.rapidminer.operator.learner.igss
Class IteratingGSS

java.lang.Object
  extended by com.rapidminer.tools.AbstractObservable<Operator>
      extended by com.rapidminer.operator.Operator
          extended by com.rapidminer.operator.learner.AbstractLearner
              extended by com.rapidminer.operator.learner.igss.IteratingGSS
All Implemented Interfaces:
ConfigurationListener, PreviewListener, ResourceConsumer, CapabilityProvider, Learner, ParameterHandler, LoggingHandler, Observable<Operator>

public class IteratingGSS
extends AbstractLearner

This operator implements the IteratingGSS algorithms presented in the diploma thesis 'Effiziente Entdeckung unabhängiger Subgruppen in grossen Datenbanken' ('Efficient Discovery of Independent Subgroups in Large Databases') at the Department of Computer Science, University of Dortmund.

Author:
Dirk Dach

Field Summary
static java.lang.String[] CRITERION_TYPES
           
static int FIRST_TYPE_INDEX
           
static int LAST_TYPE_INDEX
           
 int MIN_MODEL_NUMBER
          minimal model number for example_criterion
static java.lang.String PARAMETER_DELTA
          The parameter name for "desired confidence"
static java.lang.String PARAMETER_EPSILON
          The parameter name for "approximation parameter"
static java.lang.String PARAMETER_EXAMPLE_FACTOR
          The parameter name for "used by example criterion to determine usefulness of a hypothesis"
static java.lang.String PARAMETER_FORCE_ITERATIONS
          The parameter name for "make all iterations even if termination criterion is met"
static java.lang.String PARAMETER_GENERATE_ALL_HYPOTHESIS
          The parameter name for "generate h->Y+/Y- or h->Y+ only."
static java.lang.String PARAMETER_ITERATIONS
          The parameter name for "the number of iterations"
static java.lang.String PARAMETER_LARGE
          The parameter name for "the number of examples a hypothesis must cover before normal approximation is used"
static java.lang.String PARAMETER_MAX_COMPLEXITY
          The parameter name for "the maximum complexity of hypothesis"
static java.lang.String PARAMETER_MIN_COMPLEXITY
          The parameter name for "the minimum complexity of hypothesis"
static java.lang.String PARAMETER_MIN_UTILITY_PRUNING
          The parameter name for "minimum utility used for pruning"
static java.lang.String PARAMETER_MIN_UTILITY_USEFUL
          The parameter name for "minimum utility for the usefulness of a rule"
static java.lang.String PARAMETER_REJECTION_SAMPLING
          The parameter name for "use rejection sampling instead of weighted examples"
static java.lang.String PARAMETER_RESET_WEIGHTS
          The parameter name for "Set weights back to 1 when complexity is increased."
static java.lang.String PARAMETER_STEPSIZE
          The parameter name for "the number of examples drawn before the next hypothesis update"
static java.lang.String PARAMETER_USE_BINOMIAL
          The parameter name for "Switch to binomial utility function before increasing complexity"
static java.lang.String PARAMETER_USE_KBS
          The parameter name for "use kbs to reweight examples after each iteration"
static java.lang.String PARAMETER_USEFUL_CRITERION
          The parameter name for "criterion to decide if the complexity is increased"
static java.lang.String PARAMETER_UTILITY_FUNCTION
          The parameter name for "the utility function to be used"
static int TYPE_BEST_UTILITY
           
static int TYPE_EXAMPLE
           
static int TYPE_UTILITY
           
static int TYPE_WORST_UTILITY
           
 
Fields inherited from interface com.rapidminer.operator.learner.CapabilityProvider
PROPERTY_RAPIDMINER_GENERAL_CAPABILITIES_WARN
 
Constructor Summary
IteratingGSS(OperatorDescription description)
          Must pass the given object to the superclass.
 
Method Summary
 java.util.LinkedList<Hypothesis> generate(java.util.LinkedList<Hypothesis> oldHypothesis)
          Generates all successors of the hypotheses in the given list.
 java.lang.Class<? extends PredictionModel> getModelClass()
          This method may be overridden by subclasses in order to specify exactly which model class they use.
 java.util.List<ParameterType> getParameterTypes()
          Returns a list of ParameterTypes describing the parameters of this operator.
 java.util.LinkedList<Result> gss(ExampleSet exampleSet, java.util.LinkedList<Hypothesis> hypothesisList, double delta, double epsilon)
          Returns the n best hypotheses with maximum error epsilon at confidence level 1-delta.
 boolean isUseful(Result current, java.util.LinkedList<Result> otherResults, int criterion, ExampleSet exampleSet, int min_model_number)
          Tests whether the model is useful according to the given criterion.
 Model learn(ExampleSet eSet)
          Trains a model.
static double log2(double arg)
          Returns the logarithm to base 2 of the given argument.
 java.util.LinkedList<Hypothesis> prune(java.util.LinkedList<Hypothesis> hypoList, double minUtility, double totalWeight, double totalPositiveWeight, double delta_p)
          Prunes the given list of hypotheses.
 ContingencyMatrix reweight(ExampleSet exampleSet, Model model, boolean normalize)
          Reweights the examples according to knowledge based sampling.
 boolean supportsCapability(OperatorCapability lc)
          Checks for Learner capabilities.
 
Methods inherited from class com.rapidminer.operator.learner.AbstractLearner
canCalculateWeights, canEstimatePerformance, doWork, doWork, getEstimatedPerformance, getExampleSetInputPort, getOptimizationPerformance, getWeightCalculationError, getWeights, getWeights, onlyWarnForNonSufficientCapabilities, shouldAutoConnect, shouldCalculateWeights, shouldDeliverOptimizationPerformance, shouldEstimatePerformance
 
Methods inherited from class com.rapidminer.operator.Operator
acceptsInput, addError, addError, addValue, addWarning, apply, apply, assumePreconditionsSatisfied, checkAll, checkAllExcludingMetaData, checkDeprecations, checkForStop, checkIO, checkProperties, clear, clearErrorList, cloneOperator, collectErrors, createExperimentTree, createExperimentTree, createFromXML, createFromXML, createFromXML, createMarkedExperimentTree, createMarkedProcessTree, createProcessTree, createProcessTree, disconnectPorts, execute, fireUpdate, freeMemory, getAddOnlyAdditionalOutput, getApplyCount, getCompatibilityLevel, getDeliveredOutputClasses, getDeprecationInfo, getDesiredInputClasses, getDOMRepresentation, getEncoding, getErrorList, getExecutionUnit, getExperiment, getIncompatibleVersionChanges, getInput, getInput, getInput, getInputClasses, getInputDescription, getInputPorts, getIODescription, getLog, getLogger, getName, getNumberOfBreakpoints, getOperatorClassName, getOperatorDescription, getOutputClasses, getOutputPorts, getParameter, getParameterAsBoolean, getParameterAsChar, getParameterAsColor, getParameterAsDouble, getParameterAsFile, getParameterAsFile, getParameterAsInputStream, getParameterAsInt, getParameterAsMatrix, getParameterAsRepositoryLocation, getParameterAsString, getParameterHandler, getParameterList, getParameters, getParameterTupel, getParameterType, getParent, getPortOwner, getProcess, getResourceConsumptionEstimator, getRoot, getStartTime, getTransformer, getUserDescription, getValue, getValues, getXML, getXML, getXML, hasBreakpoint, hasBreakpoint, hasInput, inApplyLoop, isDebugMode, isDirty, isEnabled, isExpanded, isParallel, isParameterSet, isRunning, log, log, logError, logNote, logWarning, lookupOperator, makeDirty, makeDirtyOnUpdate, notifyRenaming, performAdditionalChecks, preAutoWire, processFinished, processStarts, producesOutput, propagateDirtyness, register, registerOperator, remove, removeAndKeepConnections, rename, resume, setBreakpoint, setCompatibilityLevel, setEnabled, setEnclosingProcess, setExpanded, 
setInput, setListParameter, setPairParameter, setParameter, setParameters, setUserDescription, shouldAutoConnect, shouldStopStandaloneExecution, toString, transformMetaData, unregisterOperator, updateExecutionOrder, walk, writeXML, writeXML
 
Methods inherited from class com.rapidminer.tools.AbstractObservable
addObserver, addObserverAsFirst, fireUpdate, removeObserver
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface com.rapidminer.operator.learner.Learner
getName
 

Field Detail

PARAMETER_EPSILON

public static final java.lang.String PARAMETER_EPSILON
The parameter name for "approximation parameter"

See Also:
Constant Field Values

PARAMETER_DELTA

public static final java.lang.String PARAMETER_DELTA
The parameter name for "desired confidence"

See Also:
Constant Field Values

PARAMETER_MIN_UTILITY_PRUNING

public static final java.lang.String PARAMETER_MIN_UTILITY_PRUNING
The parameter name for "minimum utility used for pruning"

See Also:
Constant Field Values

PARAMETER_MIN_UTILITY_USEFUL

public static final java.lang.String PARAMETER_MIN_UTILITY_USEFUL
The parameter name for "minimum utility for the usefulness of a rule"

See Also:
Constant Field Values

PARAMETER_STEPSIZE

public static final java.lang.String PARAMETER_STEPSIZE
The parameter name for "the number of examples drawn before the next hypothesis update"

See Also:
Constant Field Values

PARAMETER_LARGE

public static final java.lang.String PARAMETER_LARGE
The parameter name for "the number of examples a hypothesis must cover before normal approximation is used"

See Also:
Constant Field Values

PARAMETER_MAX_COMPLEXITY

public static final java.lang.String PARAMETER_MAX_COMPLEXITY
The parameter name for "the maximum complexity of hypothesis"

See Also:
Constant Field Values

PARAMETER_MIN_COMPLEXITY

public static final java.lang.String PARAMETER_MIN_COMPLEXITY
The parameter name for "the minimum complexity of hypothesis"

See Also:
Constant Field Values

PARAMETER_ITERATIONS

public static final java.lang.String PARAMETER_ITERATIONS
The parameter name for "the number of iterations"

See Also:
Constant Field Values

PARAMETER_USE_BINOMIAL

public static final java.lang.String PARAMETER_USE_BINOMIAL
The parameter name for "Switch to binomial utility function before increasing complexity"

See Also:
Constant Field Values

PARAMETER_UTILITY_FUNCTION

public static final java.lang.String PARAMETER_UTILITY_FUNCTION
The parameter name for "the utility function to be used"

See Also:
Constant Field Values

PARAMETER_USE_KBS

public static final java.lang.String PARAMETER_USE_KBS
The parameter name for "use kbs to reweight examples after each iteration"

See Also:
Constant Field Values

PARAMETER_REJECTION_SAMPLING

public static final java.lang.String PARAMETER_REJECTION_SAMPLING
The parameter name for "use rejection sampling instead of weighted examples"

See Also:
Constant Field Values

PARAMETER_USEFUL_CRITERION

public static final java.lang.String PARAMETER_USEFUL_CRITERION
The parameter name for "criterion to decide if the complexity is increased"

See Also:
Constant Field Values

PARAMETER_EXAMPLE_FACTOR

public static final java.lang.String PARAMETER_EXAMPLE_FACTOR
The parameter name for "used by example criterion to determine usefulness of a hypothesis"

See Also:
Constant Field Values

PARAMETER_FORCE_ITERATIONS

public static final java.lang.String PARAMETER_FORCE_ITERATIONS
The parameter name for "make all iterations even if termination criterion is met"

See Also:
Constant Field Values

PARAMETER_GENERATE_ALL_HYPOTHESIS

public static final java.lang.String PARAMETER_GENERATE_ALL_HYPOTHESIS
The parameter name for "generate h->Y+/Y- or h->Y+ only."

See Also:
Constant Field Values

PARAMETER_RESET_WEIGHTS

public static final java.lang.String PARAMETER_RESET_WEIGHTS
The parameter name for "Set weights back to 1 when complexity is increased."

See Also:
Constant Field Values

CRITERION_TYPES

public static final java.lang.String[] CRITERION_TYPES

FIRST_TYPE_INDEX

public static final int FIRST_TYPE_INDEX
See Also:
Constant Field Values

TYPE_WORST_UTILITY

public static final int TYPE_WORST_UTILITY
See Also:
Constant Field Values

TYPE_UTILITY

public static final int TYPE_UTILITY
See Also:
Constant Field Values

TYPE_BEST_UTILITY

public static final int TYPE_BEST_UTILITY
See Also:
Constant Field Values

TYPE_EXAMPLE

public static final int TYPE_EXAMPLE
See Also:
Constant Field Values

LAST_TYPE_INDEX

public static final int LAST_TYPE_INDEX
See Also:
Constant Field Values

MIN_MODEL_NUMBER

public int MIN_MODEL_NUMBER
minimal model number for example_criterion

Constructor Detail

IteratingGSS

public IteratingGSS(OperatorDescription description)
Must pass the given object to the superclass.

Method Detail

gss

public java.util.LinkedList<Result> gss(ExampleSet exampleSet,
                                        java.util.LinkedList<Hypothesis> hypothesisList,
                                        double delta,
                                        double epsilon)
                                 throws OperatorException
Returns the n best hypotheses with maximum error epsilon at confidence level 1-delta.
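
To make the roles of epsilon and delta concrete, the following is a minimal sketch of a generic (epsilon, delta) guarantee based on the two-sided Hoeffding inequality for a quantity bounded in [0,1]. It is an illustration only: the class and method names are hypothetical, and the bounds actually used inside gss are not stated on this page and may differ.

```java
// Illustrative sketch only; not the bound used by IteratingGSS.gss.
// HoeffdingSketch, halfWidth and requiredSamples are hypothetical names.
final class HoeffdingSketch {

    /**
     * Half-width of a two-sided Hoeffding confidence interval for the mean
     * of a [0,1]-bounded quantity estimated from m samples. The interval
     * holds with probability at least 1 - delta.
     */
    static double halfWidth(long m, double delta) {
        return Math.sqrt(Math.log(2.0 / delta) / (2.0 * m));
    }

    /** Smallest sample size for which the half-width is at most epsilon. */
    static long requiredSamples(double epsilon, double delta) {
        return (long) Math.ceil(Math.log(2.0 / delta) / (2.0 * epsilon * epsilon));
    }

    public static void main(String[] args) {
        double epsilon = 0.1;
        double delta = 0.05;
        long m = requiredSamples(epsilon, delta);
        System.out.println("samples needed: " + m);
        System.out.println("half-width at that size: " + halfWidth(m, delta));
    }
}
```

A smaller epsilon or a smaller delta both increase the number of examples that must be drawn before such a guarantee holds.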

Throws:
OperatorException

reweight

public ContingencyMatrix reweight(ExampleSet exampleSet,
                                  Model model,
                                  boolean normalize)
                           throws OperatorException
Reweights the examples according to knowledge based sampling. Normalizes weights to [0,1] if the parameter normalize is set to true.
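
As a rough illustration of the normalize flag, the sketch below rescales an array of non-negative weights into [0,1] by dividing by the largest weight. This scaling rule is an assumption made for the example; the page does not say how reweight normalizes, and the class and method names are hypothetical.

```java
// Illustrative sketch only; the scaling rule (divide by the maximum weight)
// is an assumption, not taken from the RapidMiner source.
final class WeightNormalizationSketch {

    /** Returns a copy of the non-negative weights rescaled into [0,1]. */
    static double[] normalize(double[] weights) {
        double max = 0.0;
        for (double w : weights) {
            max = Math.max(max, w);
        }
        double[] result = weights.clone();
        if (max > 0.0) {
            for (int i = 0; i < result.length; i++) {
                result[i] = weights[i] / max;
            }
        }
        return result; // all-zero or empty input is returned unchanged
    }

    public static void main(String[] args) {
        double[] normalized = normalize(new double[] {0.5, 2.0, 1.0});
        System.out.println(java.util.Arrays.toString(normalized));
    }
}
```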

Throws:
OperatorException

learn

public Model learn(ExampleSet eSet)
            throws OperatorException
Description copied from interface: Learner
Trains a model. This method should be called by apply() and is implemented by subclasses.

Throws:
OperatorException

isUseful

public boolean isUseful(Result current,
                        java.util.LinkedList<Result> otherResults,
                        int criterion,
                        ExampleSet exampleSet,
                        int min_model_number)
Tests whether the model is useful according to the given criterion.


prune

public java.util.LinkedList<Hypothesis> prune(java.util.LinkedList<Hypothesis> hypoList,
                                              double minUtility,
                                              double totalWeight,
                                              double totalPositiveWeight,
                                              double delta_p)
Prunes the given list of hypotheses. Every hypothesis whose upper utility bound is less than the parameter minUtility is pruned.
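
The pruning rule can be sketched as a simple filter. The example below assumes each candidate already carries a precomputed upper utility bound; how prune derives that bound from totalWeight, totalPositiveWeight and delta_p is not described here, so that step is left out. Candidate and PruneSketch are hypothetical stand-ins, not RapidMiner classes.

```java
import java.util.LinkedList;

// Illustrative sketch only; Candidate is a hypothetical stand-in for Hypothesis.
final class PruneSketch {

    /** A candidate with an assumed, precomputed upper bound on its utility. */
    static final class Candidate {
        final String name;
        final double upperUtilityBound;

        Candidate(String name, double upperUtilityBound) {
            this.name = name;
            this.upperUtilityBound = upperUtilityBound;
        }
    }

    /** Keeps only candidates whose upper utility bound is at least minUtility. */
    static LinkedList<Candidate> prune(LinkedList<Candidate> candidates, double minUtility) {
        LinkedList<Candidate> kept = new LinkedList<>();
        for (Candidate c : candidates) {
            if (c.upperUtilityBound >= minUtility) {
                kept.add(c);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        LinkedList<Candidate> candidates = new LinkedList<>();
        candidates.add(new Candidate("a", 0.30));
        candidates.add(new Candidate("b", 0.05));
        candidates.add(new Candidate("c", 0.10));
        for (Candidate c : prune(candidates, 0.10)) {
            System.out.println(c.name);
        }
    }
}
```

Because the comparison is strict, a candidate whose bound equals minUtility survives pruning in this sketch.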


generate

public java.util.LinkedList<Hypothesis> generate(java.util.LinkedList<Hypothesis> oldHypothesis)
Generates all successors of the hypotheses in the given list.


log2

public static double log2(double arg)
Returns the logarithm to base 2 of the given argument.
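
A base-2 logarithm is commonly computed with the change-of-base identity log2(x) = ln(x) / ln(2). The sketch below shows that approach using java.lang.Math; it is an assumption about how such a helper might look, not a copy of this method's implementation, and Log2Sketch is a hypothetical class name.

```java
// Illustrative sketch only; not the RapidMiner source.
final class Log2Sketch {

    /** Base-2 logarithm via the change-of-base identity. */
    static double log2(double arg) {
        return Math.log(arg) / Math.log(2.0);
    }

    public static void main(String[] args) {
        System.out.println(log2(8.0));
    }
}
```

Results are subject to floating-point rounding, so comparisons against exact values should use a small tolerance.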


getModelClass

public java.lang.Class<? extends PredictionModel> getModelClass()
Description copied from class: AbstractLearner
This method may be overridden by subclasses in order to specify exactly which model class they use. This ensures the proper postprocessing of some models, such as KernelModels (SupportVectorCounter) or TreeModels (rule generation).

Overrides:
getModelClass in class AbstractLearner

supportsCapability

public boolean supportsCapability(OperatorCapability lc)
Description copied from interface: CapabilityProvider
Checks for Learner capabilities. Should return true if the given capability is supported.


getParameterTypes

public java.util.List<ParameterType> getParameterTypes()
Description copied from class: Operator
Returns a list of ParameterTypes describing the parameters of this operator. The default implementation returns an empty list if no input objects can be retained, and otherwise returns special parameters for those input objects that can be prevented from being consumed. ATTENTION! This method creates new ParameterTypes. To access already existing parameter types, use getParameters().getParameterTypes().

Specified by:
getParameterTypes in interface ParameterHandler
Overrides:
getParameterTypes in class Operator


Copyright © 2001-2009 by Rapid-I