public class GroupByNodeModel extends NodeModel
NodeModel implementation of the group by node which uses the GroupByTable class implementations to create the resulting table.
Modifier and Type | Field and Description |
---|---|
protected static String |
CFG_COLUMN_NAME_POLICY
Configuration key for the aggregation column name policy.
|
(package private) static String |
CFG_DATA_TYPE_AGGREGATORS
Configuration key for the data type based aggregation methods.
|
protected static String |
CFG_ENABLE_HILITE
Configuration key for the enable hilite option.
|
protected static String |
CFG_GROUP_BY_COLUMNS
Configuration key of the selected group by columns.
|
protected static String |
CFG_IN_MEMORY
Configuration key for the in memory option.
|
protected static String |
CFG_MAX_UNIQUE_VALUES
Configuration key for the maximum number of unique non-numerical values.
|
(package private) static String |
CFG_PATTERN_AGGREGATORS
Configuration key for the pattern based aggregation methods.
|
protected static String |
CFG_RETAIN_ORDER
Configuration key for the retain order option.
|
protected static String |
CFG_SORT_IN_MEMORY
Deprecated.
|
protected static String |
CFG_VALUE_DELIMITER
Configuration key for the value delimiter option.
|
Constructor and Description |
---|
GroupByNodeModel()
Creates a new group by model with one in- and one out-port.
|
GroupByNodeModel(int ins,
int outs)
Creates a new group by model.
|
Modifier and Type | Method and Description |
---|---|
static void |
checkDuplicateAggregators(ColumnNamePolicy namePolicy,
List<ColumnAggregator> aggregators) |
protected static List<ColumnAggregator> |
compGetColumnMethods(DataTableSpec spec,
List<String> excludeCols,
ConfigRO config)
Compatibility method for versions prior to KNIME 2.0.
|
protected static ColumnNamePolicy |
compGetColumnNamePolicy(NodeSettingsRO settings)
Compatibility method for versions prior to KNIME 2.0.
|
protected DataTableSpec[] |
configure(PortObjectSpec[] inSpecs)
Configure method for general port types.
|
protected GlobalSettings |
createGlobalSettings(ExecutionContext exec,
BufferedDataTable table,
List<String> groupByCols,
int maxUniqueVals)
Creates the GlobalSettings object that is passed to all AggregationMethods. |
protected DataTableSpec |
createGroupBySpec(DataTableSpec origSpec,
List<String> groupByCols)
Generate table spec based on the input spec and the selected columns
for grouping.
|
protected GroupByTable |
createGroupByTable(ExecutionContext exec,
BufferedDataTable table,
List<String> groupByCols)
Create group-by table.
|
protected GroupByTable |
createGroupByTable(ExecutionContext exec,
BufferedDataTable table,
List<String> groupByCols,
boolean inMemory,
boolean sortInMemory,
boolean retainOrder,
List<ColumnAggregator> aggregators)
Deprecated.
sortInMemory is no longer required
|
protected GroupByTable |
createGroupByTable(ExecutionContext exec,
BufferedDataTable table,
List<String> groupByCols,
boolean inMemory,
boolean retainOrder,
List<ColumnAggregator> aggregators)
Create group-by table.
|
(package private) static SettingsModelInteger |
createVersionModel() |
protected PortObject[] |
execute(PortObject[] inData,
ExecutionContext exec)
Execute method for general port types.
|
static List<ColumnAggregator> |
getAggregators(DataTableSpec inputSpec,
Collection<String> groupColumns,
List<ColumnAggregator> columnAggregators,
Collection<PatternAggregator> patternAggregators,
Collection<DataTypeAggregator> dataTypeAggregators,
List<ColumnAggregator> invalidColAggrs)
Creates a List with all ColumnAggregators to use based on the given input settings. |
protected List<ColumnAggregator> |
getColumnAggregators() |
protected ColumnNamePolicy |
getColumnNamePolicy() |
protected String |
getDefaultValueDelimiter() |
protected static Collection<String> |
getExcludeList(DataTableSpec origSpec,
List<String> columns)
Determines the list of columns not present in the original spec.
|
protected List<String> |
getGroupByColumns()
Returns list of columns selected for group-by operation.
|
protected HiLiteHandler |
getOutHiLiteHandler(int outIndex)
Returns the HiLiteHandler for the given output index. |
protected void |
inMemoryChanged()
Deprecated.
obsolete to be notified when consistent settings are
loaded into the model (since 2.6)
|
protected boolean |
isProcessInMemory() |
protected boolean |
isRetainOrder() |
protected boolean |
isSortInMemory()
Deprecated.
sort in memory is no longer required
|
protected void |
loadInternals(File nodeInternDir,
ExecutionMonitor exec)
Load internals into the derived NodeModel. |
protected void |
loadValidatedSettingsFrom(NodeSettingsRO settings)
Sets new settings from the passed object in the model.
|
protected void |
reset()
Override this function in the derived model and reset your NodeModel. |
protected void |
saveInternals(File nodeInternDir,
ExecutionMonitor exec)
Save internals of the derived NodeModel. |
protected void |
saveSettingsTo(NodeSettingsWO settings)
Adds to the given NodeSettings the model specific settings. |
protected void |
setHiliteMapping(DefaultHiLiteMapper mapper)
Applies a new mapping to the hilite translator.
|
protected void |
setInHiLiteHandler(int inIndex,
HiLiteHandler hiLiteHdl)
This implementation is empty.
|
protected void |
validateSettings(NodeSettingsRO settings)
Validates the settings in the passed NodeSettings object. |
Methods inherited from class NodeModel:
addWarningListener, computeFinalOutputSpecs, configure, continueLoop, createInitialStreamableOperatorInternals, createMergeOperator, createStreamableOperator, execute, finishStreamableExecution, getAvailableFlowVariables, getAvailableInputFlowVariables, getCredentialsProvider, getInHiLiteHandler, getInPortType, getInputPortRoles, getInteractiveNodeView, getLogger, getLoopEndNode, getLoopStartNode, getNrInPorts, getNrOutPorts, getOutPortType, getOutputPortRoles, getWarningMessage, iterate, notifyViews, notifyWarningListeners, onDispose, peekFlowVariableDouble, peekFlowVariableInt, peekFlowVariableString, pushFlowVariableDouble, pushFlowVariableInt, pushFlowVariableString, removeWarningListener, resetAndConfigureLoopBody, setWarningMessage, stateChanged
protected static final String CFG_GROUP_BY_COLUMNS
protected static final String CFG_MAX_UNIQUE_VALUES
protected static final String CFG_ENABLE_HILITE
@Deprecated protected static final String CFG_SORT_IN_MEMORY
protected static final String CFG_RETAIN_ORDER
protected static final String CFG_IN_MEMORY
protected static final String CFG_COLUMN_NAME_POLICY
protected static final String CFG_VALUE_DELIMITER
static final String CFG_DATA_TYPE_AGGREGATORS
static final String CFG_PATTERN_AGGREGATORS
public GroupByNodeModel()
public GroupByNodeModel(int ins, int outs)
Parameters:
ins - number of data input ports
outs - number of data output ports
static SettingsModelInteger createVersionModel()
@Deprecated protected void inMemoryChanged()
protected void loadInternals(File nodeInternDir, ExecutionMonitor exec) throws IOException
Load internals into the derived NodeModel. This method is only called if the Node was executed. Read all your internal structures from the given file directory to create your internal data structure, which is necessary to provide all node functionalities after the workflow is loaded, e.g. view content and/or hilite mapping.
loadInternals in class NodeModel
Parameters:
nodeInternDir - The directory to read from.
exec - Used to report progress and to cancel the load process.
Throws:
IOException - If an error occurs during reading from this dir.
See Also:
NodeModel.saveInternals(File,ExecutionMonitor)
protected void saveInternals(File nodeInternDir, ExecutionMonitor exec) throws IOException
Save internals of the derived NodeModel. This method is only called if the Node is executed. Write all your internal structures into the given file directory which are necessary to recreate this model when the workflow is loaded, e.g. view content and/or hilite mapping.
saveInternals in class NodeModel
Parameters:
nodeInternDir - The directory to write into.
exec - Used to report progress and to cancel the save process.
Throws:
IOException - If an error occurs during writing to this dir.
See Also:
NodeModel.loadInternals(File,ExecutionMonitor)
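The save/load pair described above is symmetric: whatever saveInternals writes into nodeInternDir, loadInternals has to be able to read back. The following is a minimal sketch of that round trip using only the Java standard library. It is not the node's actual implementation: the class name InternalsRoundTripSketch, the file name hilite_mapping.properties, and the use of java.util.Properties are all assumptions made for illustration, and the ExecutionMonitor argument is left out.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.Properties;

/** Hypothetical stand-in, not part of the KNIME API. */
final class InternalsRoundTripSketch {
    // Hypothetical file name; the real node may store its internals differently.
    static final String FILE_NAME = "hilite_mapping.properties";

    /** Writes the mapping into the given directory (compare saveInternals). */
    static void save(File nodeInternDir, Properties mapping) throws IOException {
        File target = new File(nodeInternDir, FILE_NAME);
        try (OutputStream out = Files.newOutputStream(target.toPath())) {
            mapping.store(out, "hypothetical hilite mapping");
        }
    }

    /** Reads the mapping back from the same directory (compare loadInternals). */
    static Properties load(File nodeInternDir) throws IOException {
        File source = new File(nodeInternDir, FILE_NAME);
        Properties mapping = new Properties();
        try (InputStream in = Files.newInputStream(source.toPath())) {
            mapping.load(in);
        }
        return mapping;
    }

    /** Saves one entry into a fresh temporary directory and loads it again. */
    static String roundTrip(String key, String value) {
        try {
            File dir = Files.createTempDirectory("internals").toFile();
            Properties mapping = new Properties();
            mapping.setProperty(key, value);
            save(dir, mapping);
            return load(dir).getProperty(key);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("group_1", "Row0,Row3"));
    }
}
```

Both methods declare IOException, mirroring the signatures above, so an unreadable or unwritable directory surfaces as a load or save failure rather than being silently ignored.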
protected void saveSettingsTo(NodeSettingsWO settings)
Adds to the given NodeSettings the model specific settings. The settings don't need to be complete or consistent. If, right after startup, no valid settings are available this method can write either nothing or invalid settings.
Method is called by the Node if the current settings need to be saved or transferred to the node's dialog.
saveSettingsTo in class NodeModel
Parameters:
settings - The object to write settings into.
See Also:
NodeModel.loadValidatedSettingsFrom(NodeSettingsRO), NodeModel.validateSettings(NodeSettingsRO)
protected void validateSettings(NodeSettingsRO settings) throws InvalidSettingsException
Validates the settings in the passed NodeSettings object. The specified settings should be checked for completeness and consistency. It must be possible to load a settings object validated here without any exception in the loadValidatedSettingsFrom(NodeSettingsRO) method. The method must not change the current settings in the model; it is supposed to just check them. If some settings are missing, invalid, inconsistent, or just not right, throw an exception with a message useful to the user.
validateSettings in class NodeModel
Parameters:
settings - The settings to validate.
Throws:
InvalidSettingsException - If the validation of the settings failed.
See Also:
NodeModel.saveSettingsTo(NodeSettingsWO), NodeModel.loadValidatedSettingsFrom(NodeSettingsRO)
public static void checkDuplicateAggregators(ColumnNamePolicy namePolicy, List<ColumnAggregator> aggregators) throws IllegalArgumentException
Parameters:
namePolicy - ColumnNamePolicy to use
aggregators - List of ColumnAggregator to check
Throws:
IllegalArgumentException - if the aggregators contain a duplicate
protected void loadValidatedSettingsFrom(NodeSettingsRO settings) throws InvalidSettingsException
Sets new settings from the passed object in the model. The passed object has been successfully validated by the validateSettings(NodeSettingsRO) method. The model must set its internal configuration according to the settings object passed.
loadValidatedSettingsFrom in class NodeModel
Parameters:
settings - The settings to read.
Throws:
InvalidSettingsException - If a property is not available.
See Also:
NodeModel.saveSettingsTo(NodeSettingsWO), NodeModel.validateSettings(NodeSettingsRO)
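The contract between the three settings methods (validate without changing the model, load only what was validated, save the current state) can be sketched in plain Java. This is an illustration under stated assumptions, not KNIME code: a Map stands in for NodeSettingsRO/NodeSettingsWO, IllegalArgumentException stands in for InvalidSettingsException, and the class name, the key "maxUniqueValues", and the default of 10000 are hypothetical.

```java
import java.util.Map;

/** Hypothetical stand-in for a node model; a plain Map replaces the KNIME settings objects. */
final class SettingsContractSketch {
    // Hypothetical key; the actual value of CFG_MAX_UNIQUE_VALUES is not shown in this document.
    static final String KEY_MAX_UNIQUE_VALUES = "maxUniqueValues";

    private int maxUniqueValues = 10000; // hypothetical default

    /** Checks the settings without touching the model state (compare validateSettings). */
    void validate(Map<String, String> settings) {
        String raw = settings.get(KEY_MAX_UNIQUE_VALUES);
        if (raw == null) {
            throw new IllegalArgumentException("Missing setting: " + KEY_MAX_UNIQUE_VALUES);
        }
        int value;
        try {
            value = Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Not a number: " + raw, e);
        }
        if (value < 0) {
            throw new IllegalArgumentException("Maximum unique values must not be negative");
        }
    }

    /** Applies settings that validate(...) has accepted (compare loadValidatedSettingsFrom). */
    void loadValidated(Map<String, String> settings) {
        maxUniqueValues = Integer.parseInt(settings.get(KEY_MAX_UNIQUE_VALUES));
    }

    /** Writes the current state into the given settings (compare saveSettingsTo). */
    void saveTo(Map<String, String> settings) {
        settings.put(KEY_MAX_UNIQUE_VALUES, Integer.toString(maxUniqueValues));
    }

    int getMaxUniqueValues() {
        return maxUniqueValues;
    }
}
```

Because validate only reads from the map, a rejected settings object leaves the model exactly as it was, which is the property the description of validateSettings asks for.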
protected final void setHiliteMapping(DefaultHiLiteMapper mapper)
Parameters:
mapper - new hilite mapping, or null
protected void reset()
Override this function in the derived model and reset your NodeModel. All components should unregister themselves from any observables (at least from the hilite handler right now). All internally stored data structures should be released. User settings should not be deleted/reset though.
protected void setInHiLiteHandler(int inIndex, HiLiteHandler hiLiteHdl)
setInHiLiteHandler in class NodeModel
Parameters:
inIndex - The index of the input.
hiLiteHdl - The HiLiteHandler at input index. May be null when not available, i.e. not properly connected.
protected HiLiteHandler getOutHiLiteHandler(int outIndex)
Returns the HiLiteHandler for the given output index. This default implementation simply passes on the handler of input port 0 or generates a new one if this node has no inputs.
getOutHiLiteHandler in class NodeModel
Parameters:
outIndex - The output index.
Returns:
HiLiteHandler for the given output port.
protected DataTableSpec[] configure(PortObjectSpec[] inSpecs) throws InvalidSettingsException
Configure method for general port types. The input specs are guaranteed to be subclasses of the PortObjectSpecs that are defined through the PortTypes given in the constructor unless this model is an InactiveBranchConsumer (most nodes are not). Similarly, the returned output specs need to comply with their port types spec class (otherwise an error is reported by the framework). They may also be null (out spec not known at time of configuration) or inactive (output and downstream nodes are inactive).
For a general description of the configure method refer to the description of the specialized NodeModel.configure(DataTableSpec[]) methods as it addresses more use cases.
configure in class NodeModel
Parameters:
inSpecs - The input data table specs. Items of the array could be null if no spec is available from the corresponding input port (i.e. not connected or upstream node does not produce an output spec). If a port is of type BufferedDataTable.TYPE and no spec is available the framework will replace null by an empty DataTableSpec (no columns) unless the port is marked as optional as per constructor.
Throws:
InvalidSettingsException - If this node can't be configured.
protected final DataTableSpec createGroupBySpec(DataTableSpec origSpec, List<String> groupByCols) throws InvalidSettingsException
Parameters:
origSpec - original input spec
groupByCols - group-by columns
Throws:
InvalidSettingsException - if the group-by can't be generated due to invalid settings
protected static Collection<String> getExcludeList(DataTableSpec origSpec, List<String> columns)
Parameters:
origSpec - original spec given at the in-port of this node
columns - to check against input spec
protected final List<String> getGroupByColumns()
protected PortObject[] execute(PortObject[] inData, ExecutionContext exec) throws Exception
Execute method for general port types. The argument inObjects represent the input objects and the returned array represents the output objects. The elements in the argument array are generally guaranteed to be not null and subclasses of the PortObject classes that are defined through the PortTypes given in the constructor. Similarly, the returned output objects need to comply with their port types object class (otherwise an error is reported by the framework) and must not be null. There are few exceptions to these rules:
An InactiveBranchConsumer may find instances of InactiveBranchPortObject in case the corresponding input is inactive.
InactiveBranchPortObject elements may be returned in case the output should be inactivated.
... corresponding flags.
For a general description of the execute method refer to the description of the specialized NodeModel.execute(BufferedDataTable[], ExecutionContext) methods as it addresses more use cases.
execute in class NodeModel
Parameters:
inData - The input objects.
exec - For BufferedDataTable creation and progress.
Throws:
Exception - If the node execution fails for any reason.
protected final GroupByTable createGroupByTable(ExecutionContext exec, BufferedDataTable table, List<String> groupByCols) throws CanceledExecutionException
Parameters:
exec - execution context
table - input table to group
groupByCols - columns selected for group-by operation
Throws:
CanceledExecutionException - if the group-by table generation was canceled externally
protected final GroupByTable createGroupByTable(ExecutionContext exec, BufferedDataTable table, List<String> groupByCols, boolean inMemory, boolean retainOrder, List<ColumnAggregator> aggregators) throws CanceledExecutionException
Parameters:
exec - execution context
table - input table to group
groupByCols - columns selected for group-by operation
inMemory - keep data in memory
retainOrder - reconstructs original data order
aggregators - column aggregation to use
Throws:
CanceledExecutionException - if the group-by table generation was canceled externally
@Deprecated protected final GroupByTable createGroupByTable(ExecutionContext exec, BufferedDataTable table, List<String> groupByCols, boolean inMemory, boolean sortInMemory, boolean retainOrder, List<ColumnAggregator> aggregators) throws CanceledExecutionException
Parameters:
exec - execution context
table - input table to group
groupByCols - columns selected for group-by operation
inMemory - keep data in memory
sortInMemory - does sorting in memory
retainOrder - reconstructs original data order
aggregators - column aggregation to use
Throws:
CanceledExecutionException - if the group-by table generation was canceled externally
See Also:
createGroupByTable(ExecutionContext, BufferedDataTable, List, boolean, boolean, List)
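The retainOrder flag is only described briefly above. One common way to keep groups in the order of the original data is to collect them in an insertion-ordered map. The sketch below shows that technique in plain Java with a simple count per group. It is an assumption about what "reconstructs original data order" could look like, not the behaviour of GroupByTable: the class RetainOrderSketch, its method, and the sorted fallback used when the flag is false are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in; strings replace table rows and group keys. */
final class RetainOrderSketch {

    /**
     * Counts rows per group key.
     *
     * @param groupKeys the group key of each input row, in input order
     * @param retainOrder if true, groups appear in order of first occurrence;
     *        if false, this sketch sorts them by key (an assumption)
     */
    static Map<String, Integer> countPerGroup(List<String> groupKeys, boolean retainOrder) {
        Map<String, Integer> counts;
        if (retainOrder) {
            counts = new LinkedHashMap<>(); // remembers insertion order
        } else {
            counts = new TreeMap<>(); // orders groups by key
        }
        for (String key : groupKeys) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }
}
```

For the keys b, a, b, c, a the sketch yields the groups b, a, c when retainOrder is true and a, b, c otherwise; the counts per group are the same in both cases.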
protected GlobalSettings createGlobalSettings(ExecutionContext exec, BufferedDataTable table, List<String> groupByCols, int maxUniqueVals)
Creates the GlobalSettings object that is passed to all AggregationMethods.
Parameters:
exec - the ExecutionContext
table - the BufferedDataTable
groupByCols - the names of the columns to group by
maxUniqueVals - the maximum number of unique values per group
Returns:
the GlobalSettings object to use
protected String getDefaultValueDelimiter()
protected boolean isRetainOrder()
Returns:
true if the row order should be retained
protected boolean isProcessInMemory()
Returns:
true if all operations should be processed in memory
@Deprecated protected boolean isSortInMemory()
Returns:
true if any sorting should be performed in memory
protected List<ColumnAggregator> getColumnAggregators()
protected ColumnNamePolicy getColumnNamePolicy()
protected static ColumnNamePolicy compGetColumnNamePolicy(NodeSettingsRO settings)
Compatibility method for versions prior to KNIME 2.0 that provides the ColumnNamePolicy for the old node settings.
Parameters:
settings - the settings to read the old column name policy from
Returns:
the ColumnNamePolicy equivalent to the old setting
protected static List<ColumnAggregator> compGetColumnMethods(DataTableSpec spec, List<String> excludeCols, ConfigRO config)
Parameters:
spec - the input DataTableSpec
excludeCols - the columns that should be excluded from the aggregation columns
config - the config object to read from
Returns:
the ColumnAggregators
public static List<ColumnAggregator> getAggregators(DataTableSpec inputSpec, Collection<String> groupColumns, List<ColumnAggregator> columnAggregators, Collection<PatternAggregator> patternAggregators, Collection<DataTypeAggregator> dataTypeAggregators, List<ColumnAggregator> invalidColAggrs)
Creates a List with all ColumnAggregators to use based on the given input settings. Columns are only added once for the different aggregator types, in the order they are added to the function: all columns that are handled by one of the given ColumnAggregators are ignored by the pattern and data type based aggregators, and all columns that are handled by one of the pattern based aggregators are ignored by the data type based aggregators.
Parameters:
inputSpec - the DataTableSpec of the input table
groupColumns - the columns to group by
columnAggregators - the manually added ColumnAggregators
patternAggregators - the PatternAggregators
dataTypeAggregators - the DataTypeAggregators
invalidColAggrs - empty List that is filled with the invalid column aggregators; can be null
Returns:
the ColumnAggregators to use based on the given aggregator
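The precedence rule described for getAggregators (manually added column aggregators first, then pattern based, then data type based, with each column claimed by only one aggregator type) can be sketched in plain Java. This is an illustration, not the KNIME implementation: strings stand in for the spec and aggregator classes, and the class name AggregatorPrecedenceSketch, the method resolve, the "column:method" result format, the exact type-name match, and the choice to skip group columns in every stage are assumptions.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical stand-in; plain strings replace the KNIME spec and aggregator classes. */
final class AggregatorPrecedenceSketch {

    /**
     * @param columnTypes column name to type name, in input order (stands in for the DataTableSpec)
     * @param groupColumns columns to group by; assumed here to be skipped by pattern and type rules
     * @param byColumn column name to method (manually added aggregators)
     * @param byPattern regular expression to method (pattern based aggregators)
     * @param byType type name to method (data type based aggregators)
     * @param invalid receives manually added columns missing from the input; may be null
     * @return "column:method" entries in the order they were resolved
     */
    static List<String> resolve(Map<String, String> columnTypes, Collection<String> groupColumns,
            Map<String, String> byColumn, Map<String, String> byPattern,
            Map<String, String> byType, List<String> invalid) {
        List<String> result = new ArrayList<>();
        Set<String> handled = new HashSet<>(groupColumns);

        // Stage 1: manually added column aggregators take precedence.
        for (Map.Entry<String, String> rule : byColumn.entrySet()) {
            if (columnTypes.containsKey(rule.getKey())) {
                result.add(rule.getKey() + ":" + rule.getValue());
                handled.add(rule.getKey());
            } else if (invalid != null) {
                invalid.add(rule.getKey());
            }
        }
        // Stage 2: pattern based aggregators, only for columns not handled so far.
        // Several patterns may claim the same column in this sketch (an assumption).
        for (String column : columnTypes.keySet()) {
            if (handled.contains(column)) {
                continue;
            }
            for (Map.Entry<String, String> rule : byPattern.entrySet()) {
                if (Pattern.matches(rule.getKey(), column)) {
                    result.add(column + ":" + rule.getValue());
                    handled.add(column);
                }
            }
        }
        // Stage 3: data type based aggregators for whatever is still unhandled.
        for (Map.Entry<String, String> column : columnTypes.entrySet()) {
            if (handled.contains(column.getKey())) {
                continue;
            }
            String method = byType.get(column.getValue());
            if (method != null) {
                result.add(column.getKey() + ":" + method);
            }
        }
        return result;
    }
}
```

As a usage example, take the input columns region (String), sales (Double), sales_tax (Double) and units (Integer), grouped by region, with a manual rule sales to Sum, a manual rule for a non-existent column named missing, a pattern rule sales.* to Mean, and type rules Double to Max and Integer to Count. The sketch resolves these to sales:Sum, sales_tax:Mean and units:Count, and reports missing as invalid. The Double rule never fires because both Double columns were already claimed by an earlier stage.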
KNIME GmbH, Konstanz, Germany
You may not modify, publish, transmit, transfer or sell, reproduce, create derivative works from, distribute, perform, display, or in any way exploit any of the content, in whole or in part, except as otherwise expressly permitted in writing by the copyright owner or as specified in the license file distributed with this product.