abstract class %DeepSee.PMML.Dataset
extends %RegisteredObject
A Dataset wraps a collection of records that can be analyzed in order to build or
run a model. Implementations abstracting different sources of data can be found in
the %DeepSee.PMML.Dataset package.
method Clear()
as %Status
Clears all temporary structures created by this object.
The dataset should remain usable after calling this method!
abstract method Get1DDistribution(pField As %String, Output pDistribution, ByRef pFilters)
as %Status
Returns an array describing the distribution of values for the categorical field pField.
accepts:
pFilters(n) = $lb(field, operator, key)
returns:
pDistribution("total") = tTotalCount
pDistribution(n) = $lb(value, count)
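The layout above can be pictured with a small Python model. This is only a sketch of the array shape, not the ObjectScript implementation. It assumes that "total" holds the number of records counted, that n starts at 1, and that entries appear in first-seen order; none of these points is stated in the text. The function name distribution_1d is hypothetical, and filters are left out because the supported operators are not listed here.

```python
from collections import Counter


def distribution_1d(values):
    """Model the documented pDistribution layout as a Python dict.

    Assumptions (not confirmed by the class reference): "total" is the
    number of records counted, and n is a 1-based sequence number.
    """
    counts = Counter(values)  # keeps first-seen order of distinct values
    dist = {"total": len(values)}
    for n, (value, count) in enumerate(counts.items(), start=1):
        dist[n] = (value, count)  # stands in for $lb(value, count)
    return dist


example = distribution_1d(["red", "blue", "red", "green", "red"])
```

Under these assumptions, the per-value counts in the numbered entries add up to the "total" entry.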
abstract method GetAggregatesByCategory(pContField As %String, pCatField As %String, Output pAggregates, ByRef pFilters)
as %Status
Returns an array listing aggregate values for a continuous field pContField for
each value of a categorical field pCatField.
accepts:
pFilters(n) = $lb(field, operator, key)
returns:
pAggregates("total") = tTotalCount
pAggregates(n) = $lb(category value, count, average, sum, max, min, countNonNull)
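As an illustration of the seven-element entry above, the following Python sketch computes the same fields from (category, value) pairs. It is a model of the layout only. It assumes that count covers all rows in the category, that countNonNull covers rows with a value, and that average, sum, max and min are taken over the non-null values; the class reference does not spell these points out. The name aggregates_by_category is hypothetical.

```python
def aggregates_by_category(rows):
    """Model the documented pAggregates layout as a Python dict.

    rows is a list of (category, value) pairs; value may be None.
    Assumption: statistics are computed over non-null values only.
    """
    groups = {}
    for category, value in rows:
        groups.setdefault(category, []).append(value)

    agg = {"total": len(rows)}
    for n, (category, values) in enumerate(groups.items(), start=1):
        non_null = [v for v in values if v is not None]
        total = sum(non_null) if non_null else None
        average = total / len(non_null) if non_null else None
        # Same field order as $lb(category value, count, average, sum, max, min, countNonNull)
        agg[n] = (
            category,
            len(values),
            average,
            total,
            max(non_null, default=None),
            min(non_null, default=None),
            len(non_null),
        )
    return agg


rows = [("a", 1.0), ("a", 3.0), ("b", None), ("a", None), ("b", 4.0)]
example = aggregates_by_category(rows)
```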
method GetFieldBySpec(pFieldSpec As %String)
as %DeepSee.PMML.Dataset.Field
abstract method GetRecordIds(Output pIds, ByRef pFilters)
as %Status
returns:
pIds(n) = rowid
abstract method GetValueCount(pField As %String, pIncludeNull As %Boolean = 1, ByRef pFilters, Output pSC As %Status)
as %Integer
Returns the number of distinct values for the categorical field pField.
abstract method GetXDDistribution(pFields As %List, Output pDistribution, ByRef pFilters)
as %Status
accepts:
pFilters(n) = $lb(field, operator, key)
returns:
pDistribution = $lb(dim1Count, dim2Count, ...)
pDistribution("value", dim, i) = value
pDistribution(i, j, ...) = tCount
pDistribution("total", dim, i) = tDimTotal
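The multi-dimensional layout above is easier to follow with a concrete model. The Python sketch below builds a dict with the same four kinds of entries from a list of records, one tuple element per field in pFields. It is an illustration under assumptions, not the actual implementation: indices are assumed to be 1-based and assigned in first-seen order, empty cells are assumed to be absent rather than stored as 0, and the hypothetical "shape" key stands in for the value of the top-level pDistribution node, which a Python dict cannot hold directly.

```python
def distribution_xd(records):
    """Model the documented multi-dimensional pDistribution layout.

    records is a non-empty list of equal-length tuples.
    """
    ndims = len(records[0])

    # Map each distinct value to a 1-based index, per dimension.
    index = [{} for _ in range(ndims)]
    for rec in records:
        for d, value in enumerate(rec):
            index[d].setdefault(value, len(index[d]) + 1)

    # "shape" models pDistribution = $lb(dim1Count, dim2Count, ...)
    dist = {"shape": tuple(len(ix) for ix in index)}
    for d, ix in enumerate(index, start=1):
        for value, i in ix.items():
            dist[("value", d, i)] = value  # pDistribution("value", dim, i)
            dist[("total", d, i)] = 0      # pDistribution("total", dim, i)

    for rec in records:
        cell = tuple(index[d][value] for d, value in enumerate(rec))
        dist[cell] = dist.get(cell, 0) + 1  # pDistribution(i, j, ...)
        for d, i in enumerate(cell, start=1):
            dist[("total", d, i)] += 1
    return dist


records = [("red", "S"), ("red", "M"), ("blue", "S"), ("red", "S")]
example = distribution_xd(records)
```

In this model, example[(1, 1)] is the number of records combining the first value of each field, and example[("total", 1, 1)] is the number of records carrying the first value of the first field regardless of the second.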
method HasField(pFieldName As %String, Output pSC As %String)
as %Boolean