public class InterquartileRange extends SimpleBatchFilter
-D Turns on output of debugging information.
-R <col1,col2-col4,...> Specifies the list of columns to base outlier/extreme value detection on. If an instance is considered an outlier/extreme value in at least one of those attributes, it is tagged accordingly. 'first' and 'last' are valid indexes. (default: none)
-O <num> The factor for outlier detection. (default: 3)
-E <num> The factor for extreme values detection. (default: 2*Outlier Factor)
-E-as-O Tags extreme values also as outliers. (default: off)
-P Generates Outlier/ExtremeValue pair for each numeric attribute in the range, not just a single indicator pair for all the attributes. (default: off)
-M Generates an additional attribute 'Offset' per Outlier/ExtremeValue pair that contains the multiplier that the value is off the median: value = median + 'multiplier' * IQR. Note: implicitly sets '-P'. (default: off)
Thanks to Dale for a few brainstorming sessions.
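The interplay of the -O, -E and -E-as-O options above can be illustrated with a small standalone sketch. It is not the Weka implementation: it assumes the conventional IQR fence rule (a value is flagged once it lies more than factor * IQR outside the quartiles) and a median-of-halves quartile convention, neither of which this page spells out. The class name `IqrFenceSketch` and all of its methods are hypothetical.

```java
import java.util.Arrays;

// Standalone illustration only; names and the quartile convention are assumptions.
class IqrFenceSketch {

    // Median of the sorted slice [from, to); requires to > from.
    static double median(double[] sorted, int from, int to) {
        int n = to - from;
        int mid = from + n / 2;
        return (n % 2 == 1) ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // Returns {Q1, median, Q3}; requires at least two values.
    static double[] quartiles(double[] values) {
        double[] s = values.clone();
        Arrays.sort(s);
        int n = s.length;
        double q1 = median(s, 0, n / 2);        // lower half
        double q3 = median(s, (n + 1) / 2, n);  // upper half
        return new double[] {q1, median(s, 0, n), q3};
    }

    // Beyond the extreme-value fence (factor evf, the -E option).
    static boolean isExtreme(double x, double q1, double q3, double evf) {
        double iqr = q3 - q1;
        return x > q3 + evf * iqr || x < q1 - evf * iqr;
    }

    // Beyond the outlier fence (factor of, the -O option). Extreme values
    // count as outliers only when extremeAsOutlier is set (the -E-as-O option).
    static boolean isOutlier(double x, double q1, double q3,
                             double of, double evf, boolean extremeAsOutlier) {
        double iqr = q3 - q1;
        boolean beyondOutlierFence = x > q3 + of * iqr || x < q1 - of * iqr;
        return beyondOutlierFence && (extremeAsOutlier || !isExtreme(x, q1, q3, evf));
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5, 6, 7, 8, 30, 100};
        double[] q = quartiles(data);
        double of = 3.0;       // documented default for -O
        double evf = 2 * of;   // documented default for -E
        for (double x : data) {
            System.out.println(x + " outlier=" + isOutlier(x, q[0], q[2], of, evf, false)
                    + " extreme=" + isExtreme(x, q[0], q[2], evf));
        }
    }
}
```

With these sample values the quartiles are 3 and 8, so the IQR is 5; the value 30 falls between the two upper fences (23 and 38) and is flagged as an outlier, while 100 lies beyond the extreme-value fence.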
Modifier and Type | Class and Description |
---|---|
static class | InterquartileRange.ValueType: enum for obtaining the various determined IQR values. |
Modifier and Type | Field and Description |
---|---|
static int | NON_NUMERIC: indicator for non-numeric attributes. |
Constructor and Description |
---|
InterquartileRange() |
Modifier and Type | Method and Description |
---|---|
java.lang.String | attributeIndicesTipText(): Returns the tip text for this property. |
java.lang.String | detectionPerAttributeTipText(): Returns the tip text for this property. |
java.lang.String | extremeValuesAsOutliersTipText(): Returns the tip text for this property. |
java.lang.String | extremeValuesFactorTipText(): Returns the tip text for this property. |
java.lang.String | getAttributeIndices(): Gets the current range selection. |
Capabilities | getCapabilities(): Returns the Capabilities of this filter. |
boolean | getDetectionPerAttribute(): Gets whether an Outlier/ExtremeValue attribute pair is generated for each numeric attribute ("true") or just one pair for all numeric attributes together ("false"). |
boolean | getExtremeValuesAsOutliers(): Gets whether extreme values are also tagged as outliers. |
double | getExtremeValuesFactor(): Gets the factor for determining the thresholds for extreme values. |
java.lang.String[] | getOptions(): Gets the current settings of the filter. |
double | getOutlierFactor(): Gets the factor for determining the thresholds for outliers. |
boolean | getOutputOffsetMultiplier(): Gets whether an additional attribute "Offset" is generated per Outlier/ExtremeValue attribute pair that lists the multiplier the value is off the median: value = median + 'multiplier' * IQR. |
java.lang.String | getRevision(): Returns the revision string. |
double[] | getValues(InterquartileRange.ValueType type): Returns the values for the specified type. |
java.lang.String | globalInfo(): Returns a string describing this filter. |
java.util.Enumeration<Option> | listOptions(): Returns an enumeration describing the available options. |
static void | main(java.lang.String[] args): Main method for testing this class. |
java.lang.String | outlierFactorTipText(): Returns the tip text for this property. |
java.lang.String | outputOffsetMultiplierTipText(): Returns the tip text for this property. |
void | setAttributeIndices(java.lang.String value): Sets which attributes are to be used for interquartile calculations and outlier/extreme value detection (only numeric attributes among the selection will be used). |
void | setAttributeIndicesArray(int[] value): Sets which attributes are to be used for interquartile calculations and outlier/extreme value detection (only numeric attributes among the selection will be used). |
void | setDetectionPerAttribute(boolean value): Sets whether an Outlier/ExtremeValue attribute pair is generated for each numeric attribute ("true") or just one pair for all numeric attributes together ("false"). |
void | setExtremeValuesAsOutliers(boolean value): Sets whether extreme values are also tagged as outliers. |
void | setExtremeValuesFactor(double value): Sets the factor for determining the thresholds for extreme values. |
void | setOptions(java.lang.String[] options): Parses a list of options for this object. |
void | setOutlierFactor(double value): Sets the factor for determining the thresholds for outliers. |
void | setOutputOffsetMultiplier(boolean value): Sets whether an additional attribute "Offset" is generated per Outlier/ExtremeValue attribute pair that lists the multiplier the value is off the median: value = median + 'multiplier' * IQR. |
Methods inherited from class SimpleBatchFilter: allowAccessToFullInputFormat, batchFinished, input
Methods inherited from class SimpleFilter: setInputFormat
Methods inherited from class Filter: batchFilterFile, debugTipText, doNotCheckCapabilitiesTipText, filterFile, getCapabilities, getDebug, getDoNotCheckCapabilities, getOutputFormat, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, mayRemoveInstanceAfterFirstBatchDone, numPendingOutput, output, outputPeek, runFilter, setDebug, setDoNotCheckCapabilities, toString, useFilter, wekaStaticWrapper
public static final int NON_NUMERIC
public java.lang.String globalInfo()
globalInfo in class SimpleFilter
public java.util.Enumeration<Option> listOptions()
listOptions in interface OptionHandler
listOptions in class Filter
public void setOptions(java.lang.String[] options) throws java.lang.Exception
-D Turns on output of debugging information.
-R <col1,col2-col4,...> Specifies the list of columns to base outlier/extreme value detection on. If an instance is considered an outlier/extreme value in at least one of those attributes, it is tagged accordingly. 'first' and 'last' are valid indexes. (default: none)
-O <num> The factor for outlier detection. (default: 3)
-E <num> The factor for extreme values detection. (default: 2*Outlier Factor)
-E-as-O Tags extreme values also as outliers. (default: off)
-P Generates Outlier/ExtremeValue pair for each numeric attribute in the range, not just a single indicator pair for all the attributes. (default: off)
-M Generates an additional attribute 'Offset' per Outlier/ExtremeValue pair that contains the multiplier that the value is off the median: value = median + 'multiplier' * IQR. Note: implicitly sets '-P'. (default: off)
setOptions in interface OptionHandler
setOptions in class Filter
Parameters: options - the list of options as an array of strings
Throws: java.lang.Exception - if an option is not supported
public java.lang.String[] getOptions()
getOptions in interface OptionHandler
getOptions in class Filter
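The defaults documented for setOptions interact in two ways: the extreme-value factor falls back to twice the outlier factor when -E is omitted, and -M implicitly turns on -P. A minimal standalone sketch of that behaviour follows. It does not use or reproduce Weka's own option parsing; the class `IqrOptionsSketch`, its fields, and the choice of `IllegalArgumentException` for unsupported options are assumptions made for illustration.

```java
// Standalone illustration of the documented flags and defaults; not Weka code.
class IqrOptionsSketch {
    String attributeIndices = "";       // -R, empty string stands for "none" here
    double outlierFactor = 3.0;         // -O, documented default 3
    double extremeValuesFactor = 6.0;   // -E, documented default 2 * outlier factor
    boolean extremeValuesAsOutliers;    // -E-as-O
    boolean detectionPerAttribute;      // -P
    boolean outputOffsetMultiplier;     // -M
    boolean debug;                      // -D

    // Returns the value following a flag, or fails if it is missing.
    private static String next(String[] opts, int i) {
        if (i >= opts.length) {
            throw new IllegalArgumentException("Missing value for " + opts[i - 1]);
        }
        return opts[i];
    }

    static IqrOptionsSketch parse(String[] options) {
        IqrOptionsSketch o = new IqrOptionsSketch();
        boolean extremeFactorGiven = false;
        for (int i = 0; i < options.length; i++) {
            switch (options[i]) {
                case "-D": o.debug = true; break;
                case "-R": o.attributeIndices = next(options, ++i); break;
                case "-O": o.outlierFactor = Double.parseDouble(next(options, ++i)); break;
                case "-E":
                    o.extremeValuesFactor = Double.parseDouble(next(options, ++i));
                    extremeFactorGiven = true;
                    break;
                case "-E-as-O": o.extremeValuesAsOutliers = true; break;
                case "-P": o.detectionPerAttribute = true; break;
                case "-M":
                    o.outputOffsetMultiplier = true;
                    o.detectionPerAttribute = true;  // -M implicitly sets -P
                    break;
                default:
                    throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
        if (!extremeFactorGiven) {
            o.extremeValuesFactor = 2 * o.outlierFactor;
        }
        return o;
    }

    public static void main(String[] args) {
        IqrOptionsSketch o = parse(new String[] {"-R", "first-last", "-O", "2.5", "-M"});
        System.out.println("outlier factor: " + o.outlierFactor);
        System.out.println("extreme values factor: " + o.extremeValuesFactor);
        System.out.println("per-attribute detection: " + o.detectionPerAttribute);
    }
}
```

When working with the real filter, the same flags would be passed as a string array to setOptions, and getOptions would return the current settings in the same form.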
public java.lang.String attributeIndicesTipText()
public java.lang.String getAttributeIndices()
public void setAttributeIndices(java.lang.String value)
Parameters: value - a string representing the list of attributes. Since the string will typically come from a user, attributes are indexed from 1.
Throws: java.lang.IllegalArgumentException - if an invalid range list is supplied
public void setAttributeIndicesArray(int[] value)
Parameters: value - an array containing indexes of attributes to work on. Since the array will typically come from a program, attributes are indexed from 0.
Throws: java.lang.IllegalArgumentException - if an invalid set of ranges is supplied
public java.lang.String outlierFactorTipText()
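Because setAttributeIndices takes a 1-based range string while setAttributeIndicesArray takes 0-based indexes, it can help to see the two forms side by side. The following is a rough standalone sketch of converting a range list such as "first,3-5,last" into 0-based indexes. It is not Weka's range handling; the class `AttributeRangeSketch`, its methods, and the exact validation rules are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Standalone illustration only; not the parsing code used by the filter.
class AttributeRangeSketch {

    // Converts one 1-based token ("first", "last", or a number) to a 0-based index.
    static int parseIndex(String token, int numAttributes) {
        String t = token.trim();
        int oneBased;
        if (t.equals("first")) {
            oneBased = 1;
        } else if (t.equals("last")) {
            oneBased = numAttributes;
        } else {
            // NumberFormatException is a subclass of IllegalArgumentException.
            oneBased = Integer.parseInt(t);
        }
        if (oneBased < 1 || oneBased > numAttributes) {
            throw new IllegalArgumentException("Index out of range: " + token);
        }
        return oneBased - 1;
    }

    // Expands a comma-separated list of indexes and ranges into 0-based indexes.
    // Duplicates are kept as given; this sketch does not merge overlapping ranges.
    static int[] toZeroBased(String rangeList, int numAttributes) {
        List<Integer> out = new ArrayList<>();
        for (String part : rangeList.split(",")) {
            String[] ends = part.split("-");
            if (ends.length < 1 || ends.length > 2) {
                throw new IllegalArgumentException("Invalid range: " + part);
            }
            int lo = parseIndex(ends[0], numAttributes);
            int hi = (ends.length == 2) ? parseIndex(ends[1], numAttributes) : lo;
            if (lo > hi) {
                throw new IllegalArgumentException("Invalid range: " + part);
            }
            for (int i = lo; i <= hi; i++) {
                out.add(i);
            }
        }
        int[] result = new int[out.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = out.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        // For a dataset with 7 attributes.
        System.out.println(Arrays.toString(toZeroBased("first,3-5,last", 7)));
    }
}
```

For a dataset with 7 attributes, the string "first,3-5,last" corresponds to the 0-based array {0, 2, 3, 4, 6}.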
public void setOutlierFactor(double value)
Parameters: value - the factor.
public double getOutlierFactor()
public java.lang.String extremeValuesFactorTipText()
public void setExtremeValuesFactor(double value)
Parameters: value - the factor.
public double getExtremeValuesFactor()
public java.lang.String extremeValuesAsOutliersTipText()
public void setExtremeValuesAsOutliers(boolean value)
Parameters: value - whether or not to tag extreme values also as outliers.
public boolean getExtremeValuesAsOutliers()
public java.lang.String detectionPerAttributeTipText()
public void setDetectionPerAttribute(boolean value)
Parameters: value - whether or not to generate indicator attribute pairs for each numeric attribute.
public boolean getDetectionPerAttribute()
public java.lang.String outputOffsetMultiplierTipText()
public void setOutputOffsetMultiplier(boolean value)
Parameters: value - whether or not to generate the additional attribute.
public boolean getOutputOffsetMultiplier()
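The 'Offset' attribute is defined by the relation value = median + 'multiplier' * IQR, so the multiplier can be recovered by rearranging it to multiplier = (value - median) / IQR. A small standalone sketch of that arithmetic follows. The class `OffsetMultiplierSketch` is hypothetical, and rejecting a zero IQR with an exception is a choice made for this sketch, not documented filter behaviour.

```java
// Standalone illustration of the Offset relation; not Weka code.
class OffsetMultiplierSketch {

    // Solves value = median + multiplier * iqr for multiplier.
    static double offsetMultiplier(double value, double median, double iqr) {
        if (iqr == 0.0) {
            // Assumption of this sketch: a zero IQR has no meaningful multiplier.
            throw new IllegalArgumentException("IQR must be non-zero");
        }
        return (value - median) / iqr;
    }

    public static void main(String[] args) {
        double median = 5.5;
        double iqr = 5.0;
        double value = 30.0;
        double m = offsetMultiplier(value, median, iqr);
        System.out.println("multiplier: " + m);
        // Plugging the multiplier back in reproduces the value (up to rounding).
        System.out.println("reconstructed: " + (median + m * iqr));
    }
}
```

A positive multiplier means the value lies above the median and a negative one means it lies below.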
public Capabilities getCapabilities()
getCapabilities in interface CapabilitiesHandler
getCapabilities in class Filter
Capabilities
public double[] getValues(InterquartileRange.ValueType type)
Parameters: type - the type of values to return
public java.lang.String getRevision()
getRevision in interface RevisionHandler
getRevision in class Filter
public static void main(java.lang.String[] args)
Parameters: args - should contain arguments to the filter: use -h for help