oracle.dmt.odm.settings.algorithm
Class KMeansAlgorithmSettings

java.lang.Object
  |
  +--oracle.dmt.odm.MiningObject
        |
        +--oracle.dmt.odm.settings.algorithm.MiningAlgorithmSettings
              |
              +--oracle.dmt.odm.settings.algorithm.ClusteringAlgorithmSettings
                    |
                    +--oracle.dmt.odm.settings.algorithm.KMeansAlgorithmSettings
All Implemented Interfaces:
java.io.Serializable

public class KMeansAlgorithmSettings
extends ClusteringAlgorithmSettings

An instance of KMeansAlgorithmSettings specifies settings for the K-Means clustering algorithm. It allows a knowledgeable user to fine-tune algorithm parameters. Not all parameters need to be specified; those that are specified are taken into account by the underlying data mining server (DMS). ODM 9.2.0 implements a hierarchical version of the K-Means algorithm. The tree is grown one node at a time. The node with the largest distortion (the sum of the distances to the node's centroid) is split to increase the size of the tree until the desired number of clusters is reached.
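
The split-selection rule described above can be sketched in plain Java. This is an illustration only, not ODM code: the class DistortionSplitSketch and its methods are hypothetical, the points are one-dimensional, and distortion is taken as the sum of absolute distances to the mean, which is an assumption about the distance function in use.

```java
// Hypothetical illustration; none of these names belong to the ODM API.
final class DistortionSplitSketch {

    // Distortion of a cluster of 1-D points: sum of absolute distances
    // from each point to the centroid (here, the arithmetic mean).
    static double distortion(double[] points) {
        if (points.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (double p : points) {
            sum += p;
        }
        double centroid = sum / points.length;
        double total = 0.0;
        for (double p : points) {
            total += Math.abs(p - centroid);
        }
        return total;
    }

    // Index of the leaf with the largest distortion, i.e. the next leaf to split.
    // Returns -1 when there are no leaves.
    static int indexOfNextSplit(double[][] leaves) {
        int best = -1;
        double bestDistortion = -1.0;
        for (int i = 0; i < leaves.length; i++) {
            double d = distortion(leaves[i]);
            if (d > bestDistortion) {
                bestDistortion = d;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] leaves = {
            {1.0, 1.5, 2.0},     // tight cluster
            {10.0, 20.0, 30.0},  // spread-out cluster
            {5.0, 5.0}           // identical points, zero distortion
        };
        System.out.println("Next leaf to split: " + indexOfNextSplit(leaves));
    }
}
```

For the three leaves in main, the distortions are 1.0, 20.0, and 0.0, so the second leaf (index 1) would be the next one to split.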

Since:
9.2.0
See Also:
Serialized Form

Constructor Summary
KMeansAlgorithmSettings(float error, DistanceFunction distanceFunction)
          Creates a KMeansAlgorithmSettings object with the minimum relative change in error between K-Means iterations, below which K-Means is considered to have converged, set to error, and the distance function used to train a K-Means model set to distanceFunction. error is a number between 0 and 1.
KMeansAlgorithmSettings(int iterations, DistanceFunction distanceFunction)
          Creates a KMeansAlgorithmSettings object with the maximum number of K-Means iterations between splits set to iterations and the distance function used to train a K-Means model set to distanceFunction.
KMeansAlgorithmSettings(int iterations, float error, DistanceFunction distanceFunction)
          Creates a KMeansAlgorithmSettings object with the maximum number of K-Means iterations between splits set to iterations, the minimum relative change in error between K-Means iterations set to error, and the distance function used to train a K-Means model set to distanceFunction.
 
Method Summary
Type  Method
 DistanceFunction getDistanceFunction()
          Returns the DistanceFunction specified by a KMeansAlgorithmSettings object to train a K-Means ClusteringModel.
 int getMaxNumberOfIterations()
          Returns the maxNumberOfIterations specified by a KMeansAlgorithmSettings object to train a K-Means ClusteringModel.
 float getMinimumErrorTolerance()
          Returns the minimumErrorTolerance specified by a KMeansAlgorithmSettings object to train a K-Means ClusteringModel.
 ClusteringStoppingCriterion getStopCriterion()
          Returns the clusteringStoppingCriterion specified by a KMeansAlgorithmSettings object to train a K-Means ClusteringModel.
 void setDistanceFunction(DistanceFunction distanceFunction)
          Sets the distance function to be used to train a K-Means ClusteringModel.
 void setMaxNumberOfIterations(int maxIter)
          Sets the maximum number of K-Means iterations between splits while training a K-Means ClusteringModel.
 void setMinErrorTolerance(float minError)
          Sets the minimum relative change in error between K-Means iterations, below which K-Means is considered to have converged.
 void setStopCriterion(ClusteringStoppingCriterion stopCriterion)
          Sets the StopCriterion to be used to train a K-Means ClusteringModel.
 
Methods inherited from class oracle.dmt.odm.settings.algorithm.MiningAlgorithmSettings
getMiningAlgorithm, getMiningAlgorithmName
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

KMeansAlgorithmSettings

public KMeansAlgorithmSettings(int iterations,
                               DistanceFunction distanceFunction)
                        throws InvalidArgumentException
Creates a KMeansAlgorithmSettings object with the maximum number of K-Means iterations between splits set to iterations and the distance function used to train a K-Means model set to distanceFunction. Training stops once the number of iterations over the data in the buffer exceeds iterations. iterations is a number between 1 and 100.
Parameters:
iterations - Maximum number of K-Means iterations between splits. Recommended value: 7
Higher values take longer, but can produce higher quality models.
distanceFunction - Distance function
Throws:
InvalidArgumentException - is thrown
- when iterations > 100 or iterations < 1
- when distanceFunction is null

KMeansAlgorithmSettings

public KMeansAlgorithmSettings(float error,
                               DistanceFunction distanceFunction)
                        throws InvalidArgumentException
Creates a KMeansAlgorithmSettings object with the minimum relative change in error between K-Means iterations, below which K-Means is considered to have converged, set to error, and the distance function used to train a K-Means model set to distanceFunction. error is a number between 0 and 1. Training stops once the change in error between two consecutive iterations is less than error.
Parameters:
error - Minimum error tolerance. Recommended value: 0.05
Setting it closer to .005 builds the model more slowly; setting it closer to .01 builds the model faster, but perhaps with less accuracy.
distanceFunction - Distance function
Throws:
InvalidArgumentException - is thrown
- when error > 1 or error < 0
- when distanceFunction is null

KMeansAlgorithmSettings

public KMeansAlgorithmSettings(int iterations,
                               float error,
                               DistanceFunction distanceFunction)
                        throws InvalidArgumentException
Creates a KMeansAlgorithmSettings object with the maximum number of K-Means iterations between splits set to iterations, the minimum relative change in error between K-Means iterations set to error, and the distance function used to train a K-Means model set to distanceFunction. iterations is a number between 1 and 100. error is a number between 0 and 1. Training stops when either the change in error between two consecutive iterations is less than error or the number of iterations over the data in the buffer exceeds iterations.
Parameters:
error - Minimum error tolerance. Recommended value: 0.05.
iterations - Maximum number of K-Means iterations between splits. Recommended value: 7.
distanceFunction - Distance function.
Throws:
InvalidArgumentException - is thrown
- when error > 1 or error < 0
- when iterations > 100 or iterations < 1
- when distanceFunction is null
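
The two stopping conditions described for this constructor can be read as the loop below. This is a minimal sketch under stated assumptions, not the ODM implementation: StoppingRuleSketch is a hypothetical name, the per-iteration error values are supplied by the caller, the change in error is measured relative to the previous iteration, and the loop never runs more than maxIterations passes.

```java
// Hypothetical illustration; none of these names belong to the ODM API.
final class StoppingRuleSketch {

    // Returns how many iterations run before either stopping condition fires.
    // errorPerIteration[i] is the clustering error observed after iteration i + 1.
    static int iterationsRun(double[] errorPerIteration,
                             int maxIterations,
                             float minErrorTolerance) {
        int run = 0;
        double previous = Double.NaN;
        for (double current : errorPerIteration) {
            if (run >= maxIterations) {
                break; // iteration cap reached
            }
            run++;
            if (run > 1 && previous > 0.0) {
                double relativeChange = Math.abs(previous - current) / previous;
                if (relativeChange < minErrorTolerance) {
                    break; // error changed too little: treat as converged
                }
            }
            previous = current;
        }
        return run;
    }

    public static void main(String[] args) {
        double[] errors = {100.0, 60.0, 50.0, 49.0, 48.9};
        // With the recommended values (7 iterations, tolerance 0.05),
        // the fourth iteration changes the error by only 2 percent.
        System.out.println(iterationsRun(errors, 7, 0.05f));
        // With a cap of 2 iterations, the cap fires first.
        System.out.println(iterationsRun(errors, 2, 0.05f));
    }
}
```

With the sample errors, the first call stops after 4 iterations because the error moved from 50.0 to 49.0, a relative change of 0.02. The second call stops after 2 iterations because of the cap.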
Method Detail

getMinimumErrorTolerance

public float getMinimumErrorTolerance()
Returns the minimumErrorTolerance specified by a KMeansAlgorithmSettings object to train a K-Means ClusteringModel. The minimumErrorTolerance setting controls the minimum relative change in error between K-Means iterations; when the change falls below it, K-Means is considered to have converged. minimumErrorTolerance is a number between 0 and 1.
Returns:
float - Minimum error tolerance

getDistanceFunction

public DistanceFunction getDistanceFunction()
Returns the DistanceFunction specified by a KMeansAlgorithmSettings object to train a K-Means ClusteringModel.
Returns:
DistanceFunction

getStopCriterion

public ClusteringStoppingCriterion getStopCriterion()
Returns the clusteringStoppingCriterion specified by a KMeansAlgorithmSettings object to train a K-Means ClusteringModel.
Returns:
ClusteringStoppingCriterion

getMaxNumberOfIterations

public int getMaxNumberOfIterations()
Returns the maxNumberOfIterations specified by a KMeansAlgorithmSettings object to train a K-Means ClusteringModel. The maxNumberOfIterations setting controls the maximum number of K-Means iterations between splits while training a K-Means ClusteringModel. maxNumberOfIterations is a number between 1 and 100.
Returns:
int - Maximum number of iterations between splits

setMaxNumberOfIterations

public void setMaxNumberOfIterations(int maxIter)
                              throws InvalidArgumentException
Sets the maximum number of K-Means iterations between splits while training a K-Means ClusteringModel. maxIter is a number between 1 and 100.
Parameters:
maxIter - Maximum number of iterations
Throws:
InvalidArgumentException - is thrown
- when maxIter > 100 or maxIter < 1

setMinErrorTolerance

public void setMinErrorTolerance(float minError)
                          throws InvalidArgumentException
Sets the minimum relative change in error between K-Means iterations, below which K-Means is considered to have converged. minError is a number between 0 and 1.
Parameters:
minError - Minimum relative change in error
Throws:
InvalidArgumentException - is thrown
- when minError > 1 or minError < 0
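
The range checks documented for the two setters can be pictured with a stand-in class. This is a sketch only: SettingsRangeSketch is hypothetical and does not use the ODM library, java.lang.IllegalArgumentException stands in for ODM's InvalidArgumentException, and the initial field values simply reuse the recommended values (7 and 0.05), which are not confirmed defaults.

```java
// Hypothetical stand-in; it mirrors the documented ranges, not the ODM source.
final class SettingsRangeSketch {
    private int maxNumberOfIterations = 7;    // recommended value, assumed as initial value
    private float minErrorTolerance = 0.05f;  // recommended value, assumed as initial value

    // Accepts 1..100 inclusive, as documented for setMaxNumberOfIterations.
    void setMaxNumberOfIterations(int maxIter) {
        if (maxIter < 1 || maxIter > 100) {
            throw new IllegalArgumentException("maxIter must be between 1 and 100: " + maxIter);
        }
        this.maxNumberOfIterations = maxIter;
    }

    // Accepts 0..1 inclusive, as documented for setMinErrorTolerance.
    void setMinErrorTolerance(float minError) {
        if (Float.isNaN(minError) || minError < 0f || minError > 1f) {
            throw new IllegalArgumentException("minError must be between 0 and 1: " + minError);
        }
        this.minErrorTolerance = minError;
    }

    int getMaxNumberOfIterations() {
        return maxNumberOfIterations;
    }

    float getMinErrorTolerance() {
        return minErrorTolerance;
    }

    public static void main(String[] args) {
        SettingsRangeSketch settings = new SettingsRangeSketch();
        settings.setMaxNumberOfIterations(20);
        settings.setMinErrorTolerance(0.01f);
        try {
            settings.setMaxNumberOfIterations(101); // outside 1..100
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

A rejected call leaves the previous value in place, so in main the iteration limit is still 20 after the failed attempt to set 101.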

setDistanceFunction

public void setDistanceFunction(DistanceFunction distanceFunction)
Sets the distance function to be used to train a K-Means ClusteringModel.
Parameters:
distanceFunction - Distance function used to train a K-Means ClusteringModel

setStopCriterion

public void setStopCriterion(ClusteringStoppingCriterion stopCriterion)
Sets the StopCriterion to be used to train a K-Means ClusteringModel.
Parameters:
stopCriterion - Stop criterion used to train a K-Means ClusteringModel