The settings for the Support Vector Machine (SVM) algorithm depend on the kernel that you select.
The default settings are designed to work well in most cases.
Click Restore to restore the default values.
When you are done specifying algorithm settings, click OK to continue.
SVM supports two kernel functions: Linear and Gaussian. You can pick one of the kernel functions or you can let the system determine the kernel function. The default is to let the system determine the kernel function.
If you let the system determine the kernel function, the system will also determine all other settings.
For more information, see SVM Kernel Functions.
If you specify the Linear Kernel, you can change the following settings:
If you specify the Gaussian kernel, you can change the following settings:
SVM models grow as the size of the training data set increases. This property limits SVM models to small and medium-sized build data sets (fewer than 100,000 cases). Active learning provides a way to deal with large build data sets.
Active learning forces the SVM algorithm to restrict learning to the most informative examples and not to attempt to use the entire body of data. In most cases, the resulting models have predictive accuracy comparable to that of the standard (exact) SVM model.
Active learning is on by default. It can be turned off.
If you select the Gaussian kernel, you can specify the size for the cache used for storing computed kernels during the build operation. The default size is 50 megabytes.
The most expensive operation in building a Gaussian SVM model is the computation of kernels. The build generally converges on one chunk of data at a time, then tests for violators outside the chunk. The build is complete when there are no more violators within tolerance. The size of the chunk is chosen so that the associated kernels can be kept in memory in a "Kernel Cache". The larger the chunk, the better it represents the population of training data and the fewer times new chunks need to be created. Generally, larger caches imply faster builds.
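The benefit of caching computed kernels can be illustrated with a minimal Python sketch. It is not the Oracle implementation: the toy data, the SIGMA value, and the kernel() helper are hypothetical, and the cache here is sized in entries rather than megabytes.

```python
import math
from functools import lru_cache

# Hypothetical toy data: each tuple is one training case.
DATA = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
SIGMA = 1.0  # assumed standard deviation of the Gaussian kernel


@lru_cache(maxsize=1024)  # capacity counted in entries, not megabytes
def kernel(i, j):
    """Gaussian kernel value between training cases i and j."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(DATA[i], DATA[j]))
    return math.exp(-sq_dist / (2.0 * SIGMA ** 2))


# Simulate two optimization passes over the same chunk of cases.
for _ in range(2):
    for i in range(len(DATA)):
        for j in range(len(DATA)):
            kernel(i, j)

info = kernel.cache_info()
print(info.misses, info.hits)  # kernels computed vs. kernels reused
```

In this sketch the second pass over the chunk is served entirely from the cache, which is the effect a larger kernel cache is meant to produce during a build.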
You specify the complexity factor for an SVM model by clicking the radio button next to Yes in answer to the question Do you want to specify a complexity factor for predictor loss?
The complexity factor determines the trade-off between minimizing model error on the training data and minimizing model complexity. Its purpose is to avoid both over-fit (an over-complex model that fits noise in the training data) and under-fit (a model that is too simple).
A very large value of the complexity factor places an extreme penalty on errors, forcing SVM to seek a perfect separation of target classes. A small value for the complexity factor places a low penalty on errors and high constraints on the model parameters, which can lead to under-fit.
The default is to specify no complexity factor, in which case the system calculates a complexity factor. If you do specify a complexity factor, specify a positive number.
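For reference, the trade-off can be written using the textbook soft-margin SVM objective. This is the standard formulation, shown on the assumption that the complexity factor plays the role of the constant C; the exact form used internally may differ. The symbols w, b, and the slack variables ξ_i are not named elsewhere in this topic.

```latex
\min_{w,\,b,\,\xi}\;\; \frac{1}{2}\,\lVert w \rVert^{2} \;+\; C \sum_{i=1}^{n} \xi_i
\qquad \text{subject to} \qquad
y_i \,( w \cdot x_i + b ) \;\ge\; 1 - \xi_i, \quad \xi_i \ge 0 .
```

A large C makes the error term dominate, so the model bends to classify every training case correctly. A small C makes the first term dominate, which constrains the model parameters more strongly.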
Outlier Rate is the approximate rate of outliers (negative predictions) produced by a one-class SVM model on the training data. Outlier Rate is a number > 0 and <= 1; the default value is 0.05.
If you select the Gaussian kernel, you can specify the standard deviation of the Gaussian kernel. This value must be a positive number. The default is to not specify the standard deviation.
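As a reference point, a common way to write the Gaussian kernel in terms of its standard deviation σ is shown below. This is the usual textbook parameterization and is given here as an assumption, not as the exact form used by the product.

```latex
K(x_i, x_j) \;=\; \exp\!\left( -\,\frac{\lVert x_i - x_j \rVert^{2}}{2\,\sigma^{2}} \right), \qquad \sigma > 0 .
```

Under this form, a small σ makes the kernel value fall off quickly with distance, so only very close cases are treated as similar. A large σ makes distant cases look similar as well.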
The tolerance value is the maximum size of a violation of the convergence criteria at which the model is still considered to have converged. The default value is 0.001. Larger values result in faster builds but less accurate models.
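The speed and accuracy trade-off can be seen in a generic iterative optimizer. The Python sketch below is not the SVM solver: it minimizes the toy function f(x) = x squared by gradient descent and stops once the gradient (standing in for the convergence violation) is within tolerance. The function name and its start and step parameters are hypothetical.

```python
def iterations_to_converge(tolerance, start=1.0, step=0.25):
    """Minimize f(x) = x**2 and stop when |f'(x)| <= tolerance.

    Returns the number of iterations used and the final x.
    The true minimum is at x = 0, so |x| is the remaining error.
    """
    x = start
    iterations = 0
    while abs(2.0 * x) > tolerance:   # gradient of x**2 is 2x
        x -= step * 2.0 * x           # one gradient-descent step
        iterations += 1
    return iterations, x


loose_iters, loose_x = iterations_to_converge(0.1)
tight_iters, tight_x = iterations_to_converge(0.001)
print(loose_iters, loose_x)
print(tight_iters, tight_x)
```

The looser tolerance stops after fewer iterations but leaves the result farther from the true minimum, which mirrors the behavior described for the tolerance setting.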
Copyright © 2005, Oracle. All rights reserved.