Receiver Operating Characteristics

Receiver Operating Characteristics (ROC) analysis is a useful method for evaluating classification models. ROC curves provide a means to compare individual models and to determine thresholds that yield a high proportion of positive hits.

ROC curves are similar to lift charts in that both are used to compare models and to choose thresholds. ROC was originally used in signal detection theory to gauge the trade-off between true hits and false alarms when sending signals over a noisy channel.

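The horizontal axis of an ROC graph measures the false positive rate as a percentage: the proportion of actual negative cases that the model classifies as positive. The vertical axis shows the true positive rate: the proportion of actual positive cases that the model classifies as positive. Each point on the curve corresponds to one threshold value. The top left-hand corner is the optimal location in an ROC graph, indicating a high true positive rate together with a low false positive rate.

The area under the ROC curve measures the discriminating ability of a binary classification model. The larger the area under the curve, the higher the likelihood that an actual positive case will be assigned a higher probability of being positive than an actual negative case. An area of 1.0 corresponds to perfect separation of the two classes, and an area of 0.5 corresponds to random guessing. The area under the curve measure is especially useful for data sets with an unbalanced target distribution (one target class dominates the other), because both rates are computed within a single actual class and so do not depend on the class proportions.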
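The following is a minimal sketch of how the points of an ROC curve and the area under it can be computed from a scored test set. It is written in plain Python for illustration and is not how any particular product computes these values. The function names (roc_points, auc, rank_probability) and the sample labels and scores are hypothetical. The last function checks the interpretation given above: the area equals the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points.

    labels: 1 for an actual positive case, 0 for an actual negative case.
    scores: predicted probability of the positive class for each case.
    Assumes both classes are present in labels.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    # Sweep the threshold from the highest score down to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        score = ranked[i][0]
        # Cases tied at the same score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == score:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def rank_probability(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical test set: eight cases, four positive and four negative.
labels = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]

points = roc_points(labels, scores)
print(points[-1])                        # the curve always ends at (1.0, 1.0)
print(auc(points))                       # 0.75
print(rank_probability(labels, scores))  # 0.75
```

On this sample, 12 of the 16 positive/negative pairs are ordered correctly, which matches the trapezoidal area of 0.75.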

ROC also helps to determine a threshold value that achieves an acceptable trade-off between the hit rate (true positive rate) and the false alarm rate (false positive rate). Each point on the curve for a given model corresponds to one threshold, so selecting a point selects a trade-off. The chosen threshold can then be used as a post-processing parameter: a case is classified as positive only when its predicted probability reaches the threshold, which yields the desired performance with respect to the error rates. ODM models by default use a threshold of 0.5.
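As an illustration of threshold selection, the sketch below picks the threshold that gives the highest hit rate while keeping the false alarm rate at or below a chosen limit, and compares it with the default of 0.5. The names rates_at and pick_threshold, the limit of 0.25, and the sample data are hypothetical. The sketch also assumes that a case is classified as positive when its score is greater than or equal to the threshold; whether a given tool uses "greater than" or "greater than or equal to" is not stated here.

```python
def rates_at(labels, scores, threshold):
    """Return (false positive rate, true positive rate) at one threshold.

    Assumption: a case is predicted positive when score >= threshold.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    tp = sum(1 for l, s in zip(labels, scores) if l == 1 and s >= threshold)
    fp = sum(1 for l, s in zip(labels, scores) if l == 0 and s >= threshold)
    return fp / neg, tp / pos


def pick_threshold(labels, scores, max_fpr):
    """Return (threshold, fpr, tpr) with the highest tpr subject to fpr <= max_fpr.

    Only observed scores are tried as candidate thresholds.
    Returns None when no candidate satisfies the limit.
    """
    best = None
    for t in sorted(set(scores), reverse=True):
        fpr, tpr = rates_at(labels, scores, t)
        if fpr <= max_fpr and (best is None or tpr > best[2]):
            best = (t, fpr, tpr)
    return best


# Hypothetical test set: eight cases, four positive and four negative.
labels = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]

print(rates_at(labels, scores, 0.5))         # (0.5, 0.75) at the default threshold
print(pick_threshold(labels, scores, 0.25))  # (0.6, 0.25, 0.75)
```

On this sample, raising the threshold from 0.5 to 0.6 halves the false alarm rate without lowering the hit rate. In practice the threshold would be chosen on held-out data rather than on the data used to build the model.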