Mining Activity Overview

A mining activity guides you step by step through building, applying, or testing a model.

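The following topics are discussed: Kinds of Activities, Execute an Activity, Stop an Activity, Reset Steps, Steps in an Activity, and Output of an Activity.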

Kinds of Activities

There are three kinds of mining activities: Build Activity, Apply Activity, and Test Activity.

You can create one or more Mining Activities at any time during an Oracle Data Miner session.

Create a mining activity using the Activity menu.

Build Activity

To create a Build Activity, select Activity | Build. Before the activity can start, you must specify the mining activity type, select the case table or view to be used for building the model, and name the activity. Depending on the activity type selected, you may have a choice of algorithms. For example, if you select the Classification Function Type, you can select one of the following algorithms: Decision Tree (the default), Naive Bayes, Adaptive Bayes Network, or Support Vector Machine. Once you select an algorithm for an activity, you cannot change it. You can also specify which attributes of the case table to include in the model. If the algorithm that you selected requires a target, you must specify a target and target settings.

When you click Finish, the wizard creates a new mining activity with the specified name and displays the activity in the right pane. For example, a Naive Bayes Mining Activity has Sample, Discretize, Split, Build, and Test Metrics steps. The defaults in each step are the appropriate ones for building a Naive Bayes model.

Apply Activity

A model is usually used to make predictions by applying it to new data; that is, the model is used to score new data. Note that not all models can be applied. When you specify an Apply Activity, you must select a model that can be applied.

To create an Apply Activity, select Activity | Apply.

Before the activity can be defined, you must specify the name of the activity, select the table or view containing the data to which the model is applied, and select either the Build Activity used to create the model or a Standalone Model. Build Activities are arranged according to type (Anomaly Detection, Classification, Clustering, Feature Extraction, or Regression) and name.

Before you can apply a model to new data, the data must be prepared in the same way as the data used to build the model. For example, if the data used to build the model was normalized, the new data must be normalized in the same way. If you specified a Build Activity, the Apply Activity automatically performs the data preparation of the new data based on the data preparation of the Build Activity. If you specified a standalone model, you must ensure that the new data is correctly prepared. Data preparation can include indexing text columns, discretization, and normalization. The Apply Activity applies the model to the prepared data. For example, a Naive Bayes Mining Apply Activity for a model that was created using a Build Activity has two steps, Discretize and Apply. The discretization is done in the same way that it was done to build the model.
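
The requirement that new data be prepared "in the same way" can be illustrated with a minimal Python sketch. This is not Oracle Data Miner code; the function names and the min-max normalization scheme are assumptions chosen for illustration. The point it shows is that the normalization parameters are recorded once from the build data and then reused, unchanged, on the data to be scored.

```python
def fit_min_max(values):
    """Record normalization parameters from the build data (hypothetical helper)."""
    return {"min": min(values), "max": max(values)}


def apply_min_max(values, params):
    """Normalize values using parameters recorded at build time."""
    span = params["max"] - params["min"]
    if span == 0:
        # A constant build column carries no scale; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - params["min"]) / span for v in values]


# Build time: parameters come from the build data only.
build_ages = [20, 30, 40, 60]
params = fit_min_max(build_ages)
build_prepared = apply_min_max(build_ages, params)

# Apply time: the new data reuses the build-time parameters.
new_ages = [25, 50, 70]
new_prepared = apply_min_max(new_ages, params)
print(new_prepared)  # [0.125, 0.75, 1.25]
```

Note that 70 maps to 1.25, outside the 0 to 1 range, because it exceeds the maximum seen at build time. Recomputing the minimum and maximum from the new data would hide this and would present the model with values on a different scale from the one it was built on.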

Test Activity

A model is tested using a data set for which the target is known. You apply the model to the data and then compare the predicted values to the known values. Data used to test a model must be prepared in the same way as the data used to build the model. You must ensure that data used to test standalone models is properly prepared.

You can create Test Activities for Classification and Regression models only.

Before the activity can be defined, you must specify the name of the activity, select the table or view containing the test data, and select either the Build Activity used to create the model or a Standalone Model. Build Activities are arranged according to type (Classification or Regression) and name.

Before you can test a model, the data used for testing must be prepared in the same way as the data used to build the model. For example, if the data used to build the model was discretized (binned), the new data must be discretized in the same way. If you specified a Build Activity, the Test Activity automatically performs the data preparation of the new data based on the data preparation of the Build Activity. If you specified a standalone model, you must ensure that the new data is correctly prepared. Data preparation can include indexing text columns, discretization, and normalization. The Test Activity applies the model to the prepared data, compares the predicted values with the actual values, and summarizes the results. For example, a Naive Bayes Mining Test Activity for a model that was created using a Build Activity has two steps, Discretize and Test Metrics. The discretization is done in the same way that it was done to build the model.
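
The two steps described above, Discretize and Test Metrics, can be sketched in Python as follows. This is an illustrative outline only, not Oracle Data Miner code: the bin edges, the lookup-table "model", and the function names are all hypothetical. It shows test data being binned with the edges fixed at build time, and predictions being compared with the known target values.

```python
from bisect import bisect_right
from collections import Counter


def discretize(value, edges):
    """Map a numeric value to a bin index using build-time bin edges."""
    return bisect_right(edges, value)


def compute_test_metrics(actual, predicted):
    """Return (accuracy, confusion counts keyed by (actual, predicted))."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    confusion = Counter(zip(actual, predicted))
    correct = sum(n for (a, p), n in confusion.items() if a == p)
    return correct / len(actual), confusion


# Bin edges chosen at build time (assumed values for illustration).
build_edges = [30, 50]

# Hypothetical stand-in for a built model: one predicted class per bin.
toy_model = {0: "no", 1: "yes", 2: "yes"}

# Test data with known targets.
test_ages = [25, 30, 49, 50, 70]
actual = ["no", "yes", "no", "yes", "yes"]

bins = [discretize(age, build_edges) for age in test_ages]
predicted = [toy_model[b] for b in bins]
accuracy, confusion = compute_test_metrics(actual, predicted)

print(bins)       # [0, 1, 1, 2, 2]
print(accuracy)   # 0.8
```

The confusion counts record each (actual, predicted) pair, so the single error here appears as one case with actual "no" and predicted "yes".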

Execute an Activity

The steps of an activity must be executed in order; the first step is the one closest to the top of the window. Steps can be skipped.

There are many ways to execute an activity, including

You can change options for a step that has not been executed: click Options and make the changes. If a step has been executed, you can view its options but not change them. To change options for a step that has been executed, you must first reset the step.

Stop an Activity

When an activity is running, the Run Activity button turns into a Stop button. To stop the activity, click Stop. The activity terminates at the next opportunity; that is, it completes the currently running step and then stops. You can restart the activity by clicking Run Activity again. Execution picks up from where you left off.
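
The stop-and-resume behavior can be modeled with a short Python sketch. It is a simplified illustration, not the product's implementation; run_activity and its stop_after parameter are hypothetical names, with stop_after standing in for the user clicking Stop while that step is running.

```python
def run_activity(steps, completed, stop_after=None):
    """Run the pending steps in order and return the updated completed list.

    stop_after stands in for a Stop request made while that step runs:
    the running step still finishes, and then execution halts.
    """
    completed = list(completed)
    for name in steps[len(completed):]:
        completed.append(name)  # the current step always runs to completion
        if name == stop_after:
            break
    return completed


steps = ["Sample", "Discretize", "Split", "Build", "Test Metrics"]

# Stop is requested while Discretize is running.
done = run_activity(steps, [], stop_after="Discretize")
print(done)  # ['Sample', 'Discretize']

# Running the activity again picks up at Split.
done = run_activity(steps, done)
print(done)  # ['Sample', 'Discretize', 'Split', 'Build', 'Test Metrics']
```

Because progress is tracked per step, a stop never leaves a step half finished, and a restart never repeats a step that has already completed.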

Reset Steps

You can reset steps that have already been executed. If you click Reset for a particular step, all succeeding steps are also reset. For example, suppose that you decide that the binning in a classification model build was not correct. Click Reset for the Discretize step. All later steps are automatically reset. Make changes to binning by changing the options. Then click Run Activity to execute the reset steps.
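
The cascading effect of Reset can be sketched in Python as follows. This is an illustrative model, not Oracle Data Miner code: the reset_from function and the "Pending" status label are assumptions, while the step names and the Completed and Skipped labels follow the text.

```python
PENDING, COMPLETED, SKIPPED = "Pending", "Completed", "Skipped"


def reset_from(status, steps, step_name):
    """Return a new status map in which step_name and every later step is pending."""
    start = steps.index(step_name)  # raises ValueError for an unknown step
    updated = dict(status)
    for name in steps[start:]:
        updated[name] = PENDING
    return updated


steps = ["Sample", "Discretize", "Split", "Build", "Test Metrics"]
status = {
    "Sample": SKIPPED,
    "Discretize": COMPLETED,
    "Split": COMPLETED,
    "Build": COMPLETED,
    "Test Metrics": COMPLETED,
}

# Resetting Discretize also resets Split, Build, and Test Metrics.
status = reset_from(status, steps, "Discretize")
print(status["Sample"])        # Skipped
print(status["Discretize"])    # Pending
print(status["Test Metrics"])  # Pending
```

Steps before the reset point keep their state, which is why the skipped Sample step is untouched in this example.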

Steps in an Activity

A mining activity is a collection of Steps. The steps are displayed in an activity display. You can start an activity, interrupt it at any time between the steps, and finish it at a later time. Steps must be performed in order, but steps can be skipped. Activity progress is maintained by keeping track of steps that have been completed or skipped.

An activity has two kinds of steps: optional steps and required steps. Optional steps are not required; for example, Sample is an optional step. If you execute an activity without performing an optional step, that step is grayed out. A required step has a check next to its name. You can change steps from required to optional by clicking the checkbox next to the name of the step.

Omitting steps such as discretization or normalization may have a significant impact on the model.

The steps in an activity have carefully selected defaults. These defaults were chosen to return good results in most cases. An activity also restricts choices to avoid errors. An activity can override defaults, such as those specified in Data Mining Preferences. To see the options for a step, click Options; you can change these options.

Each step invokes a wizard to do the required work. To start the wizard, click Run Activity in the step. For example, if you click Run Activity in the Split step, the Split Transformation wizard starts. To complete the step, go through the wizard steps as necessary.

Some steps, such as Build, spawn a task. The spawned task is displayed in the Active Tasks list. To monitor the executing task, right-click the task name (which is the activity name) in the tasks list and select View Task. The task can also be stopped.

After a step is completed, the output of the step (new table or view, model, test metrics, etc.) is displayed. Completed steps are marked with a check mark and the phrase Completed; skipped steps are marked with a "Skipped" icon and the phrase Skipped.

A completed step can be reset to its uncompleted state by clicking Reset; steps after a reset step are also reset. When you reset a step, you are asked if you wish to remove the mining objects and data sources created by the current step and all subsequent steps.

All created mining activities are displayed in the ODM navigator tree; to see a list of defined activities, expand the Mining Activity node. The activities are listed according to model type (Attribute Importance, Association, Classification, Regression, Feature Extraction, or Clustering). To view a particular activity, click its name. To delete an activity, right-click the name of the activity, and select Delete from the context menu. If you right-click a build activity, you can create an apply activity for the model by selecting Apply Activity ... from the context menu.

Output of an Activity

Some steps generate output data or results when they complete. For example, the Split step generates two tables or views, one for build and one for test. The Build step generates output data and results (the model). To view output or results, click the link.
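
As an illustration of the kind of output the Split step produces, the following Python sketch divides a set of case identifiers into a build set and a test set. It is not the product's algorithm; the split_cases function, the 60/40 ratio, and the fixed random seed are assumptions made for the example.

```python
import random


def split_cases(case_ids, build_fraction=0.6, seed=0):
    """Randomly partition case IDs into (build, test) lists.

    build_fraction and seed are illustrative defaults, not product settings.
    """
    shuffled = list(case_ids)
    random.Random(seed).shuffle(shuffled)  # seeded for a repeatable split
    cut = round(len(shuffled) * build_fraction)
    return shuffled[:cut], shuffled[cut:]


build_ids, test_ids = split_cases(range(10))
print(len(build_ids), len(test_ids))  # 6 4
```

Every case lands in exactly one of the two outputs, so the model is never tested on cases that were used to build it.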