Oracle Data Miner 10.2.0.1 Beta
October 2005
Table of Contents
What’s New in Oracle Data Miner?
What's New in Oracle Data Mining?
ODM Documentation
Oracle Data Miner Tutorial
How to Start Oracle Data Miner
Define a Database Connection
Oracle Data Miner Install and Uninstall
Oracle Data Miner Requirements
Text Mining Requirements
Publish to OracleBI Discoverer Requirements
Upgrading Oracle Data Miner
Install on Microsoft Windows
Install on UNIX or Linux
Uninstall
Oracle Data Miner Notes
Oracle Data Miner Bugs
Oracle Data Miner 10.2.0.1.0 is the user interface to Oracle Data Mining
(ODM) 10.2. Oracle Data Miner replaces all previous user interfaces to
ODM, including Oracle Data Miner 10.1 and Data Mining for Java (DM4J).
Oracle Data Miner 10.1 and DM4J cannot be used with ODM 10.2. This document
provides a brief overview of new features of ODM and Oracle Data Miner
along with installation instructions.
NOTE: ODM 10.2 Java Code Generator Extension is not available.
Support and Feedback
If you encounter problems when using Oracle Data Miner, you can report
them to Oracle Support (requires current product support contract) at Oracle
MetaLink.
You can post general comments and suggestions to the Data
Mining Discussion Forum on Oracle Technology Network.
What’s New in Oracle Data Miner?
Oracle Data Miner 10.2.0.1.0 targets data analysts more directly than previous
releases. Oracle Data Miner is designed to increase the analyst’s success
rate in properly utilizing ODM algorithms. These goals are addressed in
several ways:
-
Users need more assistance in applying a methodology that addresses both
data preparation and algorithm selection. Oracle Data Miner meets this
need by providing Data Mining Activities to step users through the proper
methodology.
-
Oracle Data Miner includes improved and expanded heuristics in the model
building and transformation wizards to reduce the chance of error in specifying
model and transformation settings.
Oracle Data Miner also supports the new ODM 10.2 features, as described
in What's New in Oracle Data Mining?.
Users can use the native import/export facilities to move mining objects
to other schemas. This is often a required step during deployment. Model
import/export is not supported by Oracle Data Miner; it is supported by
the ODM Java and PL/SQL programmatic interfaces. For information, see Oracle
Data Mining Administrator's Guide.
The rest of this section briefly describes the following new features
of Oracle Data Miner 10.2:
-
Support of ODM 10.2 Java interface
-
Mining activity redesign, with improved heuristics
-
Support for data specified as joins to a case table
-
Support for mixed case names
-
Receiver Operating Characteristics (ROC) for classification model evaluation
-
Residuals Plots for regression model evaluation
-
Publish data mining results to OracleBI Discoverer
-
Predict and Explain automated data mining
Mining Activity Redesign
A Mining Activity is a step-by-step guide for model build, apply, or test.
A Build Activity outlines steps for data preparation, model build, and
model test, where appropriate; the exact steps depend on the algorithm
selected. An Apply Activity guides users through data preparation and model
apply for a model created using a Build Activity or a model created outside
of Data Miner using either the ODM 10.2 Java or PL/SQL API. A Test Activity
guides users through data preparation and model test for a model created
using a Build Activity or a model created outside of Data Miner using either
the ODM 10.2 Java or PL/SQL API.
Mining Activities store metadata about model creation to simplify processing.
For example, data that a model is applied to must be prepared in the same
way as the data used to build the model. An Apply Activity uses the metadata
from the Build Activity to prepare the data correctly for the apply operation.
The Mining Activities for Oracle Data Miner 10.2 have improved heuristics,
especially for the data preparation steps.
Mining Activities are now available from the Activity menu. You
can select a Build activity, an Apply activity, or a Test activity.
Mining Activities support the new functionality of Oracle Data Mining
10.2. For example, there are anomaly detection activities and classification
activities that use the Decision Tree algorithm.
You can run an activity in several ways:
-
Run the activity as soon as the activity definition is complete.
-
Run the activity step by step.
-
Complete the activity by clicking Start; this option runs all remaining
steps of the activity.
Support for Data Specified as Joins to a Case Table
You can specify data as joins to a case table for input to a mining activity.
The joins can be one-to-one or one-to-many. One-to-many joins support transactional
data.
Support for Mixed Case Names
Names of tables and views, columns in tables and views, schemas, and mining
objects (except for model names and test metric names) may have mixed case.
If you want a name to have both uppercase and lowercase letters in it,
enclose the name in double quotation marks (").
Model names and test metric names, however, are restricted to 25 or
fewer characters and must be uppercase letters.
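The naming restriction above can be checked before a name is typed into a wizard. The following is a minimal sketch, not part of Oracle Data Miner: the function name is hypothetical, and it reads "uppercase" as "contains no lowercase letters", since these notes do not say whether digits or underscores are accepted.

```shell
# Returns 0 if the name fits the stated limits for model and test metric names.
# Assumption: "uppercase" means "no lowercase letters"; the notes do not say
# whether digits and underscores are allowed, so they are not rejected here.
is_valid_model_name() {
    name=$1
    # Reject empty names and names longer than 25 characters.
    [ -n "$name" ] || return 1
    [ "${#name}" -le 25 ] || return 1
    # Reject any name that contains a lowercase letter.
    case $name in
        *[abcdefghijklmnopqrstuvwxyz]*) return 1 ;;
    esac
    return 0
}
```

For example, `is_valid_model_name "CHURN_NB_MODEL"` succeeds, while `is_valid_model_name "Churn_Model"` fails.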
Improved Transformation Wizards
Oracle Data Miner has revised versions of the Sample and Stratified Sample
transformation wizards that are easier to use.
New Model and Results Viewers
Model and results viewers were redesigned, and new viewers were created
for new models and results. The new model viewers are the Decision Tree
Viewer, the Residual Plot Viewer, and the Anomaly Detection Viewer. The
new results viewers are viewers for Predict and Explain, the new ROC viewer,
and combined test metric viewers. The new Association Rules viewer incorporates
lookup on item code. Model viewers now support automatic model transparency:
because a model is built on transformed data, results are converted back
through those transformations so that they reflect values in the original
input data.
Model Evaluation
Oracle Data Miner supports Receiver Operating Characteristics (ROC) for
classification models. Oracle Data Miner also supports Test activities.
Receiver Operating Characteristics (ROC) analysis is a useful method
for evaluating classification models. ROC curves can be used to compare
individual models and to determine thresholds which yield a high proportion
of positive hits.
Residuals plots for regression models allow you to identify regions
where the predictions are more accurate and less accurate.
Publish to OracleBI Discoverer Gateway
Tools | Publish to Discoverer Gateway uses OracleBI Discoverer Gateway
to publish data mining results in OracleBI Discoverer. You can publish the
following results:
-
Attribute Importance
-
Association Rules
-
Apply Results
-
Decision Tree Rules
-
Clustering Rules
-
Classification Test Metrics
You can also compare classification test metrics.
For requirements, see Publish to OracleBI
Discoverer Requirements.
Follow these steps to publish data mining results to OracleBI Discoverer:
-
Install required BI and OracleBI Discoverer components.
-
Create a new EUL using OracleBI Discoverer.
-
Register the Oracle Data Miner Gateway with the EUL.
-
Use Oracle Data Miner to publish results to OracleBI Discoverer.
-
Add Oracle Data Miner gateway objects as folders in a business area using
OracleBI Discoverer Administration.
Automated Data Mining
Oracle Data Miner includes the automated data mining of DBMS_PREDICTIVE_ANALYTICS
described at the end of What's New in Oracle Data Mining
10.2?. Oracle Data Miner includes the following new wizards:
-
Data | Explain
-
Data | Predict
Model Wizards Removed
The model build, apply, and test wizards are no longer supported. The only
way to build, apply, or test a model using Oracle Data Miner is to use
an appropriate mining activity.
What’s New in Oracle Data Mining?
Oracle Data Mining 10g Release 2 (10.2) includes the following
new algorithms and features:
-
Decision Tree, a classification algorithm that always returns rules
-
Revised Java interface compliant with the Java Data Mining (JDM) standard
-
Native model import and export between different Oracle 10.2 instances
-
One-class Support Vector Machine (SVM) classifiers for anomaly detection
-
Predictive Analytics to automate the later stages of certain data mining
problems
Oracle Data Miner builds, tests, and applies models using the Java interface.
Models built using the ODM 10.1 Java interface are not compatible with
models built using the ODM 10.2 Java interface. There is no automatic way
to migrate ODM programs written using the ODM 10.1 Java interface to programs
that use the ODM 10.2 Java interface. Models built using the ODM 10.2 PL/SQL
interface and the ODM 10.2 Java interface are compatible.
Data mining models may need to be moved between Oracle databases or
schemas. For example, data mining specialists may build and test data mining
models on one dedicated system. After the models are built and tested,
selected models may be deployed to another system used by applications.
Because the system where the models are developed and the system where
the models are deployed usually do not share the same database, the model
must be exported from the system where it was developed and then imported
to the system where it will be used by applications.
Anomaly Detection models use the one-class SVM algorithm to build models
when there are no counterexamples.
Predictive Analytics is based on the PL/SQL package DBMS_PREDICTIVE_ANALYTICS
that automates the later stages of data mining; it provides the following
functionality:
-
Explain to rank attributes in order of influence in explaining a
target column
-
Predict to predict the value of a target attribute (categorical
or numerical)
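The two operations can also be called directly from PL/SQL. The sketch below is a shell function that prints a PL/SQL script calling DBMS_PREDICTIVE_ANALYTICS.EXPLAIN and DBMS_PREDICTIVE_ANALYTICS.PREDICT. The function name, the table and column names passed to it, and the _EXPLAIN and _PREDICT result table suffixes are illustrative assumptions, not names used by Oracle Data Miner.

```shell
# Print a PL/SQL script that runs EXPLAIN and PREDICT against one table.
# Arguments (all hypothetical): table name, case ID column, target column.
emit_predictive_analytics_sql() {
    table=$1
    case_id=$2
    target=$3
    cat <<EOF
SET SERVEROUTPUT ON
DECLARE
  v_accuracy NUMBER;
BEGIN
  -- Rank the other columns by their influence on the target column.
  DBMS_PREDICTIVE_ANALYTICS.EXPLAIN(
    data_table_name     => '${table}',
    explain_column_name => '${target}',
    result_table_name   => '${table}_EXPLAIN');
  -- Predict the target for every case; accuracy is an OUT parameter.
  DBMS_PREDICTIVE_ANALYTICS.PREDICT(
    accuracy            => v_accuracy,
    data_table_name     => '${table}',
    case_id_column_name => '${case_id}',
    target_column_name  => '${target}',
    result_table_name   => '${table}_PREDICT');
  DBMS_OUTPUT.PUT_LINE('Accuracy: ' || v_accuracy);
END;
/
EOF
}
```

The output can be saved to a file and run in SQL*Plus as the data mining user; the result tables named in the script must not already exist.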
For more information about the new features, see ODM Documentation.
ODM Documentation
Oracle Data Mining 10g Release 2 (10.2) documentation is part of
the Oracle
Database 10g Release 2 Documentation Library. To find ODM documentation,
view or download the library; then click the Data Warehousing tab.
Oracle Data Miner Tutorial
The tutorial for Oracle Data Miner 10.2 is not available for Beta.
How to Start Oracle Data Miner
Start Oracle Data Miner as follows:
-
On Microsoft Windows systems, double-click MINER_HOME\bin\odminerw.exe,
where MINER_HOME is the folder where Oracle Data Miner is installed.
-
On UNIX and Linux systems, run the script odminer in the directory
MINER_HOME/bin, where MINER_HOME is the directory where
Oracle Data Miner is installed.
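On UNIX and Linux, the start step can be wrapped in a small helper. This is a sketch only: the function name is hypothetical, and it assumes the MINER_HOME/bin/odminer layout described above. It adds execute permission if the script does not have it.

```shell
# Locate the odminer launcher under MINER_HOME and make sure it can be run.
# Prints the launcher path on success; returns non-zero if it is missing.
find_odminer_launcher() {
    miner_home=$1
    launcher="$miner_home/bin/odminer"
    if [ ! -f "$launcher" ]; then
        echo "odminer not found under $miner_home/bin" >&2
        return 1
    fi
    # Add execute permission if the script does not already have it.
    [ -x "$launcher" ] || chmod +x "$launcher"
    printf '%s\n' "$launcher"
}
```

A typical use would be `launcher=$(find_odminer_launcher "$HOME/odminer") && "$launcher"`, where `$HOME/odminer` stands in for the actual install directory.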
Define a Database Connection
When you start Oracle Data Miner for the first time, you must define a
database connection.
NOTE: The user name and password that you specify when you define
a connection must satisfy the requirements of Oracle Data Mining. Oracle
Data Mining requires a small number of database permissions, plus SELECT
access to the tables containing data for analysis. For details, see the
Oracle Data Mining Administrator's Guide.
The first time that you start Oracle Data Miner, a dialog appears asking
for the following information:
-
Connection Name, the name of the connection
-
User, the name of the ODM user schema where data mining will take
place
-
Password, the ODM user password
-
Host, the system where ODM is installed
-
Port, the port number for the connection
-
Sid, the SID for the database where ODM is installed
Click OK when you finish the definition. You are returned to the
Choose Connection dialog. You can now select the connection that
you just defined from the dropdown box.
You may need to contact your ODM DBA for this information.
You can define additional connections and edit existing ones:
-
To define a new connection, click New in the Choose Connection
dialog.
-
To edit an existing connection, select the connection in the Connection
dropdown that you want to edit, and click Edit.
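The Host, Port, and Sid fields together identify one database. As an illustration, the sketch below combines them into the SID-style Oracle JDBC thin URL. Whether Oracle Data Miner builds exactly this URL internally is not stated in these notes, and the function name and sample values are hypothetical.

```shell
# Build a SID-style Oracle JDBC thin URL from the Host, Port, and Sid fields.
# Assumption: shown only to illustrate how the three fields fit together.
jdbc_thin_url() {
    host=$1
    port=$2
    sid=$3
    printf 'jdbc:oracle:thin:@%s:%s:%s\n' "$host" "$port" "$sid"
}
```

For example, `jdbc_thin_url dbhost.example.com 1521 orcl` prints `jdbc:oracle:thin:@dbhost.example.com:1521:orcl`.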
Oracle Data Miner Install and Uninstall
Installation instructions depend on the target operating system; you can
install Oracle Data Miner on the following operating systems:
-
Microsoft Windows
-
UNIX and Linux
Before you can use Oracle Data Miner, you must connect to an appropriate
account in an Oracle 10g Release 2 database. Before you can connect,
you must install ODM 10g Release 2 and create at least one user
account for data mining. For information about how to do this, see Oracle
Data Mining Administrator's Guide and the installation instructions
for the platform that you are using.
Oracle Data Miner Requirements
The following are the Oracle Data Miner requirements. These requirements
must be satisfied before you try to build models.
-
The Data Mining option to Oracle 10g Release 2 (10.2) Enterprise
Edition must be installed.
-
Anyone who wants to use Oracle Database must have a user name and password.
Oracle Data Mining requires a small number of database permissions, plus
SELECT access to the tables containing data for analysis. The dmshgrants
SQL script assigns all of the necessary permissions. For details, see Oracle
Data Mining Administrator's Guide.
-
On all platforms, you must have JDK 1.4.2.
-
If you wish to use the ODM sample programs, you must install them from
the Companion CD, as described in Oracle Data Mining Administrator's Guide.
Note that Oracle Data Miner and ODM do not have to be installed on the
same system. For example, you could install ODM on a system running UNIX
and Oracle Data Miner on a PC running Windows.
Text Mining Requirements
The following restrictions apply to text mining using Oracle Data Miner:
-
You can include one text column in a mining operation.
-
The text column must reside in a table, not a view. A mining activity will
create a table, if necessary.
-
The text column must have data type CLOB, BLOB, BFILE, LONG, VARCHAR2,
XMLType, CHAR, RAW, or LONG RAW.
-
You must have Oracle Text installed on the Server. Oracle Text is required
for indexing text columns. Oracle Text is part of Oracle 10g server;
the required INDEXTYPE ctxsys.context is part of the
Seed Database. Oracle Text is installed by default when you install the
Oracle Database. If you explicitly exclude it, you will not be able to
use Oracle Data Miner for text mining.
Publish to OracleBI Discoverer Requirements
If you just want to publish data mining results to the Oracle Data Miner
Discoverer Gateway, no software is required in addition to Oracle Data
Miner. If you intend to use published mining results in OracleBI Discoverer,
the following software is required; in each case click the link to download
the software:
After you install the software, register the Oracle Data Miner Discoverer
Gateway in the EUL to be able to access the published data mining results
from a Discoverer Administrator:
-
If an EUL does not exist, create a new EUL.
Click these links for information about creating an EUL in two
different cases:
-
Register the Oracle Data Miner Discoverer Gateway with the EUL:
Execute the following SQL script in the EUL user to register the
Oracle Data Miner Discoverer Gateway with the EUL:
-- registration script
insert into EUL5_GATEWAYS(
  gw_id,              -- Gateway ID
  gw_type,            -- Type of gateway
  gw_gateway_name,    -- Name of gateway
  gw_product_name,    -- Name of the product
  gw_description,     -- Description of the gateway
  egw_version,        -- Version of the gateway
  egw_database_link,  -- For a remote DB, provide the dblink
  egw_schema,         -- Gateway owner
  egw_sql_paradigm,   -- SQL paradigm
  gw_element_state,   -- Element state
  gw_created_by,      -- Who created this gateway
  gw_created_date,    -- When it was created
  gw_updated_by,      -- Who updated this gateway
  gw_updated_date,    -- When it was updated
  notm
)
values
(
  EUL5_ID_SEQ.NEXTVAL,
  'EGW',
  'ODMr 10.2 Discoverer Gateway',
  'Oracle Data Mining',
  'This gateway provides data mining results accessible to OracleBI',
  '1.1',
  NULL,      -- dblink if DMUSER is in a remote database
  'DMUSER',  -- Change to the schema as needed
  'OBJECT',
  0,
  USER,
  SYSDATE,
  USER,
  SYSDATE,
  0
);
Upgrading Oracle Data Miner
Models built using Oracle Data Miner 10.1 cannot be used with Oracle Data
Miner 10.2.
Oracle Data Miner requires Oracle 10g Release 2; you cannot connect
to any other version of Oracle Database.
Oracle Data Miner 10.1 preferences and connection information will be
migrated automatically when you first launch Data Miner 10.2. You may have
to reenter passwords.
You can have both Oracle Data Miner 10.1 and Oracle Data Miner 10.2
installed on the same system.
Installation on Microsoft Windows
Follow these steps to install Oracle Data Miner on Microsoft Windows:
-
Download odminer.zip.
-
Unzip the entire contents of odminer.zip to the desired Oracle
Data Miner root directory, for example, unzip to C:\odminer.
-
Run (double click) MINER_HOME\bin\odminerw.exe, where MINER_HOME
is the folder where Oracle Data Miner is installed. For example, execute
C:\odminer\bin\odminerw.exe
-
Define a connection as described in Define a Database
Connection.
NOTE: odminer.exe (without the w in its name) displays
a console window that can be used for troubleshooting purposes.
Installation on UNIX or Linux
Installation on UNIX or Linux is similar to installation on Microsoft Windows:
-
Oracle Data Miner on UNIX or Linux requires Java JDK 1.4.2. To check the
version of Java, use the command:
java -version
-
Download odminer.zip.
-
Unzip odminer.zip to the desired Oracle Data Miner root directory;
for example, use the following command to unzip the file to the directory
odminer in the current working directory using the unzip
command:
unzip odminer.zip -d odminer
This command creates the directory odminer (in the current
working directory) and inflates the archive into it.
-
To start Oracle Data Miner, run the script odminer in the directory
MINER_HOME/bin, where MINER_HOME is the directory where
Oracle Data Miner is installed. If the script is not executable, reset
the permissions:
chmod +x odminer
-
Define a connection as described in Define a Database
Connection.
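The Java check in the first step can be scripted. The following sketch tests the first line of the `java -version` output for a 1.4.2 runtime. It assumes the line has the form `java version "1.4.2_08"`; the function name is hypothetical, and the check is deliberately strict because these notes do not say whether other JDK releases work.

```shell
# Succeeds if a "java -version" line reports a 1.4.2 runtime.
# Assumption: the line looks like: java version "1.4.2_08"
is_jdk_142() {
    version_line=$1
    # Extract the quoted version string; empty if the line has none.
    version=$(printf '%s\n' "$version_line" |
        sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case $version in
        1.4.2|1.4.2[_-]*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Because `java -version` writes to standard error, a call would look like `is_jdk_142 "$(java -version 2>&1 | head -n 1)" || echo "JDK 1.4.2 is required"`.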
Uninstall
To uninstall Oracle Data Miner on any platform, delete the directory where
you installed Oracle Data Miner. Make sure that you delete all of the subdirectories
of the directory where you installed Oracle Data Miner.
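On UNIX or Linux, the deletion can be guarded so that the wrong directory is not removed by mistake. This is a sketch under the assumption that an install directory always contains bin/odminer; the function name is hypothetical.

```shell
# Remove an Oracle Data Miner install directory and all of its subdirectories.
# Refuses to delete a directory that does not contain bin/odminer.
uninstall_odminer() {
    miner_home=$1
    if [ -z "$miner_home" ] || [ ! -f "$miner_home/bin/odminer" ]; then
        echo "refusing to delete: no bin/odminer under '$miner_home'" >&2
        return 1
    fi
    rm -rf "$miner_home"
}
```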
Oracle Data Miner Notes
The following notes apply to Oracle Data Miner:
-
This version of Oracle Data Miner is Beta software. All features of Oracle
Data Miner are present.
-
Help is not available for many features; complete help will not be available
until final release.
-
The tutorial is not available for Beta.
-
Oracle Data Mining 10g Release 2 supports two interfaces, a Java
interface and a PL/SQL interface. The ODM 10.2 Java and PL/SQL interfaces
are compatible; for example, you can use the Java interface to apply a
model built using the PL/SQL interface. Oracle Data Miner builds models
and creates results using the ODM 10.2 Java interface. Oracle Data Miner
can be used with mining objects created using either the ODM 10.2 Java
or PL/SQL interface.
-
File Import requires SQL*Loader. You must install Oracle Administrative
Client to have SQL*Loader installed.
-
The user name that you specify when you connect must be the name of a database
user account with the appropriate permissions. See Oracle Data Mining
Administrator's Guide for information about how to create such accounts.
-
Data Miner ships with English only.
Oracle Data Miner Bugs
The following are Oracle Data Miner bugs and Oracle Data Mining bugs that
affect Oracle Data Miner:
-
4610865 - PERFORMANCE ISSUE WITH DISPLAYING HISTOGRAMS ON LARGE DATASET
-
4551579 - MISSING/INCONSISTENT DISPLAY OF NULL VALUES IN SUMMARIZATION
-
Online Help is incomplete.
-
Localization implementation (number rendering) is not validated.
-
ABN Rules are not available in Apply Result. This will be fixed with the
DB 10.2.0.2.0 patch.
-
Model Build Activity Data Usage Validations are not complete.
-
Disco Gateway Publishing of Association Rules is not picking up the
item lookup descriptions.
-
Disco Gateway Publishing of Classification Test Metrics fails.
-
Model Activity Apply is currently generating a table from the selected
case table. This will be changed to a view.
-
Sample Step will fail if you change the default options inappropriately,
for example, if you select stratified sampling for a non-classification
model.
-
Text processing: a JDM API bug results in the column name being used
to generate the internal feature ID table. You can use a given column name
only once per account.
-
Performance issues with build activity when processing large tables/views.
Oracle is a registered trademark of Oracle Corporation.
Other names may be trademarks of their respective owners.
Copyright © 2005 Oracle. All rights reserved.