Copyright (c) 2003, 2004 Oracle.  All rights reserved.

PL/SQL SAMPLE PROGRAMS FOR ORACLE 10g DATA MINING
-------------------------------------------------

This directory, "$ORACLE_HOME/dm/demo/sample/plsql" on UNIX or
"%ORACLE_HOME%\dm\demo\sample\plsql" on Windows, contains sample programs
for Oracle Data Mining (ODM) illustrating the PL/SQL interface.

There are two kinds of sample programs:
 o Programs named *demo.sql, which use data in the data mining user schema.
 o Programs named *_sh.sql, which use data in the sample schema SH.

You can run the sample programs as a data mining user or as any other
database user that has the same privileges as a data mining user. You
can refer to "$ORACLE_HOME/dm/admin/odmuser.sql" on UNIX or
"%ORACLE_HOME%\dm\admin\odmuser.sql" on Windows for the privileges
required to execute data mining sample programs.  
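
As a quick check before running the samples, the system privileges
enabled in the current session can be listed from the standard
SESSION_PRIVS view and compared by eye with the grants in odmuser.sql.
This is only a sketch; it does not show object-level grants.

```sql
-- List the system privileges enabled in the current session.
SELECT privilege
  FROM session_privs
 ORDER BY privilege;
```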


DATA FOR SAMPLE PROGRAMS
------------------------

The data used by the sample programs named like *demo.sql is in 
"$ORACLE_HOME/dm/demo/data" on UNIX or "%ORACLE_HOME%\dm\demo\data" on
Windows. If you have not already loaded the data, load the sample data
into an appropriate user schema before you run the sample programs. ODM
includes scripts that create an Oracle tablespace, create a user
account, and load the tables.

     For UNIX, use these scripts in "$ORACLE_HOME/dm/admin":
            odmtbs.sql
            odmuser.sql
            dmuserld.sql
     For Windows, use these scripts in "%ORACLE_HOME%\dm\admin":
            odmtbs.sql
            odmuser.sql
            dmuserld.sql

The data used by the sample programs named like *_sh.sql is available in the
SH schema. To grant the necessary access privileges to the data mining user,
your database administrator must run the following script:

     For UNIX, use the script 
            "$ORACLE_HOME/dm/admin/dmshgrants.sql" 
     For Windows, use the script  
            "%ORACLE_HOME%\dm\admin\dmshgrants.sql"

Then, you must run the following script as a data mining user to create the
necessary tables and views:

     For UNIX, use the script
            "$ORACLE_HOME/dm/admin/dmsh.sql"
     For Windows, use the script
            "%ORACLE_HOME%\dm\admin\dmsh.sql"

For more information about loading the data, see the "Oracle Data Mining
Administrator's Guide". 


INFORMATION ABOUT DEMO PROGRAMS
-------------------------------

1. To execute the PL/SQL samples, connect as an appropriate user, such
   as:

       sqlplus dmuser/<password>

   Once you enter the session you may want to execute the following:

       SQL> set echo on -- to enable display of the program while running 
       SQL> set serveroutput on -- to enable outputs from dbms_output
       SQL> @nbdemo.sql
   
   This will display the content of the nbdemo.sql file as it runs,
   and the outputs at each stage of the run.

2. The sample programs are re-executable. Each program cleans up the
   results from a previous run before execution of the current run.
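
   The cleanup idea can be sketched as below. This is an illustration
   under assumptions, not an excerpt from the samples: the model name
   NB_SAMPLE_MODEL and the table name NB_SAMPLE_APPLY_RESULT are
   hypothetical. Errors are ignored because the objects do not exist on
   a first run.

   ```sql
   -- Drop a model left over from a previous run (hypothetical name).
   BEGIN
     DBMS_DATA_MINING.DROP_MODEL('NB_SAMPLE_MODEL');
   EXCEPTION
     WHEN OTHERS THEN
       NULL;  -- nothing to drop on a first run
   END;
   /

   -- Drop a result table left over from a previous run (hypothetical name).
   BEGIN
     EXECUTE IMMEDIATE 'DROP TABLE nb_sample_apply_result';
   EXCEPTION
     WHEN OTHERS THEN
       NULL;  -- nothing to drop on a first run
   END;
   /
   ```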

3. The sample programs for data mining functions are organized as follows:
  
   CLASSIFICATION
   --------------
   These programs demonstrate preprocessing of the build, test, and apply
   data using the DBMS_DATA_MINING_TRANSFORM package. Numerical and
   categorical attributes are binned as part of this step. Then
    - a model is built using training data
    - the model details and settings table are presented
    - the model is tested by applying the model on test data
    - test metrics like confusion matrix, lift, and ROC are presented
    - the model is then applied on scoring data
    - apply results are presented
    - ranked apply results, influenced by a cost matrix, are presented.
   
   nbdemo.sql   - Naive Bayes classification
   svmcdemo.sql - SVM classification
   abndemo.sql  - ABN classification

   nb_sh.sql    - Naive Bayes classification using data in SH
   svmc_sh.sql  - SVM classification using data in SH
   abn_sh.sql   - ABN classification using data in SH
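
   The build and apply steps listed above might look roughly like the
   following sketch. All object names (NB_SAMPLE_SETTINGS,
   NB_SAMPLE_MODEL, BUILD_DATA_BINNED, APPLY_DATA_BINNED, CASE_ID,
   TARGET, NB_SAMPLE_APPLY_RESULT) are hypothetical and are not the
   names used by the sample programs; the input tables are assumed to
   be binned already.

   ```sql
   -- Settings table: one row per setting. Only the algorithm is set here.
   CREATE TABLE nb_sample_settings (
     setting_name  VARCHAR2(30),
     setting_value VARCHAR2(128));

   BEGIN
     INSERT INTO nb_sample_settings (setting_name, setting_value)
     VALUES (DBMS_DATA_MINING.ALGO_NAME, DBMS_DATA_MINING.ALGO_NAIVE_BAYES);
     COMMIT;

     -- Build the model on the training data.
     DBMS_DATA_MINING.CREATE_MODEL(
       model_name          => 'NB_SAMPLE_MODEL',
       mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
       data_table_name     => 'BUILD_DATA_BINNED',
       case_id_column_name => 'CASE_ID',
       target_column_name  => 'TARGET',
       settings_table_name => 'NB_SAMPLE_SETTINGS');

     -- Score new data; APPLY creates the result table.
     DBMS_DATA_MINING.APPLY(
       model_name          => 'NB_SAMPLE_MODEL',
       data_table_name     => 'APPLY_DATA_BINNED',
       case_id_column_name => 'CASE_ID',
       result_table_name   => 'NB_SAMPLE_APPLY_RESULT');
   END;
   /

   -- Each case gets one row per target value, with its probability.
   SELECT case_id, prediction, probability
     FROM nb_sample_apply_result
    ORDER BY case_id, probability DESC;
   ```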

   REGRESSION
   ----------
   The steps are nearly the same as for classification, except that test
   metrics that do not typically apply to regression are omitted from
   the output. Selected attributes of the input data are preprocessed
   (normalized).

   svmrdemo.sql - SVM regression
   svmr_sh.sql  - SVM regression using data in SH
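
   The normalization step could be sketched as follows with
   DBMS_DATA_MINING_TRANSFORM. The table, view, and column names
   (SVMR_SAMPLE_NORM, BUILD_DATA, BUILD_DATA_NORM, CASE_ID, TARGET) are
   hypothetical, and excluding the target from normalization is a choice
   made for this illustration only.

   ```sql
   BEGIN
     -- Table that holds the shift and scale for each numerical column.
     DBMS_DATA_MINING_TRANSFORM.CREATE_NORM_LIN('svmr_sample_norm');

     -- Compute min-max parameters, skipping the case id and the target.
     DBMS_DATA_MINING_TRANSFORM.INSERT_NORM_LIN_MINMAX(
       norm_table_name => 'svmr_sample_norm',
       data_table_name => 'build_data',
       exclude_list    => DBMS_DATA_MINING_TRANSFORM.COLUMN_LIST(
                            'case_id', 'target'));

     -- Create a view that presents the normalized data.
     DBMS_DATA_MINING_TRANSFORM.XFORM_NORM_LIN(
       norm_table_name => 'svmr_sample_norm',
       data_table_name => 'build_data',
       xform_view_name => 'build_data_norm');
   END;
   /
   ```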

   ASSOCIATION
   -----------
   An association model is built, and frequent itemsets and association
   rules are presented as output. Selected attributes of the input
   data are preprocessed (binned).

   ardemo.sql - Association model
   ar_sh.sql  - Association model using data in SH
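
   As a sketch of how the output can be retrieved, frequent itemsets
   and rules are available through table functions in
   DBMS_DATA_MINING. The model name AR_SAMPLE_MODEL is hypothetical.

   ```sql
   -- Frequent itemsets, most frequent first.
   SELECT itemset_id, support, number_of_items
     FROM TABLE(DBMS_DATA_MINING.GET_FREQUENT_ITEMSETS('AR_SAMPLE_MODEL'))
    ORDER BY support DESC;

   -- Association rules, strongest first.
   SELECT rule_id, rule_support, rule_confidence
     FROM TABLE(DBMS_DATA_MINING.GET_ASSOCIATION_RULES('AR_SAMPLE_MODEL'))
    ORDER BY rule_confidence DESC, rule_support DESC;
   ```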

   CLUSTERING
   ----------
   A clustering model is built, and cluster details such as cluster rules,
   centroid and histogram for each cluster are presented as output.
   The model is also used for scoring; probabilities associated with
   each cluster are returned as output. Selected attributes of the input
   data are preprocessed (normalized).

   kmdemo.sql - K-Means clustering model
   km_sh.sql  - K-Means clustering model using data in SH
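
   Scoring with a clustering model might be sketched as below. The
   names KM_SAMPLE_MODEL, APPLY_DATA_NORM, CASE_ID, and
   KM_SAMPLE_APPLY_RESULT are hypothetical; the apply data is assumed
   to be normalized in the same way as the build data.

   ```sql
   BEGIN
     -- APPLY writes one row per case and cluster, with a probability.
     DBMS_DATA_MINING.APPLY(
       model_name          => 'KM_SAMPLE_MODEL',
       data_table_name     => 'APPLY_DATA_NORM',
       case_id_column_name => 'CASE_ID',
       result_table_name   => 'KM_SAMPLE_APPLY_RESULT');
   END;
   /

   -- Keep only the most probable cluster for each case.
   SELECT case_id, cluster_id, probability
     FROM (SELECT case_id, cluster_id, probability,
                  RANK() OVER (PARTITION BY case_id
                               ORDER BY probability DESC) AS rnk
             FROM km_sample_apply_result)
    WHERE rnk = 1
    ORDER BY case_id;
   ```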

   FEATURE EXTRACTION
   ------------------
   A feature extraction model is built, and the details of the model are
   presented as output. The model is then used to score new data; the
   scoring result associates each feature ID with a probability.
   Selected attributes of the input data are preprocessed (normalized).

   nmfdemo.sql - NMF model
   nmf_sh.sql  - NMF model using data in SH
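
   The model details can be read with a query along these lines. This
   is a sketch; the model name NMF_SAMPLE_MODEL is hypothetical.

   ```sql
   -- One row per feature and attribute, with the attribute's coefficient.
   SELECT f.feature_id,
          a.attribute_name,
          a.attribute_value,
          a.coefficient
     FROM TABLE(DBMS_DATA_MINING.GET_MODEL_DETAILS_NMF('NMF_SAMPLE_MODEL')) f,
          TABLE(f.attribute_set) a
    ORDER BY f.feature_id, a.coefficient DESC;
   ```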

   ATTRIBUTE IMPORTANCE
   --------------------
   An attribute importance model is built, and the output of model details
   provides the list of important attributes. Selected attributes of the
   input data are preprocessed (binned).

   aidemo.sql - Attribute importance model
   ai_sh.sql  - Attribute importance model using data in SH
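
   A query of the following shape lists the ranked attributes. It is a
   sketch only; the model name AI_SAMPLE_MODEL is hypothetical.

   ```sql
   -- Attributes ordered from most to least important.
   SELECT attribute_name, importance_value, rank
     FROM TABLE(DBMS_DATA_MINING.GET_MODEL_DETAILS_AI('AI_SAMPLE_MODEL'))
    ORDER BY rank;
   ```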

   BLAST
   -----
   This script demonstrates the use of BLAST table functions against
   the gene data. All BLAST interfaces are demonstrated in this program.

   blastdemo.sql - BLAST demonstration

   TRANSFORMATION
   --------------
   This script is a standalone demonstration for DBMS_DATA_MINING_TRANSFORM.

   xfdemo.sql - Transformation demonstration
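
   For orientation, a typical use of the package follows a
   create/insert/xform pattern. The sketch below shows equi-width
   binning of numerical columns; the names XF_SAMPLE_BIN_NUM,
   BUILD_DATA, BUILD_DATA_BINNED, and CASE_ID are hypothetical, and the
   bin count of 5 is arbitrary.

   ```sql
   BEGIN
     -- Table that holds the bin boundaries for each numerical column.
     DBMS_DATA_MINING_TRANSFORM.CREATE_BIN_NUM('xf_sample_bin_num');

     -- Compute equi-width bin boundaries, skipping the case id.
     DBMS_DATA_MINING_TRANSFORM.INSERT_BIN_NUM_EQWIDTH(
       bin_table_name  => 'xf_sample_bin_num',
       data_table_name => 'build_data',
       bin_num         => 5,
       exclude_list    => DBMS_DATA_MINING_TRANSFORM.COLUMN_LIST('case_id'));

     -- Create a view that presents the binned data.
     DBMS_DATA_MINING_TRANSFORM.XFORM_BIN_NUM(
       bin_table_name  => 'xf_sample_bin_num',
       data_table_name => 'build_data',
       xform_view_name => 'build_data_binned');
   END;
   /
   ```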

   TEXT MINING
   -----------
   Text mining requires that Oracle Text be available.

   There are two basic steps to perform Text Mining using the
   DBMS_DATA_MINING API:

   1. Extract features from a table containing Text documents
      into an input table which constitutes training data.
      Since each document (i.e. case) can contain hundreds of
      features, these features should be provided through a
      nested table column for a given case.

   2. Once such an input table is constructed, then you can use
      the generic API to mine this feature data using a classifier
      such as SVM, or a feature extractor such as NMF.
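
   The shape of such an input table can be sketched as follows. The
   table, column, and feature names are hypothetical, the feature
   values are made up for illustration, and the sketch assumes the
   DM_NESTED_NUMERICALS type is accessible to the current user.

   ```sql
   -- Ordinary attributes plus one nested table column of text features.
   CREATE TABLE text_sample_build (
     case_id       NUMBER,
     category      VARCHAR2(30),
     text_features DM_NESTED_NUMERICALS)
   NESTED TABLE text_features STORE AS text_sample_build_nt;

   -- One document (case) with two illustrative features.
   INSERT INTO text_sample_build VALUES (
     1, 'sports',
     DM_NESTED_NUMERICALS(
       DM_NESTED_NUMERICAL('GOAL',  0.42),
       DM_NESTED_NUMERICAL('MATCH', 0.17)));
   COMMIT;

   -- Flatten the nested features to inspect them.
   SELECT b.case_id, f.attribute_name, f.value
     FROM text_sample_build b, TABLE(b.text_features) f
    ORDER BY b.case_id, f.attribute_name;
   ```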

   There are three samples provided to demonstrate this capability:

   textfe.sql - demonstrates feature extraction into a table
                with nested columns storing the feature data.
                This is a standalone example; the data generated
                cannot be used as input for the next two samples.

                To use this sample, a Text index must first be built
                against the (CLOB or VARCHAR2) column storing the
                text documents.

   textsvmc.sql - demonstrates classification of text data. The
                  training data contains a few mining attributes
                  provided as table columns, along with text features
                  provided via a nested table column.

   textnmf.sql - demonstrates feature extraction of text data. The
                 training data contains a few mining attributes
                 provided as table columns, along with text features
                 provided via a nested table column.

