Oracle Data Mining (ODM) lets you combine text columns with non-text (traditional categorical and numerical)
columns to build regression, classification, and feature extraction models.
A text column is a column with one of the following data types: VARCHAR2, CHAR, BFILE, XMLTYPE, URITYPE, BLOB, CLOB, RAW, or LONG RAW. For example, in a medical application,
the input table might consist of measurements (numeric values representing temperature,
blood pressure, or other measurements) and a text column consisting of physician's comments.
For detailed information about text mining in ODM, see Oracle Data Mining Concepts.
ODM supports text mining as follows:
The following restrictions apply to text mining using Oracle Data Miner:
A text column must have one of the following data types: VARCHAR2, CHAR, BFILE, XMLTYPE, URITYPE, BLOB, CLOB, RAW, or LONG RAW.
The text column in a table must be transformed before you use the table in data mining operations. The text column must be transformed into a nested table consisting of items of type DM_Nested_Numericals or DM_Nested_Categoricals.
All text columns in a sequence of mining operations must be transformed in the same way. For example, the text columns in the tables used to build, test, and apply a given model must all be prepared in the same way.
In most text mining operations, you do not mine text directly; instead, you mine text features. The Text Transformation Wizard automatically extracts features from the text.
To extract text features and convert text columns to nested tables, use the Text Transformation Wizard at Data | Transform | Text.
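The idea of mining text features rather than raw text, and of preparing build, test, and apply data in the same way, can be illustrated with a small conceptual sketch. This is not the algorithm used by the Text Transformation Wizard, which is not described here. The function names (tokenize, fit_vocabulary, to_nested) and the sample comments are hypothetical, and the (attribute_name, value) pairs are only assumed to resemble the items held in a nested numerical column.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def fit_vocabulary(documents, max_terms=5):
    """Pick the most frequent terms in the build data as the feature set."""
    counts = Counter(tok for doc in documents for tok in tokenize(doc))
    # Sort by descending count, then alphabetically, so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:max_terms]]


def to_nested(text, vocabulary):
    """Return (attribute_name, value) pairs for vocabulary terms found in text."""
    counts = Counter(tokenize(text))
    return [(term, float(counts[term])) for term in vocabulary if counts[term] > 0]


# Hypothetical physician's comments used to build a model.
build_comments = [
    "Patient reports mild fever and cough",
    "Fever persists, cough improving",
    "No fever, patient stable",
]

# The feature set is derived once, from the build data only.
vocabulary = fit_vocabulary(build_comments, max_terms=3)

# The same vocabulary is reused for data the model is applied to,
# so that all text columns are prepared in the same way.
apply_comment = "Cough and fever returned"
nested_row = to_nested(apply_comment, vocabulary)
```

Reusing the vocabulary derived from the build data is what keeps the transformation consistent: terms that appear only in the apply data, such as "returned" above, are ignored rather than becoming new features.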
Copyright © 2005, Oracle. All rights reserved.