26 Native Datatypes

This chapter discusses the Oracle built-in datatypes, their properties, and how they map to non-Oracle datatypes.

This chapter contains the following topics:

Introduction to Oracle Datatypes

Each column value and constant in a SQL statement has a datatype, which is associated with a specific storage format, constraints, and a valid range of values. When you create a table, you must specify a datatype for each of its columns.

Oracle provides the following categories of built-in datatypes:

See Also:

The following sections that describe each of the built-in datatypes in more detail.

Overview of Character Datatypes

The character datatypes store character (alphanumeric) data in strings, with byte values corresponding to the character encoding scheme, generally called a character set or code page.

The database's character set is established when you create the database. Examples of character sets are 7-bit ASCII (American Standard Code for Information Interchange), EBCDIC (Extended Binary Coded Decimal Interchange Code), Code Page 500, Japan Extended UNIX, and Unicode UTF-8. Oracle supports both single-byte and multibyte encoding schemes.

See Also:

CHAR Datatype

The CHAR datatype stores fixed-length character strings. When you create a table with a CHAR column, you must specify a string length (in bytes or characters) between 1 and 2000 bytes for the CHAR column width. The default is 1 byte. Oracle then guarantees that:

  • When you insert or update a row in the table, the value for the CHAR column has the fixed length.

  • If you give a shorter value, then the value is blank-padded to the fixed length.

  • If a value is too large, Oracle returns an error.

Oracle compares CHAR values using blank-padded comparison semantics.

See Also:

Oracle Database SQL Reference for details about blank-padded comparison semantics

VARCHAR2 and VARCHAR Datatypes

The VARCHAR2 datatype stores variable-length character strings. When you create a table with a VARCHAR2 column, you specify a maximum string length (in bytes or characters) between 1 and 4000 bytes for the VARCHAR2 column. For each row, Oracle stores each value in the column as a variable-length field unless a value exceeds the column's maximum length, in which case Oracle returns an error. Using VARCHAR2 and VARCHAR saves on space used by the table.

For example, assume you declare a column VARCHAR2 with a maximum size of 50 characters. In a single-byte character set, if only 10 characters are given for the VARCHAR2 column value in a particular row, the column in the row's row piece stores only the 10 characters (10 bytes), not 50.

Oracle compares VARCHAR2 values using nonpadded comparison semantics.

See Also:

Oracle Database SQL Reference for details about nonpadded comparison semantics

VARCHAR Datatype

The VARCHAR datatype is synonymous with the VARCHAR2 datatype. To avoid possible changes in behavior, always use the VARCHAR2 datatype to store variable-length character strings.

Length Semantics for Character Datatypes

Globalization support allows the use of various character sets for the character datatypes. Globalization support lets you process single-byte and multibyte character data and convert between character sets. Client sessions can use client character sets that are different from the database character set.

Consider the size of characters when you specify the column length for character datatypes. You must consider this issue when estimating space for tables with columns that contain character data.

The length semantics of character datatypes can be measured in bytes or characters.

  • Byte semantics treat strings as a sequence of bytes. This is the default for character datatypes.

  • Character semantics treat strings as a sequence of characters. A character is technically a codepoint of the database character set.

For single byte character sets, columns defined in character semantics are basically the same as those defined in byte semantics. Character semantics are useful for defining varying-width multibyte strings; it reduces the complexity when defining the actual length requirements for data storage. For example, in a Unicode database (UTF8), you need to define a VARCHAR2 column that can store up to five Chinese characters together with five English characters. In byte semantics, this would require (5*3 bytes) + (1*5 bytes) = 20 bytes; in character semantics, the column would require 10 characters.

VARCHAR2(20 BYTE) and SUBSTRB(<string>, 1, 20) use byte semantics. VARCHAR2(10 CHAR) and SUBSTR(<string>, 1, 10) use character semantics.

The parameter NLS_LENGTH_SEMANTICS decides whether a new column of character datatype uses byte or character semantics. The default length semantic is byte. If all character datatype columns in a database use byte semantics (or all use character semantics) then users do not have to worry about which columns use which semantics. The BYTE and CHAR qualifiers shown earlier should be avoided when possible, because they lead to mixed-semantics databases. Instead, the NLS_LENGTH_SEMANTICS initialization parameter should be set appropriately in the server parameter file (SPFILE) or initialization parameter file, and columns should use the default semantics.

See Also:

NCHAR and NVARCHAR2 Datatypes

NCHAR and NVARCHAR2 are Unicode datatypes that store Unicode character data. The character set of NCHAR and NVARCHAR2 datatypes can only be either AL16UTF16 or UTF8 and is specified at database creation time as the national character set. AL16UTF16 and UTF8 are both Unicode encoding.

  • The NCHAR datatype stores fixed-length character strings that correspond to the national character set.

  • The NVARCHAR2 datatype stores variable length character strings.

When you create a table with an NCHAR or NVARCHAR2 column, the maximum size specified is always in character length semantics. Character length semantics is the default and only length semantics for NCHAR or NVARCHAR2.

For example, if national character set is UTF8, then the following statement defines the maximum byte length of 90 bytes:

CREATE TABLE tab1 (col1 NCHAR(30));

This statement creates a column with maximum character length of 30. The maximum byte length is the multiple of the maximum character length and the maximum number of bytes in each character.

NCHAR

The maximum length of an NCHAR column is 2000 bytes. It can hold up to 2000 characters. The actual data is subject to the maximum byte limit of 2000. The two size constraints must be satisfied simultaneously at run time.

NVARCHAR2

The maximum length of an NVARCHAR2 column is 4000 bytes. It can hold up to 4000 characters. The actual data is subject to the maximum byte limit of 4000. The two size constraints must be satisfied simultaneously at run time.

See Also:

Oracle Database Globalization Support Guide for more information about the NCHAR and NVARCHAR2 datatypes

Use of Unicode Data in an Oracle Database

Unicode is an effort to have a unified encoding of every character in every language known to man. It also provides a way to represent privately-defined characters. A database column that stores Unicode can store text written in any language.

Oracle users deploying globalized applications have a strong need to store Unicode data in Oracle databases. They need a datatype which is guaranteed to be Unicode regardless of the database character set.

Oracle supports a reliable Unicode datatype through NCHAR, NVARCHAR2, and NCLOB. These datatypes are guaranteed to be Unicode encoding and always use character length semantics. The character sets used by NCHAR/NVARCHAR2 can be either UTF8 or AL16UTF16, depending on the setting of the national character set when the database is created. These datatypes allow character data in Unicode to be stored in a database that may or may not use Unicode as database character set.

Implicit Type Conversion

In addition to all the implicit conversions for CHAR/VARCHAR2, Oracle also supports implicit conversion for NCHAR/NVARCHAR2. Implicit conversion between CHAR/VARCHAR2 and NCHAR/NVARCHAR2 is also supported.

LOB Character Datatypes

The LOB datatypes for character data are CLOB and NCLOB. They can store up to 8 terabytes of character data (CLOB) or national character set data (NCLOB).

LONG Datatype

Note:

Do not create tables with LONG columns. Use LOB columns (CLOB, NCLOB) instead. LONG columns are supported only for backward compatibility.

Oracle also recommends that you convert existing LONG columns to LOB columns. LOB columns are subject to far fewer restrictions than LONG columns. Further, LOB functionality is enhanced in every release, whereas LONG functionality has been static for several releases.

Columns defined as LONG can store variable-length character data containing up to 2 gigabytes of information. LONG data is text data that is to be appropriately converted when moving among different systems.

LONG datatype columns are used in the data dictionary to store the text of view definitions. You can use LONG columns in SELECT lists, SET clauses of UPDATE statements, and VALUES clauses of INSERT statements.

See Also:

Overview of Numeric Datatypes

The numeric datatypes store positive and negative fixed and floating-point numbers, zero, infinity, and values that are the undefined result of an operation (that is, is "not a number" or NAN).

NUMBER Datatype

The NUMBER datatype stores fixed and floating-point numbers. Numbers of virtually any magnitude can be stored and are guaranteed portable among different systems operating Oracle, up to 38 digits of precision.

The following numbers can be stored in a NUMBER column:

  • Positive numbers in the range 1 x 10-130 to 9.99...9 x 10125 with up to 38 significant digits

  • Negative numbers from -1 x 10-130 to 9.99...99 x 10125 with up to 38 significant digits

  • Zero

  • Positive and negative infinity (generated only by importing from an Oracle Version 5 database)

For numeric columns, you can specify the column as:

column_name NUMBER 

Optionally, you can also specify a precision (total number of digits) and scale (number of digits to the right of the decimal point):

column_name NUMBER (precision, scale) 

If a precision is not specified, the column stores values as given. If no scale is specified, the scale is zero.

Oracle guarantees portability of numbers with a precision equal to or less than 38 digits. You can specify a scale and no precision:

column_name NUMBER (*, scale) 

In this case, the precision is 38, and the specified scale is maintained.

When you specify numeric fields, it is a good idea to specify the precision and scale. This provides extra integrity checking on input.

Table 26-1 shows examples of how data would be stored using different scale factors.

Table 26-1 How Scale Factors Affect Numeric Data Storage

Input Data Specified As Stored As

7,456,123.89

NUMBER

7456123.89

7,456,123.89

NUMBER(*,1)

7456123.9

7,456,123.89

NUMBER(9)

7456124

7,456,123.89

NUMBER(9,2)

7456123.89

7,456,123.89

NUMBER(9,1)

7456123.9

7,456,123.89

NUMBER(6)

(not accepted, exceeds precision)

7,456,123.89

NUMBER(7,-2)

7456100


If you specify a negative scale, then Oracle rounds the actual data to the specified number of places to the left of the decimal point. For example, specifying (7,-2) means Oracle rounds to the nearest hundredths, as shown in Table 26-1.

For input and output of numbers, the standard Oracle default decimal character is a period, as in the number 1234.56. The decimal is the character that separates the integer and decimal parts of a number. You can change the default decimal character with the initialization parameter NLS_NUMERIC_CHARACTERS. You can also change it for the duration of a session with the ALTER SESSION statement. To enter numbers that do not use the current default decimal character, use the TO_NUMBER function.

Internal Numeric Format

Oracle stores numeric data in variable-length format. Each value is stored in scientific notation, with 1 byte used to store the exponent and up to 20 bytes to store the mantissa. The resulting value is limited to 38 digits of precision. Oracle does not store leading and trailing zeros. For example, the number 412 is stored in a format similar to 4.12 x 102, with 1 byte used to store the exponent(2) and 2 bytes used to store the three significant digits of the mantissa(4,1,2). Negative numbers include the sign in their length.

Taking this into account, the column size in bytes for a particular numeric data value NUMBER(p), where p is the precision of a given value, can be calculated using the following formula:

ROUND((length(p)+s)/2))+1

where s equals zero if the number is positive, and s equals 1 if the number is negative.

Zero and positive and negative infinity (only generated on import from Version 5 Oracle databases) are stored using unique representations. Zero and negative infinity each require 1 byte; positive infinity requires 2 bytes.

Floating-Point Numbers

Oracle provides two numeric datatypes exclusively for floating-point numbers: BINARY_FLOAT and BINARY_DOUBLE. They support all of the basic functionality provided by the NUMBER datatype. However, while NUMBER uses decimal precision, BINARY_FLOAT and BINARY_DOUBLE use binary precision. This enables faster arithmetic calculations and usually reduces storage requirements.

BINARY_FLOAT and BINARY_DOUBLE are approximate numeric datatypes. They store approximate representations of decimal values, rather than exact representations. For example, the value 0.1 cannot be exactly represented by either BINARY_DOUBLE or BINARY_FLOAT. They are frequently used for scientific computations. Their behavior is similar to the datatypes FLOAT and DOUBLE in Java and XMLSchema.

BINARY_FLOAT Datatype

BINARY_FLOAT is a 32-bit, single-precision floating-point number datatype. Each BINARY_FLOAT value requires 5 bytes, including a length byte.

BINARY_DOUBLE Datatype

BINARY_DOUBLE is a 64-bit, double-precision floating-point number datatype. Each BINARY_DOUBLE value requires 9 bytes, including a length byte.

Note:

BINARY_DOUBLE and BINARY_FLOAT implement most of the Institute of Electrical and Electronics Engineers (IEEE) Standard for Binary Floating-Point Arithmetic, IEEE Standard 754-1985 (IEEE754). For a full description of the Oracle implementation of floating-point numbers and its differences from IEEE754, see the Oracle Database SQL Reference

Overview of DATE Datatype

The DATE datatype stores point-in-time values (dates and times) in a table. The DATE datatype stores the year (including the century), the month, the day, the hours, the minutes, and the seconds (after midnight).

Oracle can store dates in the Julian era, ranging from January 1, 4712 BCE through December 31, 4712 CE (Common Era, or 'AD'). Unless BCE ('BC' in the format mask) is specifically used, CE date entries are the default.

Oracle uses its own internal format to store dates. Date data is stored in fixed-length fields of seven bytes each, corresponding to century, year, month, day, hour, minute, and second.

For input and output of dates, the standard Oracle date format is DD-MON-YY, as follows:

'13-NOV-92' 

You can change this default date format for an instance with the parameter NLS_DATE_FORMAT. You can also change it during a user session with the ALTER SESSION statement. To enter dates that are not in standard Oracle date format, use the TO_DATE function with a format mask:

TO_DATE ('November 13, 1992', 'MONTH DD, YYYY') 

Oracle stores time in 24-hour format—HH:MI:SS. By default, the time in a date field is 00:00:00 A.M. (midnight) if no time portion is entered. In a time-only entry, the date portion defaults to the first day of the current month. To enter the time portion of a date, use the TO_DATE function with a format mask indicating the time portion, as in:

INSERT INTO birthdays (bname, bday) VALUES 
    ('ANDY',TO_DATE('13-AUG-66 12:56 A.M.','DD-MON-YY HH:MI A.M.')); 

Use of Julian Dates

Julian dates allow continuous dating by the number of days from a common reference. (The reference is 01-01-4712 years BCE, so current dates are somewhere in the 2.4 million range.) A Julian date is nominally a noninteger, the fractional part being a portion of a day. Oracle uses a simplified approach that results in integer values. Julian dates can be calculated and interpreted differently. The calculation method used by Oracle results in a seven-digit number (for dates most often used), such as 2449086 for 08-APR-93.

Note:

Oracle Julian dates might not be compatible with Julian dates generated by other date algorithms.

The format mask 'J' can be used with date functions (TO_DATE or TO_CHAR) to convert date data into Julian dates. For example, the following query returns all dates in Julian date format:

SELECT TO_CHAR (hire_date, 'J') FROM employees; 

You must use the TO_NUMBER function if you want to use Julian dates in calculations. You can use the TO_DATE function to enter Julian dates:

INSERT INTO employees (hire_date) VALUES (TO_DATE(2448921, 'J')); 

Date Arithmetic

Oracle date arithmetic takes into account the anomalies of the calendars used throughout history. For example, the switch from the Julian to the Gregorian calendar, 15-10-1582, eliminated the previous 10 days (05-10-1582 through 14-10-1582). The year 0 does not exist.

You can enter missing dates into the database, but they are ignored in date arithmetic and treated as the next "real" date. For example, the next day after 04-10-1582 is 15-10-1582, and the day following 05-10-1582 is also 15-10-1582.

Note:

This discussion of date arithmetic might not apply to all countries' date standards (such as those in Asia).

Centuries and the Year 2000

Oracle stores year data with the century information. For example, the Oracle database stores 1996 or 2001, and not simply 96 or 01. The DATE datatype always stores a four-digit year internally, and all other dates stored internally in the database have four digit years. Oracle utilities such as import, export, and recovery also deal with four-digit years.

Daylight Savings Support

Oracle Database provides daylight savings support for DATETIME datatypes in the server. You can insert and query DATETIME values based on local time in a specific region. The DATETIME datatypes TIMESTAMP WITH TIME ZONE and TIMESTAMP WITH LOCAL TIME ZONE are time-zone aware.

See Also:

Time Zones

You can include the time zone in your date/time data and provides support for fractional seconds. Three new datatypes are added to DATE, with the following differences:

Datatype Time Zone Fractional Seconds
DATE No No
TIMESTAMP No Yes
TIMESTAMP WITH TIME ZONE Explicit Yes
TIMESTAMP WITH LOCAL TIME ZONE Relative Yes

TIMESTAMP WITH LOCAL TIME ZONE is stored in the database time zone. When a user selects the data, the value is adjusted to the user's session time zone.

For example, a San Francisco database has system time zone = -8:00. When a New York client (session time zone = -5:00) inserts into or selects from the San Francisco database, TIMESTAMP WITH LOCAL TIME ZONE data is adjusted as follows:

  • The New York client inserts TIMESTAMP'1998-1-23 6:00:00-5:00' into a TIMESTAMP WITH LOCAL TIME ZONE column in the San Francisco database. The inserted data is stored in San Francisco as binary value 1998-1-23 3:00:00.

  • When the New York client selects that inserted data from the San Francisco database, the value displayed in New York is '1998-1-23 6:00:00'.

  • A San Francisco client, selecting the same data, see the value '1998-1-23 3:00:00'.

Note:

To avoid unexpected results in your DML operations on datatime data, you can verify the database and session time zones by querying the built-in SQL functions DBTIMEZONE and SESSIONTIMEZONE. If the database time zone or the session time zone has not been set manually, Oracle uses the operating system time zone by default. If the operating system time zone is not a valid Oracle time zone, then Oracle uses UTC as the default value.

See Also:

Oracle Database SQL Reference for details about the syntax of creating and entering data in time stamp columns

Overview of LOB Datatypes

The LOB datatypes BLOB, CLOB, NCLOB, and BFILE enable you to store and manipulate large blocks of unstructured data (such as text, graphic images, video clips, and sound waveforms) in binary or character format. They provide efficient, random, piece-wise access to the data. Oracle recommends that you always use LOB datatypes over LONG datatypes. You can perform parallel queries (but not parallel DML or DDL) on LOB columns.

LOB datatypes differ from LONG and LONG RAW datatypes in several ways. For example:

  • A table can contain multiple LOB columns but only one LONG column.

  • A table containing one or more LOB columns can be partitioned, but a table containing a LONG column cannot be partitioned.

  • The maximum size of a LOB is 8 terabytes, and the maximum size of a LONG is only 2 gigabytes.

  • LOBs support random access to data, but LONGs support only sequential access.

  • LOB datatypes (except NCLOB) can be attributes of a user-defined object type but LONG datatypes cannot.

  • Temporary LOBs that act like local variables can be used to perform transformations on LOB data. Temporary internal LOBs (BLOBs, CLOBs, and NCLOBs) are created in a temporary tablespace and are independent of tables. For LONG datatypes, however, no temporary structures are available.

  • Tables with LOB columns can be replicated, but tables with LONG columns cannot.

SQL statements define LOB columns in a table and LOB attributes in a user-defined object type. When defining LOBs in a table, you can explicitly specify the tablespace and storage characteristics for each LOB.

LOB datatypes can be stored inline (within a table), out-of-line (within a tablespace, using a LOB locator), or in an external file (BFILE datatypes). With compatibility set to Oracle9i or higher, you can use LOBs with SQL VARCHAR operators and functions.

See Also:

BLOB Datatype

The BLOB datatype stores unstructured binary data in the database. BLOBs can store up to 8 terabytes of binary data.

BLOBs participate fully in transactions. Changes made to a BLOB value by the DBMS_LOB package, PL/SQL, or the OCI can be committed or rolled back. However, BLOB locators cannot span transactions or sessions.

CLOB and NCLOB Datatypes

The CLOB and NCLOB datatypes store up to 8 terabytes of character data in the database. CLOBs store database character set data, and NCLOBs store Unicode national character set data. Storing varying-width LOB data in a fixed-width Unicode character set internally enables Oracle to provide efficient character-based random access on CLOBs and NCLOBs.

CLOBs and NCLOBs participate fully in transactions. Changes made to a CLOB or NCLOB value by the DBMS_LOB package, PL/SQL, or the OCI can be committed or rolled back. However, CLOB and NCLOB locators cannot span transactions or sessions. You cannot create an object type with NCLOB attributes, but you can specify NCLOB parameters in a method for an object type.

See Also:

Oracle Database Globalization Support Guide for more information about national character set data and Unicode

BFILE Datatype

The BFILE datatype stores unstructured binary data in operating-system files outside the database. A BFILE column or attribute stores a file locator that points to an external file containing the data. BFILEs can store up to 8 terabytes of data.

BFILEs are read only; you cannot modify them. They support only random (not sequential) reads, and they do not participate in transactions. The underlying operating system must maintain the file integrity, security, and durability for BFILEs. The database administrator must ensure that the file exists and that Oracle processes have operating-system read permissions on the file.

Overview of RAW and LONG RAW Datatypes

Note:

The LONG RAW datatype is provided for backward compatibility with existing applications. For new applications, use the BLOB and BFILE datatypes for large amounts of binary data.

Oracle also recommends that you convert existing LONG RAW columns to LOB columns. LOB columns are subject to far fewer restrictions than LONG columns. Further, LOB functionality is enhanced in every release, whereas LONG RAW functionality has been static for several releases.

The RAW and LONG RAW datatypes are used for data that is not to be interpreted (not converted when moving data between different systems) by Oracle. These datatypes are intended for binary data or byte strings. For example, LONG RAW can be used to store graphics, sound, documents, or arrays of binary data. The interpretation depends on the use.

RAW is a variable-length datatype like the VARCHAR2 character datatype, except Oracle Net Services (which connects user sessions to the instance) and the Import and Export utilities do not perform character conversion when transmitting RAW or LONG RAW data. In contrast, Oracle Net Services and Import/Export automatically convert CHAR, VARCHAR2, and LONG data between the database character set and the user session character set, if the two character sets are different.

When Oracle automatically converts RAW or LONG RAW data to and from CHAR data, the binary data is represented in hexadecimal form with one hexadecimal character representing every four bits of RAW data. For example, one byte of RAW data with bits 11001011 is displayed and entered as 'CB'.

LONG RAW data cannot be indexed, but RAW data can be indexed.

See Also:

Oracle Database Application Developer's Guide - Fundamentals for information about other restrictions on the LONG RAW datatype

Overview of ROWID and UROWID Datatypes

Oracle uses a ROWID datatype to store the address (rowid) of every row in the database.

  • Physical rowids store the addresses of rows in ordinary tables (excluding index-organized tables), clustered tables, table partitions and subpartitions, indexes, and index partitions and subpartitions.

  • Logical rowids store the addresses of rows in index-organized tables.

A single datatype called the universal rowid, or UROWID, supports both logical and physical rowids, as well as rowids of foreign tables such as non-Oracle tables accessed through a gateway.

A column of the UROWID datatype can store all kinds of rowids. The value of the COMPATIBLE initialization parameter (for file format compatibility) must be set to 8.1 or higher to use UROWID columns.

The ROWID Pseudocolumn

Each table in an Oracle database internally has a pseudocolumn named ROWID. This pseudocolumn is not evident when listing the structure of a table by executing a SELECT * FROM ... statement, or a DESCRIBE ... statement using SQL*Plus, nor does the pseudocolumn take up space in the table. However, each row's address can be retrieved with a SQL query using the reserved word ROWID as a column name, for example:

SELECT ROWID, last_name FROM employees; 

You cannot set the value of the pseudocolumn ROWID in INSERT or UPDATE statements, and you cannot delete a ROWID value. Oracle uses the ROWID values in the pseudocolumn ROWID internally for the construction of indexes.

You can reference rowids in the pseudocolumn ROWID like other table columns (used in SELECT lists and WHERE clauses), but rowids are not stored in the database, nor are they database data. However, you can create tables that contain columns having the ROWID datatype, although Oracle does not guarantee that the values of such columns are valid rowids. The user must ensure that the data stored in the ROWID column truly is a valid ROWID.

Physical Rowids

Physical rowids provide the fastest possible access to a row of a given table. They contain the physical address of a row (down to the specific block) and allow you to retrieve the row in a single block access. Oracle guarantees that as long as the row exists, its rowid does not change. These performance and stability qualities make rowids useful for applications that select a set of rows, perform some operations on them, and then access some of the selected rows again, perhaps with the purpose of updating them.

Every row in a nonclustered table is assigned a unique rowid that corresponds to the physical address of a row's row piece (or the initial row piece if the row is chained among multiple row pieces). In the case of clustered tables, rows in different tables that are in the same data block can have the same rowid.

A row's assigned rowid remains unchanged unless the row is exported and imported using the Import and Export utilities. When you delete a row from a table and then commit the encompassing transaction, the deleted row's associated rowid can be assigned to a row inserted in a subsequent transaction.

A physical rowid datatype has one of two formats:

  • The extended rowid format supports tablespace-relative data block addresses and efficiently identifies rows in partitioned tables and indexes as well as nonpartitioned tables and indexes. Tables and indexes created by an Oracle8i (or higher) server always have extended rowids.

  • A restricted rowid format is also available for backward compatibility with applications developed with Oracle database version 7 or earlier releases.

Extended Rowids

Extended rowids use a base 64 encoding of the physical address for each row selected. The encoding characters are A-Z, a-z, 0-9, +, and /. For example, the following query:

SELECT ROWID, last_name FROM employees WHERE department_id = 20; 

can return the following row information:

ROWID              LAST_NAME 
------------------ ---------- 
AAAAaoAATAAABrXAAA BORTINS 
AAAAaoAATAAABrXAAE RUGGLES 
AAAAaoAATAAABrXAAG CHEN 
AAAAaoAATAAABrXAAN BLUMBERG 

An extended rowid has a four-piece format, OOOOOOFFFBBBBBBRRR:

  • OOOOOO: The data object number that identifies the database segment (AAAAao in the example). Schema objects in the same segment, such as a cluster of tables, have the same data object number.

  • FFF: The tablespace-relative datafile number of the datafile that contains the row (file AAT in the example).

  • BBBBBB: The data block that contains the row (block AAABrX in the example). Block numbers are relative to their datafile, not tablespace. Therefore, two rows with identical block numbers could reside in two different datafiles of the same tablespace.

  • RRR: The row in the block.

You can retrieve the data object number from data dictionary views USER_OBJECTS, DBA_OBJECTS, and ALL_OBJECTS. For example, the following query returns the data object number for the employees table in the SCOTT schema:

SELECT DATA_OBJECT_ID FROM DBA_OBJECTS 
    WHERE OWNER = 'SCOTT' AND OBJECT_NAME = 'EMPLOYEES'; 

You can also use the DBMS_ROWID package to extract information from an extended rowid or to convert a rowid from extended format to restricted format (or vice versa).

See Also:

Oracle Database Application Developer's Guide - Fundamentals for information about the DBMS_ROWID package

Restricted Rowids

Restricted rowids use a binary representation of the physical address for each row selected. When queried using SQL*Plus, the binary representation is converted to a VARCHAR2/hexadecimal representation. The following query:

SELECT ROWID, last_name FROM employees 
    WHERE department_id = 30; 

can return the following row information:

ROWID              ENAME 
------------------ ---------- 
00000DD5.0000.0001 KRISHNAN 
00000DD5.0001.0001 ARBUCKLE 
00000DD5.0002.0001 NGUYEN 

As shown, a restricted rowid's VARCHAR2/hexadecimal representation is in a three-piece format, block.row.file:

  • The data block that contains the row (block DD5 in the example). Block numbers are relative to their datafile, not tablespace. Two rows with identical block numbers could reside in two different datafiles of the same tablespace.

  • The row in the block that contains the row (rows 0, 1, 2 in the example). Row numbers of a given block always start with 0.

  • The datafile that contains the row (file 1 in the example). The first datafile of every database is always 1, and file numbers are unique within a database.

Examples of Rowid Use

You can use the function SUBSTR to break the data in a rowid into its components. For example, you can use SUBSTR to break an extended rowid into its four components (database object, file, block, and row):

SELECT ROWID, 
       SUBSTR(ROWID,1,6) "OBJECT", 
       SUBSTR(ROWID,7,3) "FIL", 
       SUBSTR(ROWID,10,6) "BLOCK", 
       SUBSTR(ROWID,16,3) "ROW" 
       FROM products; 

ROWID               OBJECT  FIL  BLOCK   ROW 
------------------  ------  ---  ------  ---- 
AAAA8mAALAAAAQkAAA  AAAA8m  AAL  AAAAQk  AAA 
AAAA8mAALAAAAQkAAF  AAAA8m  AAL  AAAAQk  AAF 
AAAA8mAALAAAAQkAAI  AAAA8m  AAL  AAAAQk  AAI 

Or you can use SUBSTR to break a restricted rowid into its three components (block, row, and file):

SELECT ROWID, SUBSTR(ROWID,15,4) "FILE", 
       SUBSTR(ROWID,1,8) "BLOCK", 
       SUBSTR(ROWID,10,4) "ROW" 
       FROM products; 

ROWID               FILE  BLOCK     ROW 
------------------  ----  --------  ---- 
00000DD5.0000.0001  0001  00000DD5  0000 
00000DD5.0001.0001  0001  00000DD5  0001 
00000DD5.0002.0001  0001  00000DD5  0002 

Rowids can be useful for revealing information about the physical storage of a table's data. For example, if you are interested in the physical location of a table's rows (such as for table striping), the following query of an extended rowid tells how many datafiles contain rows of a given table:

SELECT COUNT(DISTINCT(SUBSTR(ROWID,7,3))) "FILES" FROM tablename; 

  FILES 
-------- 
    2 

How Rowids Are Used

Oracle uses rowids internally for the construction of indexes. Each key in an index is associated with a rowid that points to the associated row's address for fast access. End users and application developers can also use rowids for several important functions:

  • Rowids are the fastest means of accessing particular rows.

  • Rowids can be used to see how a table is organized.

  • Rowids are unique identifiers for rows in a given table.

Before you use rowids in DML statements, they should be verified and guaranteed not to change. The intended rows should be locked so they cannot be deleted. Under some circumstances, requesting data with an invalid rowid could cause a statement to fail.

You can also create tables with columns defined using the ROWID datatype. For example, you can define an exception table with a column of datatype ROWID to store the rowids of rows in the database that violate integrity constraints. Columns defined using the ROWID datatype behave like other table columns: values can be updated, and so on. Each value in a column defined as datatype ROWID requires six bytes to store pertinent column data.

Logical Rowids

Rows in index-organized tables do not have permanent physical addresses—they are stored in the index leaves and can move within the block or to a different block as a result of insertions. Therefore their row identifiers cannot be based on physical addresses. Instead, Oracle provides index-organized tables with logical row identifiers, called logical rowids, that are based on the table's primary key. Oracle uses these logical rowids for the construction of secondary indexes on index-organized tables.

Each logical rowid used in a secondary index includes a physical guess, which identifies the block location of the row in the index-organized table at the time the guess was made; that is, when the secondary index was created or rebuilt.

Oracle can use guesses to probe into the leaf block directly, bypassing the full key search. This ensures that rowid access of nonvolatile index-organized tables gives comparable performance to the physical rowid access of ordinary tables. In a volatile table, however, if the guess becomes stale the probe can fail, in which case a primary key search must be performed.

The values of two logical rowids are considered equal if they have the same primary key values but different guesses.

Comparison of Logical Rowids with Physical Rowids

Logical rowids are similar to the physical rowids in the following ways:

  • Logical rowids are accessible through the ROWID pseudocolumn.

    You can use the ROWID pseudocolumn to select logical rowids from an index-organized table. The SELECT ROWID statement returns an opaque structure, which internally consists of the table's primary key and the physical guess (if any) for the row, along with some control information.

    You can access a row using predicates of the form WHERE ROWID = value, where value is the opaque structure returned by SELECT ROWID.

  • Access through the logical rowid is the fastest way to get to a specific row, although it can require more than one block access.

  • A row's logical rowid does not change as long as the primary key value does not change. This is less stable than the physical rowid, which stays immutable through all updates to the row.

  • Logical rowids can be stored in a column of the UROWID datatype

One difference between physical and logical rowids is that logical rowids cannot be used to see how a table is organized.

Note:

An opaque type is one whose internal structure is not known to the database. The database provides storage for the type. The type designer can provide access to the contents of the type by implementing functions, typically 3GL routines.

Guesses in Logical Rowids

When a row's physical location changes, the logical rowid remains valid even if it contains a guess, although the guess could become stale and slow down access to the row. Guess information cannot be updated dynamically. For secondary indexes on index-organized tables, however, you can rebuild the index to obtain fresh guesses. Note that rebuilding a secondary index on an index-organized table involves reading the base table, unlike rebuilding an index on an ordinary table.

Collect index statistics with the DBMS_STATS package or ANALYZE statement to keep track of the staleness of guesses, so Oracle does not use them unnecessarily. This is particularly important for applications that store rowids with guesses persistently in a UROWID column, then retrieve the rowids later and use them to fetch rows.

When you collect index statistics with the DBMS_STATS package or ANALYZE statement, Oracle checks whether the existing guesses are still valid and records the percentage of stale/valid guesses in the data dictionary. After you rebuild a secondary index (recomputing the guesses), collect index statistics again.

In general, logical rowids without guesses provide the fastest possible access for a highly volatile table. If a table is static or if the time between getting a rowid and using it is sufficiently short to make row movement unlikely, logical rowids with guesses provide the fastest access.

See Also:

Oracle Database Performance Tuning Guide for more information about collecting statistics

Rowids in Non-Oracle Databases

Oracle database applications can be run against non-Oracle database servers using SQL*Connect. The format of rowids varies according to the characteristics of the non-Oracle system. Furthermore, no standard translation to VARCHAR2/hexadecimal format is available. Programs can still use the ROWID datatype. However, they must use a nonstandard translation to hexadecimal format of length up to 256 bytes.

Rowids of a non-Oracle database can be stored in a column of the UROWID datatype.

See Also:

Overview of ANSI, DB2, and SQL/DS Datatypes

SQL statements that create tables and clusters can also use ANSI datatypes and datatypes from IBM's products SQL/DS and DB2. Oracle recognizes the ANSI or IBM datatype name that differs from the Oracle datatype name, records it as the name of the datatype of the column, and then stores the column's data in an Oracle datatype based on the conversions.

See Also:

Oracle Database SQL Reference for more information about the conversions

Overview of XML Datatypes

Oracle provides the XMLType datatype to handle XML data.

XMLType Datatype

XMLType can be used like any other user-defined type. XMLType can be used as the datatype of columns in tables and views. Variables of XMLType can be used in PL/SQL stored procedures as parameters, return values, and so on. You can also use XMLType in PL/SQL, SQL and Java, and through JDBC and OCI.

A number of useful functions that operate on XML content have been provided. Many of these are provided both as SQL functions and as member functions of XMLType. For example, function extract extracts a specific node(s) from an XMLType instance. You can use XMLType in SQL queries in the same way as any other user-defined datatypes in the system.

Overview of URI Datatypes

A URI, or uniform resource identifier, is a generalized kind of URL. Like a URL, it can reference any document, and can reference a specific part of a document. It is more general than a URL because it has a powerful mechanism for specifying the relevant part of the document. By using UriType, you can do the following:

  • Create table columns that point to data inside or outside the database.

  • Query the database columns using functions provided by UriType.

Overview of Data Conversion

In some cases, Oracle supplies data of one datatype where it expects data of a different datatype. This is allowed when Oracle can automatically convert the data to the expected datatype.

See Also:

Oracle Database SQL Reference for the rules for implicit datatype conversions