Oracle7 Server Utilities User's Guide | ![]() Library |
![]() Product |
![]() Contents |
![]() Index |
The information in this chapter falls into the following main categories:
General Syntactical Information:
Brackets enclose optional items. For example:
[ INTERNAL ]
If you include the optional item, omit the brackets.
Vertical Bar |
A vertical bar separates alternative items within brackets or braces. For example:
[ ASC | DESC ]
Type one of the items, but omit the vertical bar and brackets.
Braces { }
Braces enclose two or more alternative mandatory items. For example:
DEFINE { procedure_name | function_name }
Type one of the items, but omit the braces and vertical bar.
Ellipsis ...
An ellipsis represents an arbitrary number of one or more similar items. For example:
CONCAT ( column1, column2, ... columnN )
where N >= 3.
Type similar items and delimiters as required, but omit the ellipsis.
Symbols
Type the following symbols precisely as they appear in the syntax notation:
For details of the notation used in the syntax diagrams in this Reference, see the PL/SQL User's Guide and Reference or the preface in the Oracle7 Server SQL Reference.
The position must be surrounded by parentheses. The starting location may be specified as a column number, as * (next column), or *+n (next column plus an offset). The start and end locations may be separated with either a colon (:) or a dash (-).
The char_string and hex_string can be enclosed in either single quotation marks or double quotation marks. The hex_string is a string of hexadecimal digits, where each pair of digits corresponds to one byte in the field. The BLANKS keyword allows you to test a field to see if it consists entirely of blanks. It is necessary when you are loading delimited data and you cannot predict the length of the field, or when using a multi-byte character set that has multiple blanks.
There must not be any spaces between the operator and the operands on either side of it. Thus,
(1)='x'
is legal, while
(1) = 'x'
generates an error.
For more information, see "Specifying Delimiters" .
--This is a comment
All text to the right of the double hyphen is ignored, until the end of the line. Case 3 contains an example in a control file.
SKIP = n
LOAD = n
ERRORS = n
ROWS = n
BINDSIZE = n
SILENT = {FEEDBACK | ERRORS | DISCARDS | ALL}
DIRECT = {TRUE | FALSE}
PARALLEL = {TRUE | FALSE}
For example:
OPTIONS (BINDSIZE=100000, SILENT=(ERRORS, FEEDBACK) )
Values specified on the command line override values specified in the OPTIONS statement of the control file. The OPTIONS keyword file establishes default values that are easily changed from the command line.
RECOVERABLE
Loaded data is logged in the redo log. This option is the default for direct path loads. All conventional loads are recoverable.
UNRECOVERABLE
This option can be specified for a direct path load only. Loaded data is not logged, which improves load performance. (Other changes to the database are logged.) For details, see "Specifying UNRECOVERABLE" . This option cannot be specified with a conventional load.
COUNT DATA DATE FORMAT
OPTIONS PART POSITION
So if you had an inventory system with columns named PART, COUNT, and DATA, you would specify these column names within double quotation marks in your SQL*Loader control file. For example:
INTO TABLE inventory
(partnum INTEGER,
"PART" CHAR(15),
"COUNT" INTEGER,
"DATA" VARCHAR2(30))
See Appendix B, "Reserved Words", for a complete list of reserved words.
You use double quotation marks if the object name contains special characters other than those recognized by SQL ($, #, _), or if the name is case sensitive.
For example:
INFILE 'mydata.dat'
BADFILE 'mydata.bad'
Quotation Marks in Quoted Strings
SQL*Loader uses strings within double quotation marks and strings within single quotation marks in the control file. Each type of string can appear within the other.
INFILE 'homedir\data\"norm\mydata'
To put the escape character itself into a string, enter it twice, like this: \\
For example:
"so'\"far" or 'so\'"far' is parsed as so'"far
"'so\\far'" or '\'so\\far\'' is parsed as 'so\far'
"so\\\\far" or 'so\\\\far' is parsed as so\\far
Note: A double quote in the initial position cannot be escaped, therefore you should avoid creating strings with an initial quote.
Additional Information: To find out if your version of Oracle implements the backslash escape character for filenames, see your Oracle operating system-specific documentation.
For example, to load a file named "topdir\mydir\mydata", you must specify:
INFILE 'topdir\\mydir\\mydata'
INFILE 'topdir\mydir\myfile'
can be specified normally. Double backslashes are not needed.
Because the backslash is not recognized as an escape character, strings within single quotation marks cannot be embedded inside another string delimited by single quotation marks. This rule also holds for double quotation marks: A string within double quotation marks cannot be embedded inside another string delimited by double quotation marks.
"test\me"
then the backslash is not used as an escape character, and double backslashes are not required for file specifications.
However, if the log file shows the file processing options string as:
"testme"
then the backslash is treated as an escape character, and double backslashes are needed.
BEGINDATA
This keyword is used with the INFILE keyword, described in the next section. Case 1 contains an example.
If you omit BEGINDATA, SQL*Loader tries to interpret your data as control information, and you receive an error message. If the data is in a separate file, reaching the end of the control file signals that control information is complete, and BEGINDATA should not be used.
There should not be any spaces or other characters on the same line after the BEGINDATA clause. Otherwise, the line containing BEGINDATA is interpreted as the first line of data.
Do not put comments after BEGINDATA--they are also interpreted as data.
If the control file also contains the data to be loaded, specify a filename of "*". This specification works with the BEGINDATA keyword, described .
where:
INFILE or INDDN
filename
All filenames containing spaces or punctuation marks should be enclosed in single quotation marks. For more details, see "
Specifying Filenames and Database Objects" .
*
An asterisk (*) replaces a filename to specify that the data is in the control file. If multiple datafiles are specified, this specification must be first.
processing_options
This is the file-processing options string. It indicates datafile format. It also optimizes datafile reads. See "Specifying Datafile Format and Buffering" .
For each datafile, you can also specify a discard file and a bad file. These files should be declared after each datafile name. The following portion of a control file specifies four files:
INFILE mydat1.dat BADFILE mydat1.bad DISCARDFILE mydat1.dis
INFILE mydat2.dat
INFILE mydat3.dat DISCARDFILE mydat3.dis
INFILE mydat4.dat DISCARDMAX 10
For the first datafile (MYDAT1.DAT), both a bad file and discard file are explicitly named. So both files are created, if needed.
For the second datafile (MYDAT2.DAT), neither a bad file nor a discard file is specified. So only the bad file is created, if it is needed. If created, the bad file has a default filename and extension. The discard file is not created, even if rows are discarded.
For the third file (MYDAT3.DAT), the default bad file is created, if needed. A discard file with the given name is also created, if it is needed.
For the fourth file (MYDAT4.DAT), the default bad file is created, if needed. Because the DISCARDMAX option is used, SQL*Loader assumes that a discard file is wanted and creates it with the default name (MYDAT4.DSC), if it is needed.
Note: It is not possible to join physical records from separate datafiles into one logical record.
INFILE *
In the next example, you specify that the data is contained in a file named WHIRL with the default file extension or file type of DAT:
INFILE WHIRL
The following example specifies the full path to a file:
INFILE 'c:/topdir/subdir/datafile.dat'
Note: Filenames that include spaces or punctuation marks should be enclosed in single quotation marks. For more details on filename specification, see "Specifying Filenames and Database Objects" .
Additional Information: For details on the syntax of the file processing options string, see your Oracle operating system-specific documentation.
where RECSIZE is the size of a fixed-length record, and BUFFERS is the number of buffers to use for asynchronous I/O.
Note: This example is operating system-specific and may not work on your operating system. For details on the syntax of the file processing options string on your system, see your Oracle operating system-specific documentation.
To declare a file named MYDATA.DAT as a file that contains 80-byte records and tell SQL*Loader to use eight I/O buffers with this syntax, you would use the following clause:
INFILE 'mydata.dat' "RECSIZE 80 BUFFERS 8"
Note: This example uses the recommended convention of single quotation marks for filenames and double quotation marks for everything else. See "Specifying Filenames and Database Objects" for more details.
Additional Information: On some systems a new version of the file is created if a file with the same name already exists. See your Oracle operating system-specific documentation to find out if this is the case on your system.
To specify the name of this file, use the BADFILE or BADDN keyword, followed by the filename. If you do not specify a name for the bad file, the name defaults to the name of the datafile with an extension or file type of BAD. You can also specify the bad file from the command line with the BAD parameter described .
A filename specified on the command line is associated with the first INFILE or INDDN clause in the control file, overriding any bad file that may have been specified as part of that clause.
The bad file is created in the same record and file format as the datafile so that the data can be reloaded after corrections. The syntax is
where:
BADFILE or BADDN
Either keyword may be used.
bad_file_name
Any valid file specification, naming a file to receive rejected records.
Note: Filenames that include spaces or punctuation marks should be enclosed in single quotation marks. For more details on filename specification, see "Specifying Filenames and Database Objects" .
BADFILE UGH
In the next examples, you specify a bad file with filename BAD0001 and file extension or file type of REJ:
BADDN BAD0001.REJ
BADDN '/REJECT_DIR/BAD0001.REJ'
If a record is rejected on insert, then no part of that record is inserted into any table. For example, if data in a record is to be inserted into multiple tables, and most of the inserts succeed, but one insert fails; then all the inserts from that record are rolled back. The record is then written to the bad file, where it can be corrected and reloaded. Previous inserts from records without errors are not affected.
The log file indicates the Oracle error for each rejected record. Case 4 has an example of rejected records.
The discard file is created according to the following rules:
To create a discard file, use any of the following options:
In a Control File | On the Command Line | |
DISCARDFILE filename | DISCARD | |
DISCARDDN filename | DISCARDMAX | |
DISCARDS | ||
DISCARDMAX | ||
where:
DISCARDFILE or DISCARDDN
Either keyword may be used.
discard_file_name
Any valid filename, specifying a file to receive discard records.
Note: Filenames that include spaces or punctuation marks should be enclosed in single quotation marks. For more details on filename specification, see "Specifying Filenames and Database Objects" .
The default filename is the name of the datafile, and the default file extension or file type is DSC. A discard filename specified on the command line overrides one specified in the control file. If a discard file with that name already exists, it is either overwritten or a new version is created, depending on your operating system.
The discard file is created with the same record and file format as the datafile. So it can easily be used for subsequent loads with the existing control file, after changing the WHEN clauses or editing the data.
DISCARDFILE CIRCULAR
In this example, you specify a file extension or file type of MAY:
DISCARDDN NOTAPPL.MAY
In the next example, you specify a full path to filename FORGET.ME:
DISCARDFILE '/DISCARD_DIR/FORGET.ME'
No records are discarded if an INTO TABLE keyword is specified without a WHEN clause. An attempt is made to insert every record into such a table. So records may be rejected, but none are discarded.
Case 4 has an example of a discard file.
where n must be an integer. When the discard limit is reached, processing of that datafile terminates and continues with the next datafile, if one exists.
You can specify a different number of discards for each datafile. Alternatively, if the number of discards is only specified once, then the maximum number of discards is the same for all files.
If you specify a maximum number of discards, but no discard filename; SQL*Loader creates a discard file with the default filename and file extension or file type. Case 4 has an example.
A filename specified on the command line goes with the first INFILE or INDDN clause in the control file, overriding any bad file that may have been specified as part of that clause.
During a direct path load, data converts directly into the database character set. As a consequence, the direct path load method allows data in a character set that is not supported by your terminal to be loaded.
When data conversion occurs, it is essential that the target character set contains a representation of all characters that exist in the data. Otherwise, characters that have no equivalent in the target character set are converted to a default character, with consequent loss of data. When using the direct path, load method the database character set should be a superset of, or equivalent to, the datafile character sets. Similarly, when using the conventional path, the session character set should be a superset of, or equivalent to, the datafile character sets.
The character set used in each input file is specified with the CHARACTERSET keyword.
Using the CHARACTERSET keyword causes character data to be automatically converted when it is loaded into Oracle. Only CHAR, DATE, and numeric EXTERNAL fields are affected. If the CHARACTERSET keyword is not specified, then no conversion occurs.
The syntax for this option is:
CHARACTERSET character_set_spec
where character_set_spec is the acronym used by Oracle to refer to your particular encoding scheme.
Additional Information: For more information on supported character sets, code pages, and the NLS_LANG parameter, see the National Language Support section of the Oracle7 Server Reference.
Any data included after the BEGINDATA statement is also assumed to be in the character set specified for your session by the NLS_LANG parameter. Data that uses a different character set must be in a separate file.
This section describes those methods.
INSERT
Returns an error and terminates the load if the table contains data. This option only loads data into empty tables. INSERT is the default.
Extends the table by adding new rows.
Deletes the rows in the table and loads the new data in its place.
TRUNCATEs the table and loads the new data in place of the old.
Warning: When the REPLACE or TRUNCATE keyword is specified, the entire table is replaced, not individual rows. After the rows are successfully deleted, a commit is issued. You cannot recover the data that was in the table before the load, unless it was saved with Export or a comparable utility.
The remainder of this section provides additional detail on these options.
The row deletes cause any delete triggers defined on the table to fire. If DELETE CASCADE has been specified for the table, then the cascaded deletes are carried out, as well. For more information on cascaded deletes, see the "Data Integrity" chapter of the Oracle7 Server Concepts manual.
Once the integrity constraints have been disabled, DELETE CASCADE is no longer defined for the table. If the DELETE CASCADE functionality is needed, then the contents of the table must be manually deleted before the load begins.
The table must be in your schema, or you must have the DELETE ANY TABLE privilege.
If the direct path load method is used, any indexes that run out of space are left in direct load state. They must be dropped before the load can continue. Other indexes are valid provided no other errors occurred. (See "Indexes Left in Direct Load State" for other reasons why an index might be left in direct load state.)
SQLLDR USERID=scott/tiger CONTROL=FAST1.CTL DIRECT=TRUE SKIP=345
To continue a discontinued direct path load involving multiple tables, inspect the SQL*Loader log file to find out how many records were loaded into each table. If the numbers are the same, you can use the previously described simple continuation.
LOAD DATA...
at the start of the control file, specify:
...
INTO TABLE emp
SKIP 2345
...
INTO TABLE dept
SKIP 514
...
If you specify CONTINUE_LOAD, you cannot use the command-line parameter SKIP. You must use the table-level SKIP clause. If you specify LOAD, you can optionally use the command-line parameter SKIP, but you cannot use the table-level SKIP clause.
CONCATENATE
CONTINUEIF
CONCATENATE is appropriate in the simplest case, when SQL*Loader should always add the same number of physical records to form one logical record. The syntax is:
CONCATENATE n
where n indicates the number of physical records to combine.
If the number of physical records to be continued varies, then CONTINUEIF must be used. The keyword CONTINUEIF is followed by a condition that is evaluated for each physical record, as it is read. For example, two records might be combined if there were a pound sign (#) in character position 80 of the first record. If any other character were there, the second record would not be added to the first. The full syntax for CONTINUEIF adds even more flexibility:
where:
THIS
If the condition is true in this record, then the next physical record is read and concatenated to the current physical record, continuing until the condition is false. If the condition is false in the current record, then the current physical record is the last physical record of the current logical record. THIS is the default.
NEXT
If the condition is true in the next record, then the next physical record is concatenated to the current record, continuing until the condition is false.
If the condition is false in the next record, then the current physical record is the last physical record of the current logical record.
pos_spec
Indicates starting and ending column numbers in the physical record, as shown below:
Column numbers start with 1. Either a hyphen or a colon is acceptable (start-end or start:end).
If you omit end, the length of the continuation field is the length of the byte string or character string. If you use end, and the length of the resulting continuation field is not the same as that of the byte string or character string, the shorter one is padded. Character strings are padded with blanks, hexadecimal strings with zeros.
LAST
This test is similar to THIS, but the test is always against the last non-blank character. If the last non-blank character in this physical record meets the test, then the next physical record is read and concatenated to the current physical record, continuing until the condition is false. If the condition is false in the current record, then the current physical record is the last physical record of the current logical record.
operator
The supported operators are equal and not equal:
= != ¬= <>
For the equal operator, the field and comparison string must match exactly for the condition to be true. For the not equal operator, they may differ in any character.
char_string
A string of characters to be compared to the continuation field defined by start and end, according to the operator. The string must be enclosed in double or single quotation marks. The comparison is made character by character, blank padding on the right if necessary.
X'hex_string'
A string of bytes in hexadecimal format, used in the same way as the character string above. X'1FB033' would represent the three bytes with values 1F, B0 and 33 (hex).
Note: The positions in the CONTINUEIF clause refer to positions in each physical record. This is the only time you refer to character positions in physical records. All other references are to logical records.
For CONTINUEIF THIS and CONTINUEIF NEXT, the continuation field is removed from all physical records before the logical record is assembled. This allows data values to span the records with no extra characters (continuation characters) in the middle. Two examples showing CONTINUEIF THIS and CONTINUEIF NEXT follow:
CONTINUEIF THIS CONTINUEIF NEXT
(1:2) = '%%' (1:2) ='%%'
Assume physical data records 12 characters long and that a period means a space:
%%aaaaaaaa.... ..aaaaaaaa....
%%bbbbbbbb.... %%bbbbbbbb....
..cccccccc.... %%cccccccc.... %%dddddddddd.. ..dddddddddd..
%%eeeeeeeeee.. %%eeeeeeeeee..
..ffffffffff.. %%ffffffffff..
The logical records would be the same in each case:
aaaaaaaa....bbbbbbbb....cccccccc....
dddddddddd..eeeeeeeeee..ffffffffff..
Notes:
If record2 does not have an asterisk in column 1, then it is still appended to record1, but record3 begins a new logical record.
CONTINUEIF THIS (1) = "*"
In the next example, you specify that if the current physical record (record1) has a comma in the last non-blank data column. Then the next physical record (record2) should be appended to it. If a record does not have a comma in the last column, it is the last physical record of the current logical record.
CONTINUEIF LAST = ","
In the last example, you specify that if the next physical record (record2) has a "10" in columns 7 and 8. Then it should be appended to the preceding physical record (record1). If a record does not have a "10" in columns 7 and 8, then it begins a new logical record.
CONTINUEIF NEXT (7:8) = '10'
Case 4 shows the CONTINUEIF clause in use.
To begin an INTO TABLE clause, use the keywords INTO TABLE, followed by the name of the Oracle table that is to receive the data.
The table must already exist. The table name should be enclosed in double quotation marks if it is the same as any SQL or SQL*Loader keyword, if it contains any special characters, or if it is case sensitive.
INTO TABLE SCOTT."COMMENT"
INTO TABLE SCOTT."comment"
INTO TABLE SCOTT."-COMMENT"
The user running SQL*Loader should have INSERT privileges on the table. Otherwise, the table name should be prefixed by the username of the owner as follows:
INTO TABLE SOPHIA.EMP
The WHEN clause appears after the table name and is followed by one or more field conditions.
For example, the following clause indicates that any record with the value "q" in the fifth column position should be loaded:
WHEN (5) = 'q'
A WHEN clause can contain several comparisons provided each is preceded by AND. Parentheses are optional, but should be used for clarity with multiple comparisons joined by AND. For example
WHEN (DEPTNO = '10') AND (JOB = 'SALES')
To evaluate the WHEN clause, SQL*Loader first determines the values of all the fields in the record. Then the WHEN clause is evaluated. A row is inserted into the table only if the WHEN clause is true.
Field conditions are discussed in detail . Case 5
shows the WHEN clause in use.
You can override the delimiter for any given column by specifying it after the column name. Case 3 contains an example. See "Specifying Delimiters"
for more information on delimiter specification.
If the control file definition explicitly states that a field's starting position is beyond the end of the logical record, then SQL*Loader always defines the field as null. If a field is defined with a relative position (such as DNAME and LOC in the example below), and the record ends before the field is found; then SQL*Loader could either treat the field as null or generate an error. SQL*Loader uses the presence or absence of the TRAILING NULLCOLS clause to determine the course of action.
For example, if the following data
10 Accounting
is read with the following control file
INTO TABLE dept
TRAILING NULLCOLS ( deptno CHAR TERMINATED BY " ",
dname CHAR TERMINATED BY WHITESPACE,
loc CHAR TERMINATED BY WHITESPACE
)
and the record ends after DNAME. The remaining LOC field is set to null. Without the TRAILING NULLCOLS clause, an error would be generated due to missing data.
Case 7 provides an example of using TRAILING NULLCOLS.
By default, SQL*Loader does not use SINGLEROW when APPENDing rows to a table. Instead, index entries are put into a separate, temporary storage area and merged with the original index at the end of the load. This method achieves better performance and produces an optimal index, but it requires extra storage space. During the merge, the original index, the new index, and the space for new entries all simultaneously occupy storage space.
With the SINGLEROW option, storage space is not required for new index entries or for a new index. The resulting index may not be as optimal as a freshly sorted one, but it takes less space to produce. It also takes more time, since additional UNDO information is generated for each index insert. This option is suggested for use when:
A field condition is similar to the condition in the CONTINUEIF clause, with two important differences. First, positions in the field condition refer to the logical record, not to the physical record. Second, you may specify either a position in the logical record or the name of a field that is being loaded.
where:
start
Specifies the starting position of the comparison field in the logical record.
end
Specifies the ending position of the comparison field in the logical record. Either start:end or start-end is acceptable. If you omit end, the length of the field is determined by the length of the comparison string. If the lengths are different, the shorter field is padded: character strings are padded with blanks, hexadecimal strings are padded with zeroes.
column_name
The name of a column in the database table. If column_name is used instead of start:end, then the specification for that column defines the comparison field. Column_name must match exactly the name of the column in the table's database definition. Use quotation marks around the column name if it is a SQL or SQL*Loader keyword, contains special characters, or is case sensitive. For more information, see "Specifying Filenames and Database Objects" .
operator
A comparison operator for either equal or not equal:
= != ¬= <>
'char string'
A string of characters enclosed within single or double quotation marks that is compared to the comparison field. If the comparison is true, then this row is inserted into the table.
X'hex string'
A byte string in hexadecimal format used in the same way as the character string above.
BLANKS
A keyword denoting an arbitrary number of blanks, described next.
For example, use the following clause to load a blank field as null:
column_name ... NULLIF column_name=BLANKS
The BLANKS keyword only recognizes blanks, not tabs. It can be used in place of a literal string in any field comparison. The condition is TRUE whenever the column is entirely blank.
The BLANKS keyword also works for fixed-length fields. Using it is the same as specifying an appropriately-sized literal string of blanks. For example, the following specifications are equivalent:
fixed_field CHAR(2) NULLIF (fixed_field)=BLANKS
fixed_field CHAR(2) NULLIF (fixed_field)=" "
Note: There can be more than one "blank" in a multi-byte character set. It is a good idea to use the BLANKS keyword with these character sets instead of specifying a string of blank characters. The character string will match only a specific sequence of blank characters, while the BLANKS keyword will match combinations of different blank characters. For more information on multi-byte character sets, see page 5 - 24.
NULLIF (1:4)="_"
compares the data in position 1:4 with 4 blanks. If position 1:4 contains 4 blanks, then the clause evaluates as true.
Hexadecimal strings are padded with hexadecimal zeroes. The clause
NULLIF (1:4)=X'FF'
compares position 1:4 to hex 'FF000000'.
A column specification is the name of the column, followed by a specification for the value to be put in that column. The list of columns is enclosed by parentheses and separated with commas as follows:
( columnspec, columnspec, ... )
If the value is to be generated by SQL*Loader, the specification includes the keyword RECNUM, the SEQUENCE function, or the keyword CONSTANT. See "Generating Data" .
If the column's value is read from the datafile, the data field that contains the column's value is specified. In this case, the column specification includes a column name that identifies a column in the database table, and a field specification that describes a field in a data record. The field specification includes position, datatype, null restrictions, and defaults.
You may only specify one datatype for each field. If you omit the datatype, a type of CHAR is assumed.
"Specifying Datatypes" describes how SQL*Loader datatypes are converted into the Oracle datatypes and gives detailed information on each of SQL*Loader's datatypes.
Before specifying the datatype, however, the field's position must be specified. That is the subject of the next section.
where:
start
The starting column of the data field in the logical record. The first character position in a logical record is 1.
end
Indicates the ending position of the data field in the logical record. Either start:end or start-end is acceptable. If end is omitted, the length of the field is derived from the datatype in the datafile. (See the sections on each datatype. Note that CHAR data specified without a start and end is assumed to be length 1.) If it is impossible to derive a length from the datatype, an error message results.
*
Indicates that the data field follows immediately after the previous field. If * is used for the first data field in the control file, the field is assumed to be at the beginning of the logical record. When * is used for position, the length of the field is derived from the datatype.
+n
An offset, specified as +n, may be used with * to offset this field from the previous one. n characters are skipped before reading the value for this field.
You may omit POSITION entirely. If you do, the position specification for the data field is the same as if POSITION(*) had been used.
For example
ENAME POSITION (1:20) CHAR
EMPNO POSITION (22-26) INTEGER EXTERNAL
ALLOW POSITION (*+2) INTEGER EXTERNAL TERMINATED BY "/"
Column ENAME is character data in positions 1 through 20, followed by column EMPNO, which is presumably numeric data in columns 22 through 26. Column ALLOW is offset from the end of EMPNO by +2. So it starts in column 28 and continues until a slash is encountered.
To fix the problem, inspect the datafile for tabs and adjust the POSITION specifications, or else use delimited fields.
The use of delimiters to specify relative positioning of fields is discussed in detail beginning . Especially note how the delimiter WHITESPACE can be used.
Thus, when a subsequent INTO TABLE clause begins, the position is not set to the beginning of the logical record automatically. This allows multiple INTO TABLE clauses to process different parts of the same physical record. For an example, see the second example in the section "Extracting Multiple Logical Records" beginning .
A logical record may contain data for one of two tables, but not both. In this case, you would reset POSITION. Instead of omitting the position specification or using POSITION(*+n) for the first field in the INTO TABLE clause, use POSITION(1) or POSITION(n).
Some examples follow:
SITEID POSITION (*) SMALLINT
SITELOC POSITION (*) INTEGER
If these were the first two column specifications, SITEID would begin in column1, and SITELOC would begin in the column immediately following.
ENAME POSITION (1:20) CHAR
EMPNO POSITION (22-26) INTEGER EXTERNAL
ALLOW POSITION (*+2) INTEGER EXTERNAL TERMINATED BY "/"
Column ENAME is character data in positions 1 through 20, followed by column EMPNO which is presumably numeric data in columns 22 through 26. Column ALLOW is offset from the end of EMPNO by +2, so it starts in column 28 and continues until a slash is encountered.
In the first case, it is common for the INTO TABLE statements to refer to the same table. This section illustrates the different ways to use multiple INTO TABLE statements and shows you how to use the POSITION keyword.
Note: A key point when using multiple INTO TABLE statements is that field scanning continues from where it left off when a new INTO TABLE statement is processed. The remainder of this section details important ways to make use of that behavior. It also describes alternative ways using fixed field locations or the POSITION keyword.
In this example, SQL*Loader treats a single physical record in the input file as two logical records and uses two INTO TABLE clauses to load the data into the EMP table. For example, if the data looks like
1119 Smith 1120 Snyder
1121 Spellini 1130 Thompson
then the following control file extracts the logical records:
INTO TABLE emp
(empno POSITION(1:4) INTEGER EXTERNAL,
ename POSITION(6:15) CHAR)
INTO TABLE emp
(empno POSITION(17:20) INTEGER EXTERNAL,
ename POSITION(21:30) CHAR)
INTO TABLE emp
(empno INTEGER EXTERNAL TERMINATED BY " ",
ename CHAR TERMINATED BY WHITESPACE)
INTO TABLE emp
(empno INTEGER EXTERNAL TERMINATED BY " ",
ename CHAR) TERMINATED BY WHITESPACE)
The important point in this example is that the second EMPNO field is found immediately after the first ENAME, although it is in a separate INTO TABLE clause. Field scanning does not start over from the beginning of the record for a new INTO TABLE clause. Instead, scanning continues where it left off.
To force record scanning to start in a specific location, you use the POSITION keyword. That mechanism is described next.
1 50 Manufacturing -- DEPT record
2 1119 Smith 50 -- EMP record
2 1120 Snyder 50
1 60 Shipping
2 1121 Stevens 60
A record ID field distinguishes between the two formats. Department records have a "1" in the first column, while employee records have a "2". The following control file uses exact positioning to load this data:
INTO TABLE dept
WHEN recid = 1
(recid POSITION(1:1) INTEGER EXTERNAL,
deptno POSITION(3:4) INTEGER EXTERNAL,
ename POSITION(8:21) CHAR)
INTO TABLE emp
WHEN recid <> 1
(recid POSITION(1:1) INTEGER EXTERNAL,
empno POSITION(3:6) INTEGER EXTERNAL,
ename POSITION(8:17) CHAR,
deptno POSITION(19:20) INTEGER EXTERNAL)
INTO TABLE dept
WHEN recid = 1
(recid INTEGER EXTERNAL TERMINATED BY WHITESPACE,
deptno INTEGER EXTERNAL TERMINATED BY WHITESPACE,
dname CHAR TERMINATED BY WHITESPACE)
INTO TABLE emp
WHEN recid <> 1
(recid POSITION(1) INTEGER EXTERNAL TERMINATED BY ' ',
empno INTEGER EXTERNAL TERMINATED BY ' '
ename CHAR TERMINATED BY WHITESPACE,
deptno INTEGER EXTERNAL TERMINATED BY ' ')
The POSITION keyword in the second INTO TABLE clause is necessary to load this data correctly. This keyword causes field scanning to start over at column 1 when checking for data that matches the second format. Without it, SQL*Loader would look for the RECID field after DNAME.
For delimited data, proper use of the POSITION keyword is essential for achieving the expected results.
When the POSITION keyword is not used, multiple INTO TABLE clauses process different parts of the same (delimited data) input record, allowing multiple tables to be loaded from one record. When the POSITION keyword is used, multiple INTO TABLE clauses can process the same record in different ways, allowing multiple formats to be recognized in one input file.
SQL*Loader inserts as many rows as are specified by the LOAD keyword. The LOAD keyword is required in this situation. The SKIP keyword is not permitted.
SQL*Loader is optimized for this case. Whenever SQL*Loader detects that only generated specifications are used, it ignores any specified datafile -- no read I/O is performed.
In addition, no memory is required for a bind array. If there are any WHEN clauses in the control file, SQL*Loader assumes that data evaluation is necessary, and input records are read.
CONSTANT value
CONSTANT data is interpreted by SQL*Loader as character input. It is converted, as necessary, to the database column type.
You may enclose the value within quotation marks, and must do so if it contains white space or reserved words. Be sure to specify a legal value for the target column. If the value is bad, every row is rejected.
Numeric values larger than 2**32 - 1 (4,294,967,295) must be enclosed in quotes.
Note: Do not use the CONSTANT keyword to set a column to null. To set a column to null, do not specify that column at all. Oracle automatically sets that column to null when loading the row. The combination of CONSTANT and a value is a complete column specification.
column_name RECNUM
column_name SYSDATE
The database column must be of type CHAR or DATE. If the column is of type CHAR, then the date is loaded in the form 'dd-mon-yy.' After the load, it can be accessed only in that form. If the system date is loaded into a DATE column, then it can be accessed in a variety of forms that include the time and the date.
A new system date/time is used for each array of records inserted in a conventional path load and for each block of records loaded during a direct path load.
SEQUENCE takes two optional arguments. The first argument is the starting value. The second is the increment. If the start point is a positive integer n, the first row inserted has a value of n for that column. The values of successive rows are increased by the increment. However, both the starting value and the increment default to 1.
where:
SEQUENCE
Use the SEQUENCE keyword to specify the value for a column.
n
The sequence starts with the integer value n. The value must be positive or zero. Default value is 1.
COUNT
The sequence starts with the number of rows already in the table, plus the increment.
MAX
The sequence starts with the current maximum value for the column, plus the increment.
increment
The sequence is incremented by this amount for each successive row. The default increment is 1. The increment must be positive.
If a row is rejected (that is, it has a format error or causes an Oracle error), the generated sequence numbers are not reshuffled to mask this. If four rows are assigned sequence numbers 10, 12, 14, and 16 in a particular column, and the row with 12 is rejected; the three rows inserted are numbered 10, 14, and 16, not 10, 12, 14. This allows the sequence of inserts to be preserved despite data errors. When you correct the rejected data and reinsert it, you can manually set the columns to agree with the sequence.
Case 3 provides an example of using SEQUENCE.
Sometimes, you might want to generate different sequence numbers for each INTO TABLE clause. For example, your data format might define three logical records in every input record. In that case, you can use three INTO TABLE clauses, each of which inserts a different part of the record into the same table.
To generate sequence numbers for these records, you must generate unique numbers for each of the three inserts. There is a simple technique to do so. Use the number of table-inserts per record as the sequence increment and start the sequence numbers for each insert with successive numbers.
Accounting Personnel Manufacturing
Shipping Purchasing Maintenance
...
You could use the following control file to generate unique department numbers:
INTO TABLE dept
(deptno sequence(1, 3),
dname position(1:14) char)
INTO TABLE dept
(deptno sequence(2, 3),
dname position(16:29) char)
INTO TABLE dept
(deptno sequence(3, 3),
dname position(31:44) char)
The first INTO TABLE clause generates department number 1, the second number 2, and the third number 3. They all use 3 as the sequence increment (the number of department names in each record). This control file loads Accounting as department number 1, Personnel as 2, and Manufacturing as 3. The sequence numbers are then incremented for the next record, so Shipping loads as 4, Purchasing as 5, and so on.
SQL*Loader extracts data from a field in the input file, guided by the datatype specification in the control file. SQL*Loader then sends the field to the server to be stored in the appropriate column (as part of an array of row inserts). The server does any necessary data conversion to store the data in the proper internal format. The "Data Conversion and Datatype Specification" section beginning contains diagrams that illustrate these points.
The datatype of the data in the file does not necessarily have to be the same as the datatype of the column in the Oracle table. Oracle automatically performs conversions, but you need to ensure that the conversion makes sense and does not generate errors. For instance, when a datafile field with datatype CHAR is loaded into a database column with datatype NUMBER, you must make sure that the contents of the character field represent a valid number.
Note: SQL*Loader does not contain datatype specifications for Oracle internal datatypes such as NUMBER or VARCHAR2. SQL*Loader's datatypes describe data that can be produced with text editors (character datatypes) and with standard programming languages (native datatypes). However, although SQL*Loader does not recognize datatypes like NUMBER and VARCHAR2, any data that Oracle is capable of converting may be loaded into these or other database columns.
INTEGER ZONED SMALLINT VARCHAR FLOAT GRAPHIC DOUBLE GRAPHIC EXTERNAL BYTEINT VARGRAPHIC (packed) DECIMAL RAW MLSLABEL (Trusted Oracle)
Since these datatypes contain binary data, most of them do not readily transport across operating systems. (See "Loading Data Across Different Operating Systems" .) RAW data and GRAPHIC data is the exceptions. SQL*Loader does not attempt to interpret these datatypes, but simply stores them "as is".
Additional Information: Native datatypes cannot be specified with delimiters. The size of the native datatypes INTEGER, SMALLINT, FLOAT, and DOUBLE are determined by the host operating system. Their size is fixed -- it cannot be overridden in the control file. (Refer to your Oracle operating system-specific documentation for more information.) The sizes of the other native datatypes may be specified in the control file.
INTEGER
SMALLINT
Additional Information: This is the SHORT INT datatype in the C programming language. One way to determine its length is to make a small control file with no data and look at the resulting log file. This length cannot be overridden in the control file. See your Oracle operating system-specific documentation for details.
FLOAT
DOUBLE
The syntax for this datatype is
BYTEINT
An example is
(column1 position(1) BYTEINT,
column2 BYTEINT,
...
)
The syntax for this datatype is:
where precision is the number of digits in the number, and scale (if given) is the number of digits to the right of the (implied) decimal point. For example:
sal POSITION(32) ZONED(8),
specifies an 8-digit integer starting at position 32.
The syntax for the this datatype is:
where:
precision
Is the number of digits in the value. The character length of the field, as computed from digits, is (digits + 2) / 2, rounded down.
scale
Is the scaling factor, or number of digits to the right of the decimal point. Default is zero (indicating an integer). May be greater than the number of digits but may not be negative.
For example,
sal DECIMAL (7,2)
would load a number equivalent to +12345.67. In the data record, this field would take up 4 bytes, as shown in Figure 5 - 1. (The byte length of a DECIMAL field is equivalent to (N+1)/2, rounded up, where N is the number of digits in the value, and one is added for the sign.)
Figure 5 - 1. Packed Decimal Representation of +12345.67
The syntax for this datatype is
The length of this field is the number of bytes specified in the control file. This length is limited only by the length of the target column in the database and by memory resources.
The syntax for this datatype is
For both GRAPHIC and GRAPHIC EXTERNAL, if you specify POSITION(start:end) you give the exact location of the field in the logical record. If you specify the length after the GRAPHIC (EXTERNAL) keyword, however, then you give the number of double-byte graphic characters. That value is multiplied by 2 to find the length of the field in bytes. If the number of graphic characters is specified, then any length derived from POSITION is ignored.
The syntax for this datatype is:
where:
GRAPHIC
Data is double-byte characters.
EXTERNAL
First and last characters are ignored.
graphic_char_length
Length in DBCS (see GRAPHIC, above)
For example, let [ ] represent shift-in and shift-out characters, and let # represent any double-byte character.
To describe ####, use "POSITION(1:4) GRAPHIC" or "POSITION(1) GRAPHIC(2)".
To describe [####], use "POSITION(1:6) GRAPHIC EXTERNAL" or "POSITION(1) GRAPHIC EXTERNAL(2)".
Additional Information: The size of the length subfield is the size of the SQL*Loader SMALLINT datatype on your system (C type SHORT INT). See SMALLINT for more information.
The length of the current field is given in the first two bytes. This length is a count of graphic (double-byte) characters. So it is multiplied by two to determine the number of bytes to read.
The syntax for this datatype is
A maximum length specified after the VARGRAPHIC keyword does not include the size of the length subfield. The maximum length specifies the number of graphic (double byte) characters. So it is also multiplied by two to determine the maximum length of the field in bytes.
The default maximum field length is 4K graphic characters, or 8 Kb
(2 * 4K). It is a good idea to specify a maximum length for such fields whenever possible, to minimize memory requirements. See "Determining the Size of the Bind Array" for more details.
The POSITION clause, if used, gives the location of the length subfield, not of the first graphic character. If you specify POSITION(start:end), the end location determines a maximum length for the field. Both start and end identify single-character (byte) positions in the file. Start is subtracted from (end + 1) to give the length of the field in bytes. If a maximum length is specified, it overrides any maximum length calculated from POSITION.
If a VARGRAPHIC field is truncated by the end of the logical record before its full length is read, a warning is issued. Because a VARCHAR field's length is embedded in every occurrence of the input data for that field, it is assumed to be accurate.
VARGRAPHIC data cannot be delimited.
Additional Information: The size of the length subfield is the size of the SQL*Loader SMALLINT datatype on your system (C type SHORT INT). See SMALLINT for more information.
The syntax for this datatype is:
A maximum length specified in the control file does not include the size of the length subfield. If you specify the optional maximum length after the VARCHAR keyword, then a buffer of that size is allocated for these fields.
The default buffer size is 4 Kb. Specifying the smallest maximum length that is needed to load your data can minimize SQL*Loader's memory requirements, especially if you have many VARCHAR fields. See "Determining the Size of the Bind Array" for more details.
The POSITION clause, if used, gives the location of the length subfield, not of the first text character. If you specify POSITION(start:end), the end location determines a maximum length for the field. Start is subtracted from (end + 1) to give the length of the field in bytes. If a maximum length is specified, it overrides any length calculated from POSITION.
If a VARCHAR field is truncated by the end of the logical record before its full length is read, a warning is issued. Because a VARCHAR field's length is embedded in every occurrence of the input data for that field, it is assumed to be accurate.
VARCHAR data cannot be delimited.
column1 POSITION(1:6) INTEGER
then a warning is issued, and the proper length (4) is used. In this case, the log file shows the actual length used under the heading "Len" in the column table:
Column Name Position Len Term Encl Datatype
----------------------- --------- ----- ---- ---- ---------
COLUMN1 1:6 4 INTEGER
Additional Information: In addition, the MLSLABEL character datatype exists in Trusted Oracle. See the Trusted Oracle7 Server Administrator's Guide for more information on this datatype.
A field of datatype CHAR may also be variable-length delimited or enclosed. See "Specifying Delimiters" .
Attention: If the column in the database table is defined as LONG, you must explicitly specify a maximum length (maximum for a LONG is 2 gigabytes) either with a length specifier on the CHAR keyword or with the POSITION keyword. This guarantees that a large enough buffer is allocated for the value and is necessary even if the data is delimited or enclosed.
For example:
LOAD DATA
INTO TABLE DATES (COL_A POSITION (1:15) DATE "DD-Mon-YYYY")
BEGINDATA
1-Jan-1991
1-Apr-1991 28-Feb-1991
Attention: Whitespace is ignored and dates are parsed from left to right unless delimiters are present.
The length specification is optional, unless a varying-length date mask is specified. In the example above, the date mask specifies a fixed-length date format of 11 characters. SQL*Loader counts 11 characters in the mask, and therefore expects a maximum of 11 characters in the field, so the specification works properly. But, with a specification such as
DATE "Month dd, YYYY"
the date mask is 14 characters, while the maximum length of a field such as
September 30, 1991
is 18 characters. In this case, a length must be specified. Similarly, a length is required for any Julian dates (date mask "J")--a field length is required any time the length of the date string could exceed the length of the mask (that is, the count of characters in the mask).
If an explicit length is not specified, it can be derived from the POSITION clause. It is a good idea to specify the length whenever you use a mask, unless you are absolutely sure that the length of the data is less than, or equal to, the length of the mask.
An explicit length specification, if present, overrides the length in the POSITION clause. Either of these overrides the length derived from the mask. The mask may be any valid Oracle date mask. If you omit the mask, the default Oracle date mask of "dd-mon-yy" is used. The length must be enclosed in parentheses and the mask in quotation marks. Case 3 has an example of the DATE datatype.
A field of datatype DATE may also be specified with delimiters. For more information, see "Specifying Delimiters" .
A date field that consists entirely of whitespace produces an error unless NULLIF BLANKS is specified. For more information, see "Loading All-Blank Fields" .
The data is a number in character form (not binary representation). As such, these datatypes are identical to CHAR and are treated identically, with one exception: the use of DEFAULTIF. If you want the default to be null, use CHAR. If you want it to be zero, use EXTERNAL.
Numeric EXTERNAL may be specified with lengths and delimiters, just like CHAR data. Length is optional, but if specified, overrides POSITION.
The syntax for this datatype is:
Delimited data can be TERMINATED or ENCLOSED.
If two delimiter characters are encountered next to each other, a single occurrence of the delimiter character is used in the data value. For example, 'DON''T' is stored as DON'T. However, if the field consists of just two delimiter characters, its value is null. You may specify a TERMINATED BY clause, an ENCLOSED BY clause, or both. If both are used, the TERMINATED BY clause must come first.
The syntax for delimiter specifications is:
where:
TERMINATED
Data is read until first occurrence of a delimiter.
BY
This is an optional keyword for readability.
WHITESPACE
Delimiter is any whitespace character, including linefeed, formfeed, or carriage return. (Only used with TERMINATED, not with ENCLOSED.)
OPTIONALLY
Data may be enclosed by the indicated character. If SQL*Loader finds a first occurrence of the character, it reads the data value until it finds the second occurrence. If the data is not enclosed, the data is read as a terminated field. If optional enclosure is specified, there must be a TERMINATED BY clause--either locally in the field definition, or globally, in the FIELDS clause.
ENCLOSED
The data value is found between two delimiters.
char
Delimiter is the single character char.
X'hex-byte'
Delimiter is the single character that has the value specified by hex-byte in the character encoding scheme, such as X`1F' (equivalent to 31 decimal). "X" must be uppercase.
Note: Due to National Language Support (NLS) requirements, hex 00 cannot be used as a separator. All other hex values are supported.
AND
This keyword specifies a trailing enclosure delimiter, which may be different from the initial enclosure delimiter. If the AND clause is not present, then the initial and trailing enclosure delimiters are the same.
Here are some examples, with samples of the data they describe:
TERMINATED BY ',' a data string,
ENCLOSED BY '"' "a data string"
TERMINATED BY ',' ENCLOSED BY '"' "a data string",
ENCLOSED BY "(" AND ')' (a data string)
(The delimiters are left paren's, ((, and right paren's, )).)
with this field specification:
ENCLOSED BY "(" AND ")"
puts the following string into the database:
The delimiters are left paren's, (, and right paren's, ).
For this reason, problems can arise when adjacent fields use the same delimiters. For example, the following specification:
field1 TERMINATED BY "/"
field2 ENCLOSED by "/"
the following data will be interpreted properly:
This is the first string/ /This is the second string/
But if field1 and field2 were adjacent, then the results would be incorrect, because
This is the first string//This is the second string/
would be interpreted as a single character string with a "/" in the middle of it, and that string would belong to field1.
For example, if
position(1:10) char(15)
is specified, then the length of the field is 15.
If the expected delimiter is absent and no maximum length has been specified, then the end of record terminates the field. If TRAILING NULLCOLS is specified, remaining fields are null. If either the delimiter or the end of record produce a field that is longer than the specified maximum, SQL*Loader generates an error.
"Month dd, yyyy"
then "May 3, 1991" would occupy 11 character positions in the record, while "January 31, 1992" would occupy 16.
If starting and ending positions are specified, however, then the length calculated from the position specification overrides a length derived from the mask. A specified length such as "DATE (12)" overrides either of those. If the date field is also specified with terminating or enclosing delimiters, then the length specified in the control file is interpreted as a maximum length for the field.
In general, the problems of inter-operating system loads occur with the native datatypes. Sometimes, it is possible to get around them by padding a field with zeros to lengthen it, or reading only part of the field to shorten it. (For example, when an 8-byte integer is to be read on a system that uses 6-byte integers, or vice versa.) Frequently, however, problems of incompatible byte-ordering, or incompatible implementations of the datatypes, make even this approach unworkable.
Without a SQL*Net link, it is a good idea to use only the CHAR, DATE, and NUMERIC EXTERNAL datatypes. Datafiles written in this manner are longer than those written with native datatypes. They take more time to load, but they transport most readily across operating systems. However, where incompatible byte-ordering is an issue, special filters may still be required to reorder the data.
SQL*Loader uses the SQL array-interface option to transfer data to the RDBMS. Multiple rows are read at one time and stored in the bind array. When SQL*Loader sends Oracle an INSERT command, the entire array is inserted at one time. After the rows in the bind array are inserted, a COMMIT is issued.
Although the entire bind array need not be in contiguous memory, the buffer for each field in the bind array must occupy contiguous memory. If the operating system cannot supply enough contiguous memory to store a field, SQL*Loader generates an error.
In general, any reasonably large size will permit SQL*Loader to operate effectively. It is not usually necessary to perform the detailed calculations described in this section. This section should be read when maximum performance is desired, or when an explanation of memory usage is needed.
As part of its initialization, SQL*Loader determines the space required to load a single row. If that size is too large to fit within the specified maximum, the load terminates with an error.
SQL*Loader then multiplies that size by the number of rows for the load, whether that value was specified with the command-line parameter ROWS (see page 6 - 3) or the OPTIONS clause in the control file (see page 5 - 12). If that size fits within the bind array maximum, the load continues--SQL*Loader does not try to expand the number of rows to reach the maximum bind array size. That is, if the number of rows and the maximum bind array size are both specified, SQL*Loader always uses the smaller value for the bind array.
If the maximum bind array size is too small to accommodate the initial number of rows, SQL*Loader uses a smaller number of rows that fits within the maximum.
bind array size = (number of rows) * (maximum row length)
where:
(maximum row length) = SUM(fixed field lengths) +
SUM(maximum varying field lengths) +
SUM(overhead for varying length fields)
Many fields do not vary in size. These fixed-length fields are the same for each loaded row. For those fields, the maximum length of the field is the field size, in bytes, as described in "Specifying Datatypes" . There is no overhead for these fields.
The fields that can vary in size from row to row are
VARCHAR VARGRAPHIC
CHAR DATE
numeric EXTERNAL
The maximum length of these datatypes is described in "Specifying Datatypes" . The maximum lengths describe the number of bytes, or character positions, that the fields can occupy in the input data record. That length also describes the amount of storage that each field occupies in the bind array, but the bind array includes additional overhead for fields that can vary in size.
When the character datatypes (CHAR, DATE, and numeric EXTERNAL) are specified with delimiters, any lengths specified for these fields are maximum lengths. When specified without delimiters, the size in the record is fixed, but the size of the inserted field may still vary, due to whitespace trimming. So internally, these datatypes are always treated as varying-length fields--even when they are fixed-length fields.
A length indicator is included for each of these fields in the bind array. The space reserved for the field in the bind array is large enough to hold the longest possible value of the field. The length indicator gives the actual length of the field for each row.
In summary:
bind array size =
(number of rows) * ( SUM(fixed field lengths)
+ SUM(maximum varying field lengths)
+ ( (number of varying length fields)
* (size of length-indicator) )
)
OPTIONS (ROWS=1)
LOAD DATA
INFILE *
APPEND
INTO TABLE DEPT
(deptno POSITION(1:1) CHAR)
BEGINDATA
a
This control file "loads" a one-character field using a one-row bind array. No data is actually loaded, due to the numeric conversion error that occurs when "a" is loaded as a number. The bind array size shown in the log file, minus one (the length of the character field) is the value of the length indicator.
Note: A similar technique can determine bind array size without doing any calculations. Run your control file without any data and with ROWS=1 to determine the memory requirements for a single row of data. Multiply by the number of rows you want in the bind array to get the bind array size.
Datatype | Size |
INTEGER | OS-dependent |
SMALLINT | |
FLOAT | |
DOUBLE | |
Datatype | Default Size | Length Specified with POSITION | Length Specified with DATATYPE |
GRAPHIC | None | L | 2*L |
GRAPHIC EXTERNAL | None | L - 2 | 2*(L-2) |
VARGRAPHIC | 4Kb*2 | L+S | (2*L)+S |
Datatype | Default Size | Maximum Length Specified (L) |
VARCHAR | 4Kb | L+S |
CHAR (delimited) DATE (delimited) numeric EXTERNAL (delimited) MLSLABEL (delimited) | 255 | L+S |
CHAR(10) TERMINATED BY ","
uses (10 + 2) * 64 = 768 bytes in the bind array, assuming that the length indicator is two bytes long. However:
CHAR TERMINATED BY ","
uses (255 + 2) * 64 = 16,448 bytes, because the default maximum size for a delimited field is 255. This can make a considerable difference in the number of rows that fit into the bind array.
If the same field in the data record is mentioned in multiple INTO TABLE clauses, it requires additional space in the bind array each time it is mentioned. So, it is especially important to minimize the buffer allocations for fields like these.
DEFAULTIF field_condition
A column may have both a NULLIF clause and a DEFAULTIF clause, although this often would be redundant.
Note: The same effects can be achieved with the SQL string and the DECODE function. See "Applying SQL Operators to Fields" .
NULLIF field_condition
The NULLIF clause may refer to the column that contains it, as in the following example:
COLUMN1 POSITION(11:17) CHAR NULLIF (COLUMN1 = "unknown")
This specification may be useful if you want certain data values to be replaced by nulls. The value for a column is first determined from the datafile. It is then set to null just before the insert takes place. Case 6 includes more examples of the NULLIF clause.
Note: The same effect can be achieved with the SQL string and the NVL function. See "Applying SQL Operators to Fields" .
If an all-blank CHAR field is surrounded by enclosure delimiters, then the blanks within the enclosures are loaded. Otherwise, the field is loaded as null. More details on whitespace trimming in character fields are presented in the following section.
Note: Specifying PRESERVE BLANKS changes this behavior. See "Preserving Whitespace" for more information.
loc POSITION(19:31)
loc CHAR(14)
In the second case, even though the field's exact position is not specified, the field's length is predetermined.
"__aa__"
where "__" represents blanks or tabs. Termination delimiters signal the end of a field, like the comma in:
__aa__,
Delimiters are specified with the control clauses TERMINATED BY and ENCLOSED BY, as shown in the following examples:
loc POSITION(19) TERMINATED BY ","
loc POSITION(19) ENCLOSED BY '"'
loc TERMINATED BY "." OPTIONALLY ENCLOSED BY '|'
loc POSITION(19:31) CHAR TERMINATED BY ","
and no comma is found between positions 19 and 31 of the input record, then the record is rejected. If a comma is found, then it delimits the field.
Figure 5 - 2. Relative positioning after a fixed field
If the previous field is terminated by a delimiter, then the next field begins immediately after the delimiter, as shown in Figure 5 - 3.
Figure 5 - 3. Relative positioning after a delimited field
When a field is specified both with enclosure delimiters and a termination delimiter, then the next field starts after the termination delimiter, as shown in Figure 5 - 4. If a non-whitespace character is found after the enclosure delimiter, but before the terminator, then SQL*Loader generates an error.
Figure 5 - 4. Relative positioning after enclosure delimiters
Figure 5 - 5. Fields terminated by whitespace
This situation occurs when the previous field is explicitly specified with the TERMINATED BY WHITESPACE clause, as shown in the example. It also occurs when you use the global FIELDS TERMINATED BY WHITESPACE clause.
Whenever optional enclosure delimiters are specified, SQL*Loader scans forward, looking for the first delimiter. If none is found, then the first non-whitespace character signals the start of the field. SQL*Loader skips over whitespace, eliminating it from the field. This situation is shown in Figure 5 - 6.
Figure 5 - 6. Fields terminated by optional enclosing delimiters
Unlike the case when the previous field is TERMINATED BY WHITESPACE, this specification removes leading whitespace even when a starting position is specified for the current field.
Note: If enclosure delimiters are present, leading whitespace after the initial enclosure delimiter is kept, but whitespace before this delimiter is discarded. See the first quote in FIELD1, Figure 5 - 6.
Specification | Data | Result | Leading Whitespace Present(1) | Trailing Whitespace Present(1) |
Predetermined Size | __aa__ | __aa | Y | N |
Terminated | __aa__, | __aa__ | Y | Y(2) |
Enclosed | "__aa__" | __aa__ | Y | Y |
Terminated and Enclosed | "__aa__", | __aa__ | Y | Y |
Optional Enclosure (present) | "__aa__", | __aa__ | Y | Y |
Optional Enclosure (absent) | __aa__, | aa__ | N | Y |
Previous Field Terminated by Whitespace | __aa__ | aa(3) | N | (3) |
(1) When an allow-blank field is trimmed, its value is null. | ||||
(2) Except for fields that are TERMINATED BY WHITESPACE | ||||
(3) Presence of trailing whitespace depends on the current field's specification, as shown by the other entries in the table. | ||||
For example, if the field
__aa__,
(where underscores represent blanks) is loaded with the following control clause:
TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
then both the leading whitespace and the trailing whitespace are retained if PRESERVE BLANKS is specified. Otherwise, the leading whitespace is trimmed.
Note: The word BLANKS is not optional. Both words must be specified.
The SQL string must be enclosed in double quotation marks. It appears after any other specifications for a given column. It is evaluated after any NULLIF or DEFAULTIF clauses, but before a DATE mask. It may not be used on RECNUM, SEQUENCE, CONSTANT, or SYSDATE fields. If the RDBMS does not recognize the string, the load terminates in error. If the string is recognized, but causes a database error, the row that caused the error is rejected.
field1 POSITION(1:6) CHAR "LOWER(:field1)"
field1 CHAR TERMINATED BY ','
NULLIF ((1) = 'a') DEFAULTIF ((1)= 'b')
"RTRIM(:field1)"
field1 CHAR(7) "TRANSLATE(:field1, ':field1', ':1')"
In the last example, only the :field1 that is not in single quotes is interpreted as a column name. For more information on the use of quotes inside quoted strings, see "Specifying Filenames and Database Objects" .
Other fields in the same record can also be referenced, as in the following example:
field1 POSITION(1:4) INTEGER EXTERNAL
"decode(:field2, '22', '34', :field1)
field1 POSITION(1:9) DECIMAL EXTERNAL(8) ":field1/1000"
Truncating fields that could be too long:
field1 CHAR TERMINATED BY "," "SUBSTR(:field1, 1, 10)"
field1 POSITION(*+3) INTEGER EXTERNAL
"TRUNC(RPAD(:field1,6,'0'), -2)"
field1 POSITION(1:8) INTEGER EXTERNAL
"TRANSLATE(RTRIM(:field1),'N/A', '0')"
field1 CHARACTER(10)
"NVL( LTRIM(RTRIM(:field1)), 'unknown' )"
field1 DATE 'dd-mon-yy' "RTRIM(:field1)"
would be inserted as:
TO_DATE(RTRIM(<field1_value>), 'dd-mon-yyyy')
field1 ... "TO_CHAR(:field1, '$09999.99')"
could store numeric input data in formatted form, where field1 is a character column in the database. This field would be stored with the formatting characters (dollar sign, period, and so on) already in place.
You have even more flexibility, however, if you store such values as numeric quantities or dates. You can then apply arithmetic functions to the values in the database, and still select formatted values for your reports.
The SQL string is used in Case 7 (page 4 - 22) to load data from a formatted report.
![]() ![]() Prev Next |
![]() Copyright © 1996 Oracle Corporation. All Rights Reserved. |
![]() Library |
![]() Product |
![]() Contents |
![]() Index |