Oracle8 Concepts
Release 8.0

A58227-01


22  Parallel Execution

Civilization advances by extending the number of important operations which we can perform without thinking about them.

Alfred North Whitehead: An Introduction to Mathematics

This chapter describes the parallel execution of SQL statements. The topics in this chapter include:

    • Overview of Parallel Execution
    • Process Architecture for Parallel Execution
    • Setting the Degree of Parallelism
    • Parallelization Rules for SQL Statements
    • Parallel DDL
    • Parallel DML
    • Affinity
    • Other Types of Parallelism

Overview of Parallel Execution

When Oracle is not parallelizing the execution of SQL statements, each SQL statement is executed sequentially by a single process. With parallel execution, however, multiple processes work together simultaneously to execute a single SQL statement. By dividing the work necessary to execute a statement among multiple processes, Oracle can execute the statement more quickly than if only a single process executed it.

Parallel execution can dramatically improve performance for data-intensive operations associated with decision support applications or very large database environments. Symmetric multiprocessing (SMP), clustered systems, and massively parallel systems (MPP) gain the largest performance benefits from parallel execution because statement processing can be split up among many CPUs on a single Oracle system.

Parallel execution helps systems scale in performance by making optimal use of hardware resources. If your system's CPUs and disk controllers are already heavily loaded, you need to alleviate the system's load or increase these hardware resources before using parallel execution to improve performance.

Additional Information:

See Oracle8 Tuning for specific information on tuning your parameter files and database to take full advantage of parallel execution.

 

Operations That Can Be Parallelized

The Oracle server can use parallel execution for any of these operations:

How Oracle Parallelizes Operations

A SELECT statement only consists of a query. A DML or DDL statement usually consists of a query portion and a DML or DDL portion. Oracle can parallelize both the query portion and the DML or DDL portion of the SQL statements listed in the previous section.


Note:

Although generally data manipulation language (DML) includes queries, in this chapter "DML" refers only to inserts, updates, and deletes.

 

Oracle primarily parallelizes SQL statements in the following ways:

  1. Parallelize by block ranges for scan operations (SELECTs and subqueries in DML and DDL statements).
  2. Parallelize by partitions for operations on partitioned tables and indexes.
  3. Parallelize by parallel server processes for inserts into nonpartitioned tables only.

Parallelizing by Block Range

Oracle parallelizes a query dynamically at execution time. Dynamic parallelism divides the table or index into ranges of database blocks (ROWID range) and executes the operation in parallel on different ranges. If the distribution or location of data changes, Oracle automatically adapts to optimize the parallelization for each execution of the query portion of a SQL statement.

Parallel scans by block range break the table or index into pieces delimited by high and low ROWID values. The table or index can be nonpartitioned or partitioned.

For partitioned tables and indexes, no ROWID range can span a partition, although one partition can contain multiple ROWID ranges. Oracle sends the partition numbers along with the ROWID ranges to avoid partition map lookups. Compile-time and run-time predicates on partitioning columns restrict the ROWID ranges to the relevant partitions, eliminating unnecessary partition scans (partition pruning).

This means that a parallel query that accesses a partitioned table by a table scan performs the same amount of work as, or less work than, the same query on a nonpartitioned table. The query on the partitioned table executes with equivalent parallelism, although partition pruning might reduce the total number of disks accessed.

Oracle can parallelize the following operations on tables and indexes by block range (ROWID range):

Parallelizing by Partition

Partitions are a static, logical division of tables and indexes that can be used to break some long-running operations into smaller operations executed in parallel on individual partitions. The granule of parallelism is a partition; there is no parallelism within a partition.

Operations on partitioned tables and indexes are performed in parallel by assigning different parallel server processes to different partitions of the table or index. Compile-time and run-time predicates restrict the partitions accessed when the operation references partitioning columns (partition pruning). The operation executes serially when compile-time or run-time predicates restrict the operation to a single partition.

The parallel operation may use fewer parallel server processes than the number of accessed partitions (because of resource limits, hints, or table attributes), but each partition is accessed by a single parallel server process. A parallel server process, however, can access multiple partitions.

Operations on partitioned tables and indexes are performed in parallel only when more than one partition is accessed and when the selectivity of the table or index is such that more than a predetermined minimum number of table or index pages will be accessed.

Oracle can parallelize the following operations on partitioned tables and indexes by partition:

Parallelizing by Parallel Server Processes

For nonpartitioned tables only, Oracle parallelizes insert operations by dividing the work among parallel server processes. Because new rows do not yet have ROWIDs, the rows are distributed among the parallel server processes, which insert them into free space.

Process Architecture for Parallel Execution

When parallel execution is not being used, a single server process performs all necessary processing for the sequential execution of a SQL statement. For example, to perform a full table scan (such as SELECT * FROM EMP), one process performs the entire operation, as illustrated in Figure 22-1.

Figure 22-1 Serial Full Table Scan

Parallel execution performs these operations in parallel using multiple parallel processes. One process, known as the parallel coordinator, dispatches the execution of a statement to several parallel server processes and coordinates the results from all of the server processes to send the results back to the user.


Note:

In this context, the phrase "parallel server process" does not mean a process of an Oracle Parallel Server, but instead means a process that performs an operation in parallel. (In an Oracle Parallel Server, however, the parallel server processes may be spread across multiple instances.) Parallel server processes are also sometimes called "slave processes".

 

When an operation is divided into pieces for parallel execution in a massively parallel processing (MPP) configuration, Oracle assigns a particular piece of the operation to a parallel server process by taking into account the affinity of the process for the piece of the table or index to be used for the operation. The physical layout of partitioned tables and indexes affects the affinity used to assign work to parallel server processes.

See "Affinity" on page 22-40 for more information.

Figure 22-2 illustrates several parallel server processes simultaneously performing a partial scan of the EMP table, which is divided by block range dynamically (dynamic partitioning). The parallel server processes send results back to the parallel coordinator process, which assembles the pieces into the desired full table scan.

Figure 22-2 Parallel Full Table Scan

The parallel coordinator breaks down execution functions into parallel pieces and then integrates the partial results produced by the parallel server processes. The number of parallel server processes assigned to a single operation is the degree of parallelism for an operation. Multiple operations within the same SQL statement all have the same degree of parallelism (see "Determining the Degree of Parallelism for Operations" on page 22-13).

The Parallel Server Pool

When an instance starts up, Oracle creates a pool of parallel server processes which are available for any parallel operation. The initialization parameter PARALLEL_MIN_SERVERS specifies the number of parallel server processes that Oracle creates at instance startup.

When executing a parallel operation, the parallel coordinator obtains parallel server processes from the pool and assigns them to the operation. If necessary, Oracle can create additional parallel server processes for the operation. These parallel server processes remain with the operation throughout job execution, then become available for other operations. After the statement has been processed completely, the parallel server processes return to the pool.


Note:

The parallel coordinator and the parallel server processes can only service one statement at a time. A parallel coordinator cannot coordinate, for example, a parallel query and a parallel DML statement at the same time.

You can set PARALLEL_MIN_SERVERS to a higher value if you need to run concurrent parallel statements.

 

When a user issues a SQL statement, the optimizer decides whether to execute the operations in parallel and determines the degree of parallelism for each operation. You can specify the number of parallel server processes required for an operation in various ways (see "Setting the Degree of Parallelism" on page 22-13).

If the optimizer targets the statement for parallel processing, the following sequence of events takes place:

The parallel coordinator calls upon the parallel server processes during the execution of the SQL statement (not during the parsing of the statement). Therefore, when parallel execution is used with the multithreaded server, the server process that processes the EXECUTE call of a user's statement becomes the coordinator process for the statement.

Variations in the Number of Parallel Server Processes

If the number of parallel operations processed concurrently by an instance changes significantly, Oracle automatically changes the number of parallel server processes in the pool.

If the number of parallel operations increases, Oracle creates additional parallel server processes to handle incoming requests. However, Oracle never creates more parallel server processes for an instance than what is specified by the initialization parameter PARALLEL_MAX_SERVERS.

If the number of parallel operations decreases, Oracle terminates any parallel server processes that have been idle for the period of time specified by the initialization parameter PARALLEL_SERVER_IDLE_TIME. Oracle does not reduce the size of the pool below the value of PARALLEL_MIN_SERVERS no matter how long the parallel server processes have been idle.
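For example, an instance that regularly runs several concurrent parallel statements might use initialization parameter settings along the following lines. The values shown are purely illustrative, not recommendations:

PARALLEL_MIN_SERVERS = 4       # parallel server processes created at instance startup
PARALLEL_MAX_SERVERS = 16      # upper limit on parallel server processes for this instance
PARALLEL_SERVER_IDLE_TIME = 5  # how long an idle parallel server process waits before it is terminated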

Processing Without Enough Parallel Server Processes

Oracle can process a parallel operation with fewer than the requested number of processes; see "Minimum Number of Parallel Server Processes" on page 22-15 for information about specifying a minimum with the initialization parameter PARALLEL_MIN_PERCENT.

If all parallel server processes in the pool are occupied and the maximum number of parallel server processes has been started, the parallel coordinator switches to serial processing.

Additional Information:

See Oracle8 Tuning for information about monitoring an instance's pool of parallel server processes and determining the appropriate values of the initialization parameters.

 

Parallelizing SQL Statements

Each SQL statement undergoes an optimization and parallelization process when it is parsed. Therefore, when the data changes, if a more optimal execution plan or parallelization plan becomes available, Oracle can automatically adapt to the new situation. (Optimization is discussed in Chapter 20, "The Optimizer".)

After the optimizer determines the execution plan of a statement, the parallel coordinator determines the parallelization method for each operation in the execution plan (for example, parallelize a full table scan by block range or parallelize an index range scan by partition). The coordinator must decide whether an operation can be performed in parallel and, if so, how many parallel server processes to enlist (that is, the degree of parallelism).

See "Setting the Degree of Parallelism" on page 22-13 and "Parallelization Rules for SQL Statements" on page 22-17 for more information.

Dividing Work Among Parallel Server Processes

The parallel coordinator examines the redistribution requirements of each operation. An operation's redistribution requirement is the way in which the rows operated on by the operation must be divided, or redistributed, among the parallel server processes.

After determining the redistribution requirement for each operation in the execution plan, the optimizer determines the order in which the operations in the execution plan must be performed. With this information, the optimizer determines the data flow of the statement.

Figure 22-3 illustrates the data flow of the following query:

SELECT dname, MAX(sal), AVG(sal) 
  FROM emp, dept 
    WHERE emp.deptno = dept.deptno 
    GROUP BY dname; 

Figure 22-3 Data Flow Diagram for a Join of the EMP and DEPT Tables

Parallelism Between Operations

Operations that require the output of other operations are known as parent operations. In Figure 22-3 the GROUP BY SORT operation is the parent of the MERGE JOIN operation because GROUP BY SORT requires the MERGE JOIN output.

Parent operations can begin consuming rows as soon as the child operations have produced rows. In the previous example, while the parallel server processes are producing rows in the FULL SCAN DEPT operation, another set of parallel server processes can begin to perform the MERGE JOIN operation to consume the rows.

Each of the two operations performed concurrently is given its own set of parallel server processes. Therefore, both query operations and the data flow tree itself have parallelism. The parallelism of an individual operation is called intra-operation parallelism and the parallelism between operations in a data flow tree is called inter-operation parallelism.

Due to the producer/consumer nature of the Oracle server's operations, only two operations in a given tree need to be performed simultaneously to minimize execution time.

To illustrate intra-operation parallelism and inter-operation parallelism, consider the following statement:

SELECT * FROM emp ORDER BY ename; 

The execution plan consists of a full scan of the EMP table followed by a sorting of the retrieved rows based on the value of the ENAME column. For the sake of this example, assume the ENAME column is not indexed. Also assume that the degree of parallelism for the query is set to four, which means that four parallel server processes can be active for any given operation.

Figure 22-4 illustrates the parallel execution of our example query.

Figure 22-4 Inter-Operation Parallelism and Dynamic Partitioning

As you can see from Figure 22-4, eight parallel server processes are actually involved in the query even though the degree of parallelism is four. This is because a parent operation and a child operation can be performed at the same time (inter-operation parallelism).

Also note that all of the parallel server processes involved in the scan operation send rows to the appropriate parallel server process performing the sort operation. If a row scanned by a parallel server process contains a value for the ENAME column between A and G, that row gets sent to the first ORDER BY parallel server process. When the scan operation is complete, the sorting processes can return the sorted results to the coordinator, which in turn returns the complete query results to the user.


Note:

When a set of parallel server processes completes its operation, it moves on to operations higher in the data flow. For example, in the previous diagram, if there were another ORDER BY operation after the first ORDER BY, the parallel server processes performing the table scan would perform the second ORDER BY operation after completing the table scan.

 

Setting the Degree of Parallelism

The parallel coordinator may enlist two or more of the instance's parallel server processes to process a SQL statement. The number of parallel server processes associated with a single operation is known as the degree of parallelism.

The degree of parallelism is specified at the statement level (with hints or the PARALLEL clause), at the table or index level (in the table's or index's definition), or by default based on the number of disks or CPUs.

Note that the degree of parallelism applies directly only to intra-operation parallelism. If inter-operation parallelism is possible, the total number of parallel server processes for a statement can be twice the specified degree of parallelism. No more than two operations can be performed simultaneously.

Determining the Degree of Parallelism for Operations

The parallel coordinator determines the degree of parallelism by considering several specifications. The coordinator first checks for hints or a PARALLEL clause specified in the SQL statement itself, then it looks at the table's or index's definition, and finally it checks for the default degree of parallelism (see "Default Degree of Parallelism" on page 22-14). Once a degree of parallelism is found in one of these specifications, it becomes the degree of parallelism for the operation.

For specific details of the degree of parallelism, see "Parallelization Rules for SQL Statements" on page 22-17.

Hints, PARALLEL clauses, table or index definitions, and default values only determine the number of parallel server processes that the coordinator requests for a given operation. The actual number of parallel server processes used depends upon how many processes are available in the parallel server pool (see "The Parallel Server Pool" on page 22-7) and whether inter-operation parallelism is possible (see "Parallelism Between Operations" on page 22-11).

Hints

You can specify hints in a SQL statement to set the degree of parallelism for a table or index and the caching behavior of the operation.
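For example, the following query uses hints to request a full table scan of the EMP table with a degree of parallelism of four (the table and degree are illustrative):

SELECT /*+ FULL(e) PARALLEL(e, 4) */ ename, sal
  FROM emp e;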

Table and Index Definitions

You can specify the degree of parallelism within a table or index definition. Use one of the following SQL statements to set the degree of parallelism for a table or index: CREATE TABLE, ALTER TABLE, CREATE INDEX, or ALTER INDEX.
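For example, the following statements (with illustrative object names) set a degree of parallelism of four in the definitions of a table and an index:

ALTER TABLE emp PARALLEL (DEGREE 4);
ALTER INDEX emp_ix PARALLEL (DEGREE 4);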

Additional Information:

Refer to the Oracle8 SQL Reference for the complete syntax of SQL statements.

 

Default Degree of Parallelism

The default degree of parallelism is used when you do not specify a degree of parallelism in a hint or within the definition of a table or index. The default degree of parallelism is appropriate for most applications.

Additional Information:

See Oracle8 Tuning for information about adjusting the degree of parallelism.

 

The default degree of parallelism for a SQL statement is determined by the following factors.

  1. The number of CPUs in the system.
  2. The number of Oracle Parallel Server instances.
  3. The number of disks (or files, if affinity information is not available) that the table or index is stored on.
  4. For parallelizing by partition, the number of partitions that will be accessed, based upon partition pruning (if appropriate).
  5. For parallel DML operations with global index maintenance, the minimum number of transaction free lists among all the global indexes to be updated. The minimum number of transaction free lists for a partitioned global index is the minimum number across all index partitions. This is a requirement in order to prevent self-deadlock.

For example, if your system has 20 CPUs and you issue a parallel query on a table that is stored on 15 disk drives, then the default degree of parallelism for your query is 15.


Note:

Oracle obtains the information about disks and CPUs from the operating system.

 

The above factors determine the default number of parallel server processes to use; however, the actual number of processes used is limited by their availability on the requested instances at run time. The initialization parameter PARALLEL_MAX_SERVERS sets an upper limit on the total number of parallel server processes that an instance can have.

If a minimum fraction of the desired parallel server processes is not available (specified by the initialization parameter PARALLEL_MIN_PERCENT), a user error is produced. The user can then retry the query with less parallelism.


Note:

The PARALLEL_DEFAULT_SCANSIZE and PARALLEL_DEFAULT_MAX_SCANS initialization parameters are obsolete.

 

Minimum Number of Parallel Server Processes

Oracle can perform an operation in parallel as long as at least two parallel server processes are available. If too few parallel server processes are available, your SQL statement may execute slower than expected. You can specify that a minimum percentage of requested parallel server processes must be available in order for the operation to execute. This ensures that your SQL statement executes with a minimum acceptable parallel performance. If the minimum percentage of requested parallel server processes are not available, the SQL statement does not execute and returns an error.

The initialization parameter PARALLEL_MIN_PERCENT specifies the desired minimum percentage of requested parallel server processes. This parameter affects DML and DDL operations as well as queries.

For example, if you specify 50 for this parameter, then at least 50% of the parallel server processes requested for any parallel operation must be available in order for the operation to succeed. If 20 parallel server processes are requested, then at least 10 must be available or an error is returned to the user. If PARALLEL_MIN_PERCENT is set to null, then all parallel operations will proceed as long as at least two parallel server processes are available for processing.

Limiting the Number of Available Instances

In an Oracle Parallel Server, instance groups can be used to limit the number of instances that participate in a parallel operation. You can create any number of instance groups, each consisting of one or more instances. You can then specify which instance group is to be used for any or all parallel operations. Parallel server processes will only be used on instances which are members of the specified instance group.

Additional Information:

See Oracle8 Parallel Server Concepts and Administration for more information about instance groups.

 

Balancing the Work Load

To optimize performance, all parallel server processes should have equal work loads. For SQL statements parallelized by block range or by parallel server processes, the work load is dynamically divided among the parallel server processes. This minimizes workload skewing, which occurs when some parallel server processes perform significantly more work than the other processes.

For SQL statements parallelized by partitions, if the work load is evenly distributed among the partitions then you can optimize performance by matching the number of parallel server processes to the number of partitions, or by choosing a degree of parallelism such that the number of partitions is a multiple of the number of processes.

For example, if a table has ten partitions and a parallel operation divides the work evenly among them, you can use ten parallel server processes (degree of parallelism = 10) to do the work in approximately one-tenth the time that one process would take, or you can use five processes to do the work in one-fifth the time, or two processes to do the work in one-half the time.

If, however, you use nine processes to work on ten partitions, the first process to finish its work on one partition begins work on the tenth partition; as the other processes finish their work, they become idle. This configuration does not provide good performance when the work is evenly divided among partitions. When the work is unevenly divided, the performance varies depending on whether the partition that is left for last has more or less work than the other partitions.

Similarly, if you use four processes to work on ten partitions and the work is evenly divided, then each process works on a second partition after finishing its first partition, but only two of the processes work on a third partition while the other two remain idle.

In general, you cannot assume that the time taken to perform a parallel operation on N partitions with P parallel server processes will be N/P, because of the possibility that some processes might have to wait while others finish working on the last partition(s). By choosing an appropriate degree of parallelism, however, you can minimize the workload skewing and optimize performance.

For information about balancing the work load with disk affinity, see "Affinity and Parallel DML" on page 22-41.

Parallelization Rules for SQL Statements

A SQL statement can be parallelized if it includes a parallel hint or if the table or index being operated on has been declared PARALLEL with a CREATE or ALTER statement. In addition, a data definition language (DDL) statement can be parallelized by using the PARALLEL clause. However, not all of these methods apply to all types of SQL statements.

Parallelization has two components: the decision to parallelize and the degree of parallelism. These components are determined differently for queries, DDL operations, and DML operations.

To determine the degree of parallelism, Oracle looks at the reference objects.

Rules for Parallelizing Queries

Decision to Parallelize

A SELECT statement can be parallelized only if the following conditions are satisfied:

  1. The query includes a "parallel" hint specification (PARALLEL or PARALLEL_INDEX) or the schema objects referred to in the query have a PARALLEL declaration associated with them.
  2. At least one of the tables specified in the query requires one of the following:
    • a full table scan
    • an index range scan spanning multiple partitions
Degree of Parallelism

The degree of parallelism for a query is determined by the following rules:

  1. The query uses the maximum degree of parallelism taken from all of the table declarations involved in the query and all of the potential indexes that are candidates to satisfy the query (the reference objects). That is, the table or index that has the greatest degree of parallelism determines the query's degree of parallelism (maximum query directive).
  2. If a table has both a "parallel" hint specification in the query and a parallel declaration in its table specification, the hint takes precedence over the parallel declaration, as illustrated in the example below.
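For example, assume the EMP table is declared with PARALLEL (DEGREE 4) and the DEPT table with PARALLEL (DEGREE 6); the declarations and degrees are illustrative only:

/* No hint: the query runs with degree 6, the maximum taken over the reference objects. */
SELECT ename, dname
  FROM emp, dept
  WHERE emp.deptno = dept.deptno;

/* A hint on EMP overrides its table declaration, so the query runs with degree 8. */
SELECT /*+ PARALLEL(emp, 8) */ ename, dname
  FROM emp, dept
  WHERE emp.deptno = dept.deptno;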

Rules for Parallelizing UPDATE and DELETE

Update and delete operations are parallelized by partition. Updates and deletes can only be parallelized on partitioned tables; update/delete parallelism is not possible within a partition, nor on a nonpartitioned table.

You have two ways to specify parallel directives for UPDATE and DELETE operations (assuming that PARALLEL DML mode is enabled):

  1. Parallel clause specified in the definition of the table being updated or deleted (the reference object).
  2. Update or delete parallel hint specified at the statement.

Parallel hints are placed immediately after the UPDATE or DELETE keywords in UPDATE and DELETE statements. The hint also applies to the underlying scan of the table being changed.

Parallel clauses in CREATE TABLE and ALTER TABLE commands specify table parallelism. If a parallel clause exists in a table definition, it determines the parallelism of DML statements as well as queries. If the DML statement contains explicit parallel hints for a table, however, then those hints override the effect of parallel clauses for that table.

Decision to Parallelize

The following rule determines whether the update/delete operation should be parallelized in an update/delete statement:

If the statement contains subqueries or updatable views, they may have their own separate parallel hints or clauses, but these parallel directives do not affect the decision to parallelize the update or delete.

Although the parallel hint or clause on the tables is used by both query and update/delete portions to determine parallelism, the decision to parallelize the update/delete portion is made independently of the query portion, and vice versa.

Degree of Parallelism

The degree of parallelism is determined by the same rules as for queries. Note that in the case of update and delete operations, only one table is involved: the target table to be modified is the only reference object.

The precedence rule to determine the degree of parallelism for the update/delete operation is that the update or delete parallel hint specification takes precedence over the parallel declaration specification of the target table:

Update/delete hint > Parallel declaration specification of the target table

The maximum degree of parallelism you can achieve is equal to the number of partitions in the table. A parallel server process can update into or delete from multiple partitions, but each partition can only be updated or deleted by one parallel server process.

If the degree of parallelism is less than the number of partitions, then the first process to finish work on one partition continues working on another partition, and so on until the work is finished on all partitions. If the degree of parallelism is greater than the number of partitions involved in the operation, then the excess parallel server processes would have no work to do.

Example 1:
UPDATE tbl_1 SET c1=c1+1 WHERE c1>100; 

If TBL_1 is a partitioned table and its table definition has a parallel clause, then the update operation will be parallelized even if the scan on the table is serial (such as an index scan), assuming that the table has more than one partition with C1 greater than 100.

Example 2:
UPDATE /*+ PARALLEL(tbl_2,4) */ tbl_2 SET c1=c1+1; 

Both the scan and update operations on TBL_2 will be parallelized with degree 4.

Rules for Parallelizing INSERT ... SELECT

An INSERT ... SELECT statement parallelizes its INSERT and SELECT operations independently, except for the degree of parallelism.

You can specify a "parallel" hint after the INSERT keyword in an INSERT ... SELECT statement. Since the tables being queried are usually not the same as the table being inserted into, the hint allows you to specify parallel directives specifically for the insert operation.

You have four ways to specify parallel directives for an INSERT... SELECT statement (assuming that PARALLEL DML mode is enabled):

  1. SELECT parallel hint(s) specified at the statement.
  2. Parallel clause(s) specified in the definition of tables being selected.
  3. INSERT parallel hint specified at the statement.
  4. Parallel clause specified in the definition of tables being inserted into.
Decision to Parallelize

The following rule determines whether the insert operation should be parallelized in an INSERT... SELECT statement:

Hence the decision to parallelize the insert operation is made independently of the select operation, and vice versa.

Degree of Parallelism

Once the decision to parallelize the select and/or insert operation is made, one parallel directive is picked for deciding degree of parallelism of the whole statement using the following precedence rule:

Insert Hint directive > Parallel declaration specification of the inserting table > Maximum Query directive

where Maximum Query directive means that among multiple tables and indexes, the table or index that has the maximum degree of parallelism determines the parallelism for the query operation.

The chosen parallel directive is applied to both the select and insert operations.

Example:

In the following example, the degree of parallelism used will be 2, which is the degree specified in the Insert hint:

INSERT /*+ PARALLEL(tbl_ins,2) */ INTO tbl_ins 
  SELECT /*+ PARALLEL(tbl_sel,4) */ * FROM tbl_sel; 

Rules for Parallelizing DDL Statements

Decision to Parallelize

DDL operations can be parallelized if a PARALLEL clause (declaration) is specified in the syntax. In the case of CREATE INDEX and ALTER INDEX ... REBUILD or ALTER INDEX ... REBUILD PARTITION, the parallel declaration is stored in the data dictionary.

Degree of Parallelism

The degree of parallelism is determined by the specification in the PARALLEL clause. A rebuild of an entire partitioned index is never parallelized, although a rebuild of an individual index partition can be.

Rules for Parallelizing Create Index, Rebuild Index, Merge/Split Partition

Parallel CREATE INDEX or ALTER INDEX ... REBUILD

The CREATE INDEX and ALTER INDEX ... REBUILD statements can be parallelized only by a PARALLEL clause.

ALTER INDEX ... REBUILD can be parallelized only for a nonpartitioned index, but ALTER INDEX ... REBUILD PARTITION can be parallelized by a PARALLEL clause.

The scan operation for ALTER INDEX ... REBUILD (of a nonpartitioned index), ALTER INDEX ... REBUILD PARTITION, and CREATE INDEX is parallelized with the same degree of parallelism as the REBUILD or CREATE operation itself. If the degree of parallelism is not specified for REBUILD or CREATE, the default is the number of CPUs.
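For example, the following statements (with illustrative object names) create and rebuild indexes in parallel; SALES_IX is assumed to be a partitioned index with a partition named SALES_Q1:

CREATE INDEX emp_ix ON emp (ename) PARALLEL (DEGREE 8);

ALTER INDEX emp_ix REBUILD PARALLEL (DEGREE 8);

ALTER INDEX sales_ix REBUILD PARTITION sales_q1 PARALLEL (DEGREE 4);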

Parallel MOVE PARTITION or SPLIT PARTITION

The ALTER TABLE ... MOVE PARTITION and ALTER TABLE ... SPLIT PARTITION statements can be parallelized only by a PARALLEL clause. Their scan operations have the same parallelism as the corresponding MOVE or SPLIT operations. If the degree of parallelism is not specified, the default is the number of CPUs.
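For example, assuming a range-partitioned SALES table with a partition named SALES_Q1 (all object names and values are illustrative):

ALTER TABLE sales MOVE PARTITION sales_q1 PARALLEL (DEGREE 4);

ALTER TABLE sales SPLIT PARTITION sales_q1
  AT (TO_DATE('15-FEB-1997', 'DD-MON-YYYY'))
  INTO (PARTITION sales_q1a, PARTITION sales_q1b)
  PARALLEL (DEGREE 4);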

Rules for Parallelizing Create Table as Select

The CREATE TABLE ... AS SELECT statement contains two parts: a CREATE part (DDL) and a SELECT part (query).

Oracle can parallelize both parts of the statement. The CREATE part follows the same rules as other DDL operations.

Decision to Parallelize (Query Part)

The query part of a CREATE TABLE ... AS SELECT statement can be parallelized only if the following conditions are satisfied:

  1. The query includes a "parallel" hint specification (PARALLEL or PARALLEL_INDEX), the CREATE part of the statement has a PARALLEL clause specification, or the schema objects referred to in the query have a PARALLEL declaration associated with them.
  2. At least one of the tables specified in the query requires one of the following:
    • a full table scan
    • an index range scan spanning multiple partitions
Degree of Parallelism (Query Part)

The degree of parallelism for the query part of a CREATE TABLE ... AS SELECT statement is determined by one of the following rules:

  1. The query part uses the values specified in the PARALLEL clause of the CREATE part.
  2. If the PARALLEL clause is not specified, the default degree of parallelism is the number of CPUs.

Note that any values specified in a hint for parallelism will be ignored.

Decision to Parallelize (Create Part)

The CREATE operation of CREATE TABLE ... AS SELECT can be parallelized only by a PARALLEL clause.

When the CREATE operation of CREATE TABLE ... AS SELECT is parallelized, Oracle also parallelizes the scan operation if possible. The scan operation cannot be parallelized if, for example:

When the CREATE operation is not parallelized, the SELECT can be parallelized if it has a PARALLEL hint or if the selected table (or partitioned index) has a parallel declaration.

Degree of Parallelism (Create Part)

The degree of parallelism for the CREATE operation, and for the SELECT operation if it is parallelized, is specified by the PARALLEL clause of the CREATE statement. If the CREATE statement does not specify the degree of parallelism, the default is the number of CPUs. Note that any degree of parallelism specified in a hint for the SELECT clause is ignored.

Summary of Parallelization Rules

Table 22-1 shows how various types of SQL statements can be parallelized, and indicates which methods of specifying parallelism take precedence.

Parallel DDL

This section includes the following topics on parallelism for data definition language (DDL) statements:

DDL Statements That Can Be Parallelized

You can parallelize DDL statements for tables and indexes that are nonpartitioned or partitioned. Table 22-1 summarizes the operations that can be parallelized in DDL statements.

The parallel DDL statements for nonpartitioned tables and indexes are:

The parallel DDL statements for partitioned tables and indexes are:

All of these DDL operations can be performed in no-logging mode (see "Logging Mode" on page 21-5) for either parallel or serial execution.

Different parallelism is used for different operations (see Table 22-1). Parallel create (partitioned) table as select and parallel create (partitioned) index execute with a degree of parallelism equal to the number of partitions.

Partition parallel analyze of a table is made less necessary by the ANALYZE {TABLE, INDEX} PARTITION commands, because a parallel analyze of an entire partitioned table can be constructed by running these per-partition commands from multiple user sessions.

Additional Information:

See Oracle8 SQL Reference for information about the syntax and use of parallel DDL statements.

 

CREATE TABLE ... AS SELECT in Parallel

Decision support applications, for performance reasons, often require large amounts of data to be summarized or "rolled up" into smaller tables for use with ad hoc, decision support queries. Rollup occurs regularly (such as nightly or weekly) during a short period of system inactivity.

Parallel execution allows you to parallelize both the query and the CREATE operation when you create a table from a subquery against another table or set of tables.
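For example, the following statement (with illustrative object names) creates a summary table in parallel from a large detail table. The degree of parallelism is taken from the PARALLEL clause of the CREATE part, as described in "Rules for Parallelizing Create Table as Select"; the NOLOGGING clause relates to the recoverability discussion below:

CREATE TABLE sales_summary
  NOLOGGING
  PARALLEL (DEGREE 4)
  AS SELECT product_id, SUM(amount) total_amount
       FROM sales
       GROUP BY product_id;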

Figure 22-5 illustrates creating a table from a subquery in parallel.

Figure 22-5 Creating a Summary Table in Parallel


Note:

Clustered tables cannot be created and populated in parallel.

 

Recoverability and Parallel DDL

When summary table data is derived from other tables' data, recoverability from media failure for the smaller summary table may not be important, so logging can be turned off during creation of the summary table.

If you disable logging during parallel table creation (or any other parallel DDL operation), you should take a backup of the tablespace containing the table once the table is created to avoid loss of the table due to media failure.

Use the NOLOGGING clause of CREATE/ALTER TABLE/INDEX statements to disable undo and redo log generation. See "Logging Mode" on page 21-5 for more information.

Additional Information:

See the Oracle8 Administrator's Guide for information about recoverability of tables created in parallel.

 

Space Management for Parallel DDL

Creating a table or index in parallel has space management implications that affect both the storage space required during the parallel operation and the free space available after the table or index has been created.

Storage Space for CREATE TABLE ... AS SELECT and CREATE INDEX

When creating a table or index in parallel, each parallel server process uses the values in the STORAGE clause of the CREATE statement to create temporary segments to store the rows. Therefore, a table created with an INITIAL of 5M and a PARALLEL DEGREE of 12 consumes at least 60 megabytes of storage during table creation, because each process starts with an extent of 5 megabytes. When the parallel coordinator combines the segments, some of the segments may be trimmed, and the resulting table may be smaller than the requested 60 megabytes.
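For example, a statement along the following lines (object names and sizes are illustrative) starts twelve parallel server processes, each of which allocates an initial 5 megabyte temporary segment, so at least 60 megabytes of free space must be available while the table is being created:

CREATE TABLE emp_summary
  STORAGE (INITIAL 5M NEXT 5M)
  PARALLEL (DEGREE 12)
  AS SELECT deptno, COUNT(*) emp_count, AVG(sal) avg_sal
       FROM emp
       GROUP BY deptno;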

Additional Information:

See the Oracle8 SQL Reference for a discussion of the syntax of the CREATE TABLE command.

 

Free Space and Parallel DDL

When you create indexes and tables in parallel, each parallel server process allocates a new extent and fills the extent with the table or index's data. Thus, if you create an index with a degree of parallelism of 3, there will be at least three extents for that index initially. (This discussion also applies to rebuilding indexes in parallel and moving, splitting, or rebuilding partitions in parallel.)

Serial operations require the schema object to have at least one extent. Parallel creations require that tables or indexes have at least as many extents as there are parallel server processes creating the schema object.

When you create a table or index in parallel, it is possible to create "pockets" of free space - either external or internal fragmentation. This occurs when the temporary segments used by the parallel server processes are larger than what is needed to store the rows.

For example, if you specify a degree of parallelism of three for a CREATE TABLE ... AS SELECT statement but there is only one datafile in the tablespace, the internal fragmentation illustrated in Figure 22-6 can arise. The "pockets" of free space within internal table extents of a datafile cannot be coalesced with other free space and allocated as extents.

For more information about datafiles and tablespaces, see Chapter 3, "Tablespaces and Datafiles".

Additional Information:

See Oracle8 Tuning for more information about creating tables and indexes in parallel.

 

Figure 22-6 Unusable Free Space (Internal Fragmentation)

Parallel DML

Parallel DML (parallel insert, update, and delete) uses parallel execution mechanisms to speed up or scale up large DML operations against large database tables and indexes.


Note:

Although generally data manipulation language (DML) includes queries, in this chapter the term "DML" refers only to inserts, updates, and deletes.

 

This section discusses the following parallel DML topics:

See Chapter 21, "Direct-Load INSERT" for a detailed description of parallel insert statements.

Advantages of Parallel DML over Manual Parallelism

You can parallelize DML operations manually by issuing multiple DML commands simultaneously against different sets of data. For example, you can parallelize manually by:

However, manual parallelism has the following disadvantages:

Parallel DML removes these disadvantages by performing inserts, updates, and deletes in parallel automatically.

When to Use Parallel DML

Parallel DML operations are mainly used to speed up large DML operations against large database objects. Parallel DML is useful in a decision support system (DSS) environment where the performance and scalability of accessing large objects are important. Parallel DML complements parallel query in providing you with both querying and updating capabilities for your DSS databases.

The overhead of setting up parallelism makes parallel DML operations infeasible for short OLTP transactions. However, parallel DML operations can speed up batch jobs running in an OLTP database.

Refresh Tables of a Data Warehouse System

In a data warehouse system, large tables need to be refreshed (updated) periodically with new or modified data from the production system. You can do this efficiently by using parallel DML combined with updatable join views.

The data that needs to be refreshed is generally loaded into a temporary table before starting the refresh process. This table contains either new rows or rows that have been updated since the last refresh of the data warehouse. You can use an updatable join view with parallel UPDATE to refresh the updated rows, and you can use an anti-hash join with parallel INSERT to refresh the new rows.
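The following sketch shows the general approach. The table names (CUSTOMERS and CUSTOMERS_DELTA), columns, and degree of parallelism are purely illustrative; CUSTOMERS is assumed to have a PARALLEL declaration, the join view is assumed to satisfy the usual key-preservation rules for updatable join views, and CUST_ID is assumed to be NOT NULL so that the NOT IN subquery can be evaluated as an anti-join:

ALTER SESSION ENABLE PARALLEL DML;

/* Refresh rows that changed since the last load, through an updatable join view. */
UPDATE ( SELECT c.credit_limit AS old_limit, d.credit_limit AS new_limit
           FROM customers c, customers_delta d
           WHERE c.cust_id = d.cust_id )
  SET old_limit = new_limit;

/* Insert rows that are new since the last load, using a hash anti-join. */
INSERT /*+ PARALLEL(customers, 4) */ INTO customers
  SELECT * FROM customers_delta d
  WHERE d.cust_id NOT IN (SELECT /*+ HASH_AJ */ cust_id FROM customers);

COMMIT;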

Additional Information:

For details, see Oracle8 Tuning.

 

Intermediate Summary Tables

In a DSS environment, many applications require complex computations that involve constructing and manipulating many large intermediate summary tables. These summary tables are often temporary and frequently do not need to be logged. Parallel DML can speed up the operations against these large intermediate tables. One benefit is that you can put incremental results in the intermediate tables and perform parallel updates.

In addition, the summary tables may contain cumulative or comparison information which has to persist beyond application sessions; thus, temporary tables are not feasible. Parallel DML operations can speed up the changes to these large summary tables.

Scoring Tables

Many DSS applications score customers periodically based on a set of criteria. The scores are usually stored in large DSS tables. The score information is then used in making a decision, for example, inclusion in a mailing list.

This scoring activity queries and updates a large number of rows in the large table. Parallel DML can speed up the operations against these large tables.

Historical Tables

Historical tables describe the business transactions of an enterprise over a recent time interval. Periodically, the DBA deletes the set of oldest rows and inserts a set of new rows into the table. Parallel INSERT... SELECT and parallel DELETE operations can speed up this rollover task.

Although you can also use parallel direct loader (SQL*Loader) to insert bulk data from an external source, parallel INSERT... SELECT will be faster in inserting data that already exists in another table in the database.

Dropping a partition can also be used to delete old rows, but to do this the table must be partitioned by date with an appropriate time interval.
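A rollover job might therefore look like the following sketch. The table names, degree of parallelism, and retention interval are illustrative, and SALES_HISTORY is assumed to be partitioned so that the DELETE can be parallelized:

ALTER SESSION ENABLE PARALLEL DML;

/* Move the newest rows from the operational table into the history table. */
INSERT /*+ PARALLEL(sales_history, 4) */ INTO sales_history
  SELECT /*+ PARALLEL(sales, 4) */ * FROM sales
  WHERE sale_date < TRUNC(SYSDATE, 'MM');

COMMIT;  /* a table modified in parallel cannot be accessed again in the same transaction */

/* Purge history rows that are older than the retention interval. */
DELETE /*+ PARALLEL(sales_history, 4) */ FROM sales_history
  WHERE sale_date < ADD_MONTHS(TRUNC(SYSDATE, 'MM'), -36);

COMMIT;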

Batch Jobs

Batch jobs executed in an OLTP database during off hours have a fixed time window in which the jobs must complete. A good way to ensure timely job completion is to parallelize their operations. As the work load increases, more machine resources can be added; the scaleup property of parallel operations ensures that the time constraint can be met.

Enabling Parallel DML

A DML statement can be parallelized only if you have explicitly enabled parallel DML in the session via the ENABLE PARALLEL DML option of the ALTER SESSION statement. This mode is required because parallel DML and serial DML have different locking, transaction, and disk space requirements. (See "Space Considerations for Parallel DML" on page 22-35 and "Lock and Enqueue Resources for Parallel DML" on page 22-36.)

The default mode of a session is DISABLE PARALLEL DML. When PARALLEL DML is disabled, no DML will be executed in parallel even if the PARALLEL hint or PARALLEL clause is used.

When PARALLEL DML is enabled in a session, all DML statements in this session will be considered for parallel execution. However, even if parallel DML is enabled, the DML operation may still execute serially if there are no parallel hints or parallel clauses, or if restrictions on parallel operations are violated (see "Restrictions on Parallel DML" on page 22-37).

The session's PARALLEL DML mode does not influence the parallelism of SELECT statements, DDL statements, and the query portions of DML statements. Thus, if this mode is not set, the DML operation is not parallelized but scans or join operations within the DML statement may still be parallelized.
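For example, a session might enable parallel DML for a single batch update and then disable it again. The table, hint, and degree are illustrative, and EMP is assumed to be partitioned (updates are parallelized by partition):

ALTER SESSION ENABLE PARALLEL DML;

UPDATE /*+ PARALLEL(emp, 4) */ emp
  SET sal = sal * 1.1;

COMMIT;

ALTER SESSION DISABLE PARALLEL DML;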

Transactions with PARALLEL DML Enabled

A session that is enabled for PARALLEL DML may put transactions in the session in a special mode: If any DML statement in a transaction modifies a table in parallel, no subsequent serial or parallel query or DML statement can access the same table again in that transaction. This means that the results of parallel modifications cannot be seen during the transaction.

Serial or parallel statements that attempt to access a table which has already been modified in parallel within the same transaction are rejected with an error message.

If a PL/SQL procedure or block is executed in a PARALLEL DML enabled session, then this rule applies to statements in the procedure or block.
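The following sketch illustrates the restriction; the table and hint are illustrative, and EMP is assumed to be a partitioned table so that the update can run in parallel:

ALTER SESSION ENABLE PARALLEL DML;

UPDATE /*+ PARALLEL(emp, 4) */ emp
  SET sal = sal * 1.1;       /* EMP is now modified in parallel in this transaction */

SELECT COUNT(*) FROM emp;    /* rejected with an error: EMP was modified in parallel
                                earlier in the same transaction */

COMMIT;

SELECT COUNT(*) FROM emp;    /* succeeds: the previous transaction has committed */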

Transaction Model for Parallel DML

To execute a DML operation in parallel, the coordinator process acquires or spawns parallel server processes and each parallel server process executes a portion of the work under its own parallel process transaction.

The coordinator also has its own coordinator transaction, which can have its own rollback segment.

Rollback Segments

Oracle assigns transactions to rollback segments that have the fewest active transactions. To speed up both forward and undo operations, you should create and bring online enough rollback segments so that at most two parallel process transactions are assigned to one rollback segment.

Create the rollback segments in tablespaces that have enough space for them to extend when necessary and set the MAXEXTENTS storage parameters for the rollback segments to UNLIMITED.

Two-Phase Commit

A parallel DML operation is executed by more than one independent parallel process transaction. In order to ensure user-level transactional atomicity, the coordinator uses a two-phase commit protocol to commit the changes performed by the parallel process transactions.

This two-phase commit protocol is a simplified version which makes use of shared disk architecture to speed up transaction status lookups, especially during transactional recovery. It does not require the Oracle XA library. In-doubt transactions never become visible to users.

Recovery for Parallel DML

The time required to roll back a parallel DML operation is roughly equal to the time it took to perform the forward operation.

Oracle supports parallel transaction recovery ("undo" recovery) during transaction and process failures, and to a lesser extent during instance and system failures.

To speed up transaction recovery, the initialization parameter CLEANUP_ROLLBACK_ENTRIES should be set to a high value approximately equal to the number of rollback entries generated for the forward-going operation.

Transaction Recovery

A user-issued rollback of a transaction that failed due to a statement error is performed in parallel by the parallel coordinator and the parallel server processes. The rollback takes approximately the same amount of time as the forward transaction.

Process Recovery

Recovery from the failure of a parallel DML coordinator or parallel server process is performed by the PMON process. Because this recovery is not itself performed in parallel, the recovery time for process failures can be longer than the time taken by the original (forward) work.

System Recovery

Recovery from a system failure requires a new instance startup. Recovery is performed by the SMON process. Parallel DML statements are recovered serially, and all resources remain locked until recovery is complete. Recovery can therefore take much longer than the original (forward) transaction if the forward transaction used a high degree of parallelism and did a lot of work.

One way to speed up transaction recovery is to rerun the parallel DML statement. When the new coordinator and parallel server processes encounter the locked resources, they trigger transaction recovery concurrently. After the new parallel processes finish recovering the resources, you can either commit or roll back the transaction.

Instance Recovery (Oracle Parallel Server)

Recovery from an instance failure in an Oracle Parallel Server is performed by the SMON processes of other live instances. Each SMON process of the live instances can recover the parallel coordinator and/or parallel server process transactions of the failed instance independently. If there are more parallel server processes in the failed instance than there are live instances, the recovery time is longer than the forward work by the failed instance.

Space Considerations for Parallel DML

Parallel UPDATE uses the space in the existing object, as opposed to direct-load INSERT which gets new segments for the data.

Space usage characteristics may be different in parallel than they would be if the statement executed sequentially, because multiple concurrent child transactions modify the object.

See "Space Considerations" on page 21-8 for information about space for direct-load INSERT.

Lock and Enqueue Resources for Parallel DML

A parallel DML operation's lock and enqueue resource requirements are very different from the serial DML requirements. Parallel DML holds many more locks, so you should increase the value of the ENQUEUE_RESOURCES and DML_LOCKS parameters.

The processes for a parallel UPDATE, DELETE, or INSERT statement acquire the following locks:

A parallel server process can work on one or more partitions, but a partition can only be worked on by one parallel server process.

For example, for a table with 600 partitions running with parallel degree 100, a parallel DML statement needs the following locks (assuming all partitions are involved in the statement):

    • 1 table lock SX and 600 partition locks X, acquired by the coordinator
    • 100 table locks SX (one per parallel server process)
    • 600 partition locks NULL (one per partition, distributed among the parallel server processes)
    • 600 partition-wait locks X (one per partition, distributed among the parallel server processes)

Table 22-2 summarizes the types of locks acquired by coordinator and parallel server processes for different types of parallel DML statements.

Table 22-2 Locks Acquired by Parallel DML Statements

Parallel UPDATE or DELETE into partitioned table; WHERE clause specifies the partition
  Coordinator process acquires:
    1 table lock SX
    1 partition lock X per partition
  Each parallel server process acquires:
    1 table lock SX
    1 partition lock NULL per partition
    1 partition-wait lock X per partition

Parallel UPDATE, DELETE, or INSERT into partitioned table
  Coordinator process acquires:
    1 table lock SX
    Partition locks X for all partitions
  Each parallel server process acquires:
    1 table lock SX
    1 partition lock NULL per partition
    1 partition-wait lock X per partition

Parallel INSERT into nonpartitioned table
  Coordinator process acquires:
    1 table lock X
  Each parallel server process acquires:
    None

Restrictions on Parallel DML

The following restrictions apply to parallel DML (including direct-load INSERT):

Violations will cause the statement to execute serially without warnings or error messages (except for the restriction on statements accessing the same table in a transaction, which can cause error messages). For example, an update will be serialized if it requires global unique index maintenance.

The following sections give further details about restrictions.

Partitioning Key Restriction

You can only update the partitioning key of a partitioned table to a new value if the update would not cause the row to move to a new partition. This is a general restriction on partitioned tables.

Data Integrity Restrictions

This section describes the interactions of integrity constraints and parallel DML statements.

NOT NULL and CHECK

These types of integrity constraints are allowed. They are not a problem for parallel DML because they are enforced on the column and row level, respectively.

UNIQUE and PRIMARY KEY

Both of these constraints are enforced with unique indexes. An UPDATE command that modifies a unique or primary key index is parallelized only if the index is local.

FOREIGN KEY (Referential Integrity)

There are restrictions for referential integrity whenever a DML operation on one table could cause a recursive DML operation on another table, or when the integrity check would require seeing simultaneously all changes made to the object being modified.

Table 22-3 lists all of the operations that are possible on tables that are involved in referential integrity constraints.

Table 22-3 Referential Integrity Restrictions

DML Statement       Issued on Parent     Issued on Child      Self-Referential
INSERT              (Not applicable)     Not parallelized     Not parallelized
UPDATE No Action    Supported            Supported            Not parallelized
DELETE No Action    Supported            Supported            Not parallelized
DELETE Cascade      Not parallelized     (Not applicable)     Not parallelized

Delete Cascade

Delete on tables having a foreign key with delete cascade is not parallelized because parallel server processes will try to delete rows from multiple partitions (parent and child tables).

Self-Referential Integrity

DML on tables with self-referential integrity constraints is not parallelized if the referenced keys (primary keys) are involved. For DML on all other columns, parallelism is possible.

Deferrable Integrity Constraints

If there are any deferrable constraints on the table being operated on, the DML operation will not be parallelized.

Trigger Restrictions

A DML operation will not be parallelized if any triggers are enabled on the affected tables that may get fired as a result of the statement. This implies that DML statements on tables that are being replicated will not be parallelized.

Relevant triggers must be disabled in order to parallelize DML on the table. Note that enabling/disabling triggers invalidates dependent shared cursors.
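For example, before running a parallel DML batch job against a table that has triggers, you might disable all of its triggers and re-enable them afterwards (the table name is illustrative):

ALTER TABLE sales_history DISABLE ALL TRIGGERS;

/* ... run the parallel DML batch job ... */

ALTER TABLE sales_history ENABLE ALL TRIGGERS;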

Function Restrictions

Only functions that don't read or write database or package state are allowed in parallel DML statements. A DML operation will not be parallelized if the DML statement has embedded functions that either read or write database state or package state.
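For example, a packaged function intended for use in parallel DML statements could assert its purity with the RESTRICT_REFERENCES pragma; the package, function, and purity levels shown are illustrative:

CREATE OR REPLACE PACKAGE score_pkg AS
  FUNCTION credit_score (balance IN NUMBER, late_payments IN NUMBER) RETURN NUMBER;
  /* The function neither reads nor writes database or package state. */
  PRAGMA RESTRICT_REFERENCES (credit_score, WNDS, RNDS, WNPS, RNPS);
END score_pkg;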

Additional Information:

See the description of the pragma RESTRICT_REFERENCES in the Oracle8 Application Developer's Guide.

 

Distributed Transaction Restrictions

A DML operation cannot be parallelized if it is in a distributed transaction or if the DML or the query operation is against a remote object.

Example 1:

In a distributed transaction:

select * from t1@dblink; /* this starts a distributed transaction */ 
delete /*+ parallel (t2,2) */ from t2; /* not parallelized */ 
commit; 

Example 2:

DML operation on a remote object:

delete /*+ parallel (t1,2) */ from t1@dblink; 
/* cannot parallel delete from remote object */ 

Example 3:

DML statement which queries a remote object:

insert /*+ append parallel (t3,2) */ into t3 select * from t4@dblink; 
/* not parallelized because of reference to remote object */ 

Affinity

In a shared-disk cluster or massively parallel processing (MPP) configuration, an instance of the Oracle Parallel Server is said to have affinity for a device if the device is directly accessed from the processor(s) on which the instance is running. Similarly, an instance has affinity for a file if it has affinity for the device(s) that the file is stored on.


Attention:

The features described in this section are available only if you have purchased Oracle8 Enterprise Edition with the Parallel Server Option. See Getting to Know Oracle8 and the Oracle8 Enterprise Edition for information about the features and options available with Oracle8 Enterprise Edition.

 

Determining affinity may involve somewhat arbitrary choices for files that are striped across multiple devices. By convention, an instance is said to have affinity for a tablespace (or a partition of a table or index within a tablespace) if the instance has affinity for the first file in the tablespace.

Oracle considers affinity when allocating work to parallel server processes. The use of affinity for parallel execution of SQL statements is transparent to users.

Affinity and Parallel Queries

Affinity in parallel queries increases the speed of scanning data from disk by doing the scans on a processor that is "near" the data. This can provide a substantial performance increase for machines that do not naturally support shared disks.

The most common use of affinity is for a table or index partition to be stored in one file on one device. This configuration provides the highest availability by limiting the damage done by a device failure and makes best use of partition-parallel index scans.

DSS customers might prefer to stripe table partitions over multiple devices (probably a subset of the total number of devices). This allows some queries to prune the total amount of data being accessed using partitioning criteria and still obtain parallelism through ROWID-range parallel table (partition) scans. If the devices are configured as a RAID, availability can still be very good. Even when used for DSS, indexes should probably be partitioned on individual devices.

Other configurations (for example, multiple partitions in one file striped over multiple devices) will yield correct query results, but you may need to use hints or explicitly set object attributes to select the correct degree of parallelism.

Affinity and Parallel DML

For parallel DML (inserts, updates, and deletes), affinity enhancements improve cache performance by routing the DML operation to the node that has affinity for the partition.

Affinity determines how to distribute the work among the set of instances and/or parallel server processes to perform the DML operation in parallel. Affinity can improve performance of queries in several ways:

  1. For certain MPP architectures, Oracle uses device-to-node affinity information to determine on which nodes to spawn parallel server processes (parallel process allocation) and which work granules (ROWID ranges or partitions) to send to particular nodes (work assignment). Better performance is achieved by having nodes mainly access local devices, giving a better buffer cache hit ratio for every node and reducing the network overhead and I/O latency.
  2. For SMP shared disk clusters, Oracle uses a round-robin mechanism to assign devices to nodes. Similar to item 1, this device-to-node affinity is used in determining parallel process allocation and work assignment.
  3. For SMP, cluster, and MPP architectures, process-to-device affinity is used to achieve device isolation. This reduces the chances of having multiple parallel server processes accessing the same device simultaneously. This process-to-device affinity information is also used in implementing the stealing of work between processes.

For partitioned tables and indexes, partition-to-node affinity information determines process allocation and work assignment. For shared-nothing MPP systems, the Oracle Parallel Server tries to assign partitions to instances taking the disk affinity of the partitions into account. For shared-disk MPP and cluster systems, partitions are assigned to instances in a round-robin manner.

Affinity is only available for parallel DML when running in an Oracle Parallel Server configuration. Affinity information which persists across statements will improve buffer cache hit ratios and reduce block pings between instances.

Additional Information:

See Oracle8 Parallel Server Concepts and Administration for more information about the Oracle Parallel Server.

 

Other Types of Parallelism

In addition to parallel SQL execution, Oracle can use parallelism for the following types of operations:

    • parallel recovery
    • parallel propagation (replication)
    • parallel load (the SQL*Loader utility)

Like parallel SQL, parallel recovery and parallel propagation are executed by a parallel coordinator process and multiple parallel server processes. Parallel load, however, uses a different mechanism.

Additional Information:

See Oracle8 Utilities for information about parallel load and general information about SQL*Loader. Also see Oracle8 Tuning for advice about using parallel load.

 

The behavior of the parallel coordinator and parallel server processes may differ, depending on what kind of operation they perform (SQL, recovery, or propagation). For example, if all parallel server processes in the pool are occupied and the maximum number of parallel server processes has been started:

For a given session, the parallel coordinator coordinates only one kind of operation. A parallel coordinator cannot coordinate, for example, parallel SQL and parallel propagation or parallel recovery at the same time.

See "Performing Recovery in Parallel" on page 28-13 for general information about parallel recovery.

Additional Information:

See Oracle8 Backup and Recovery Guide for detailed information about parallel recovery, and see Oracle8 Replication for information about parallel propagation.

 



