XQuery and XQSE Developer’s Guide

     Previous  Next    Open TOC in new window    View as PDF - New Window  Get Adobe Reader - New Window
Content starts here

XQuery Engine and SQL

This chapter provides an overview of how Oracle Data Service Integrator works with relational data, and describes what happens when a relational data source is imported into Oracle Data Service Integrator.

The chapter also explains how SQL data types are mapped to XQuery data types and describes what happens during runtime after deploying a data-service-enabled application. The chapter further explains how queries are handled and describes the kind of performance you can expect.

This chapter covers the following topics:

Note that while the graphical-user interface tools handle many of the details, SQL developers and application-performance tuning experts should understand how Oracle Data Service Integrator works with relational data so that they can:

Note: For simplicity’s sake, this chapter refers to the XQuery engine throughout when in fact some of the specific functionality is handled by other, ancillary sub-systems (for example, the Data Source API or other system components depicted in the “Oracle Data Service Integrator Components Architecture” figure in the Concepts Guide).

 


Introduction

At the core of Oracle Data Service Integrator is the data processing engine, often referred to as simply the XQuery engine—the robust, enterprise-class implementation of the XQuery language based on the standards listed in “Supported XQuery Specifications” on page 1-3, with additional enhancements as detailed in “Oracle’s XQuery Implementation” on page 2-1.

In addition to compliance with XQuery and XML recommendations, Oracle Data Service Integrator XQuery engine also complies with the ANSI/ISO standard that bridges the SQL and XML worlds (the “SQL/XML (ISO-ANSI Working Draft) XML-Related Specifications” WD 9075-14 (SQL/XML), August, 2002). As a Java application (J2EE server application), Oracle Data Service Integrator uses JDBC to generate SQL queries and submit them to the appropriate RDBMSs that comprise a data service, which means Oracle Data Service Integrator must accommodate differences in both SQL and JDBC, as follows:

Given these factors, Oracle Data Service Integrator provides two different levels of SQL support for relational database management systems (RDBMS): base support and core support, as defined in the next section.

Base and Core RDBMS Support

Oracle Data Service Integrator provides two different levels of support for relational data sources:

How the XQuery Engine Supports SQL Data Sources

Oracle Data Service Integrator supports SQL (relational) data sources throughout the life-cycle of a data services project, from metadata import, through query plan optimization, through runtime execution of queries and delivery of data to an end-user (or other) application. Specifically, the XQuery engine provides:

Metadata and Data Type Mappings Get Stored in Annotated Files

For each of the tables and views whose metadata is imported into Oracle Data Service Integrator (using Import Source Metadata feature of the GUI), two files are generated:

Runtime Connection Management—Connection Sharing

At runtime, the XQuery engine:

Database connections (connection pools) are registered in the JNDI (Java naming and directory interface) tree of the WebLogic Server (an administrator with privileges on the server can configure connection pool, data source, and JNDI name by which connection pools are accessible).

When sub-plan execution completes, connections are typically not released back to the WebLogic Server. The XQuery engine holds the connection for the duration of the entire XQuery — not just the duration of the SQL — enabling subsequent queries to the same relational data source to be executed using an already obtained connection (which also improves performance). Whether the XQuery engine can share connections or not depends on the underlying data source and JDBC driver (see Table 3-2).

If the data source RDBMS or JDBC driver does not support connection sharing, and if the Oracle Data Service Integrator has opened multiple connections to the same data source, the XQuery engine keeps the initial connection to a data source open during XQuery execution but releases any subsequent connections to the same data source once the SQL result is received in its entirety by the XQuery engine. The initial connection will be re-used subsequent SQL queries when the connection becomes available.

Table 3-2 Runtime Connection Management
RDBMS
Support
Base RDBMS
No connection sharing.
IBM DB2/NT 8 (and higher)
Microsoft SQL Server 2000 (and higher)
Oracle 8.1.x, 9.x, 10.x
Sybase Adaptive Server Enterprise 12.5.2 (and higher)
Single shared connection for each JNDI data source; each connection supports multiple active SQL queries.
PointBase 5.1
Teradata V2R5
No connection sharing. Each access requires dedicated connection.

 


XQuery-SQL Data Type Mappings

XQuery-SQL data type mappings are specific to the RDBMS version and the JDBC driver, as discussed in Base and Core RDBMS Support. The specific data type mappings for each core RDBMS and the general mappings for any base RDBMS are detailed in the “XQuery-SQL Mapping Reference.” However, XQuery and SQL differ in some respects that may affect XQuery-to-SQL translation; these differences apply to all RDBMSs:

Date and Time Data Type Differences: Timezones and Time Precision

The XQuery language defines richer data types than SQL for handling date and time information (temporal data). These data types provide more information (timezone data, for instance) or greater degree of precision (unlimited number of fractional seconds as part of a time or date, for example). The three built-in XQuery data types for data and time information are:

Minimally, every RDBMS has a single datatype that conveys both date and time data. This datatype maps to XQuery’s xs:dateTime data type. Some RDBMSs offer additional SQL data types for storing date and time data separately (see Table 3-3)

(Of all the RDBMSs supported by Oracle Data Service Integrator, only Oracle 9.x (and higher) offers data types with timezone data (TIMESTAMP WITH TIMEZONE, TIMESTAMP WITH LOCAL TIMEZONE).

Table 3-3 Temporal Data Type Mappings
 
xs:date
xs:dateTime
xs:time
Base RDBMS
Reported by JDBC driver for the specific RDBMS.
IBM DB2/NT 8
DATE
TIMESTAMP
TIME
Microsoft SQL Server 2000
 
DATETIME1, SMALLDATETIME2
 
Oracle 8.1.x
 
DATE3
 
Oracle 9.x, 10.x
 
DATE, TIMESTAMP, TIMESTAMP WITH LOCAL TIMEZONE, TIMESTAMPWITH TIMEZONE
 
PointBase 5.1
DATE
TIMESTAMP
TIME
Sybase Adaptive Server Enterprise 12.5.2 (and higher)
DATE
SMALLDATETIME,2 DATETIME1
TIME

1Supports fractional seconds up to 3 digits (miliseconds).

2Accuracy of 1 minute.

3Provides both date and time data, but supports neither fractional seconds nor timezone data (fractional-second data is truncated).

Oracle Data Service Integrator XQuery engine maps all SQL date and time data types to XQuery data types (for example, during metadata import of a new data source) without loss of data or precision.

However, the converse is not true: depending on the specific RDBMS (and JDBC driver) for a specific data source, the XQuery engine may need to perform additional processing to minimize data loss and to handle the timezone information when mapping XQuery temporal data types to SQL.

How Oracle Data Service Integrator Handles Timezone Information

When a query is being pushed down to an RDBMS that does not support timezone data, the Oracle Data Service Integrator XQuery engine converts date and time data into the local time of the underlying application server and removes the timezone information. The conversion occurs each time a date or time value that includes timezone data is sent to the data source, as follows:

How Oracle Data Service Integrator Handles Fractional Seconds

The XQuery language supports unlimited precision for fractional seconds, while the Oracle Data Service Integrator XQuery engine supports up to 7 digits only (for fractional seconds). However, depending on the specific RDBMS, fractional second support may be far less than 7 digits. Or there may be no fractional second support at all (Oracle 8.1.x, for example). In translating from XQuery to SQL, Oracle Data Service Integrator truncates fractional seconds to the precision supported by that RDBMS.

For example, since Microsoft’s DATETIME data type supports up to 3 digits (milliseconds) for fractional time precision, when Oracle Data Service Integrator sends a datetime value to Microsoft SQL Server 2000, the value is first converted into the local time zone and then any fractional seconds are converted to the 3-digit-milliseconds allowed.

If fractional-second-precision is required (but the data source does not support it appropriately), use the fn-bea:fence() function to disable pushdown of date and time data types and operations, so that the XQuery engine processes the time- and date-related queries. (See Preventing SQL Pushdown on page 3-33 for more information.)

See “XQuery-SQL Mapping Reference” for more information about time and date data types for core and base RDBMS.

Scope Differences for Expressions and Data Types

The XQuery language is less restrictive than the SQL language in terms of the scope of expressions and data types. For example, for most all RDBMSs, an SQL query that returns a boolean can only be used inside a WHERE clause. XQuery does not have such restrictions, and as a result, in some cases, valid XQuery expressions cannot be pushed down. Expressions and data types that cannot be pushed include:

 


SQL Pushdown: Performance Optimization

Oracle Data Service Integrator achieves optimal performance for queries by performing SQL pushdown. Pushdown is an optimization technique that offloads processing from the XQuery engine by sending native SQL queries to the data source so that minimal result sets necessary to answer the query get processed by the XQuery engine.

SQL pushdown reduces the amount of data transported and processed by Oracle Data Service Integrator XQuery processing engine. This technique dramatically improves overall performance, especially when joining tables.

For example, a JOIN operation on two tables can be done by the underlying RDBMS, returning only the final result, rather than delivering all the data to the XQuery engine for processing the JOIN condition. Sorting criteria are also handled by the data source, eliminating the need to re-sort the data inside the XQuery engine.

For all core RDBMSs, the XQuery engine identifies the XQuery constructs and operations that can be translated into equivalent SQL operations. These include:

Not all queries can (or should) get pushed down. The XQuery engine does not pushdown:

The remainder of this section covers SQL pushdown in more detail, providing syntax samples based on the table structures shown in Figure 3-4. (For ease of reading, namespace references are not shown in the example queries.) In some cases, the query may not get pushed down as SQL, but the fragments of the query — names of columns, for example — may get pushed to the project list.

Figure 3-4 Table Structures for SQL Pushdown Examples

Table Structures for SQL Pushdown Examples

Function and Operator Pushdown

XQuery functions and operators are translated into SQL only when:

Table 3-5 shows an XQuery statement and its corresponding “pushdown” or SQL translation. (Oracle syntax is used.)

Table 3-5 Function Pushdown Example
XQuery Statement
SQL Translation (Oracle Syntax)
for $c in CUSTOMER()
return lower-case($c/LAST_NAME)
SELECT LOWER(t1."LAST_NAME") AS c1
FROM "CUSTOMER" t1

If some arguments to a function or operator are not directly pushable, but can be replaced with parameters, the XQuery engine will replace the arguments with parameters and pushdown the SQL. For example, since the XQuery’s string-join() function has no explicit SQL equivalent, it is replaced with a parameter (see Table 3-6). 

Table 3-6 External Variable Pushdown
XQuery Statement
SQL Statement
declare variable $p as xs:string external;
...
for $c in CUSTOMER()
where starts-with($c/LAST_NAME, string-join( ("a", "b"), $p ))
return $c/FIRST_NAME
SELECT t1."FIRST_NAME" AS c1
FROM "CUSTOMER" t1
WHERE t1."LAST_NAME" LIKE ?

Aggregate Functions

Oracle Data Service Integrator translates XQuery 1.0 and XPath 2.0 aggregate functions into corresponding SQL aggregate functions (Table 3-7).  

Table 3-7 Aggregate Functions
XQuery Aggregate Function
SQL Aggregate Function
fn:avg()
AVG()
fn:count()
COUNT()
fn:max()
MAX()
fn:min()
MIN()
fn:sum()
SUM()
fn:count(fn:distinct-values()
COUNT(DISTINCT …)

Note that the distinct-values() XQuery aggregate function in conjunction with the fn:count() function is further translated into an SQL COUNT(DISTINCT...) operation, as shown in Table 3-8. See Grouping and Aggregation for some examples of how aggregate functions in conjunction with other expressions affect the outcome of SQL pushdown.

Parameters in Generated SQL Statements

The Oracle Data Service Integrator XQuery engine generates parameters from variables, functions, operators, and cast operations as needed for use by the SQL engine. If all arguments to a function are parameters, the entire function gets pushed as a parameter.

The functions that can be pushed down depend on the database. See the “XQuery-SQL Mapping Reference” on page 7-1 for details.

Cast Operation Pushdown

As with functions and operators, support for cast operation pushdown is RDBMS-specific, although cast pushdown is available only for core (not base) RDBMSs. The XQuery engine can pushdown cast operations if the data source RDBMS:

Table 3-8 shows an example of how a cast in XQuery would get pushed down to a Microsoft SQL Server 2000 data source. 

Table 3-8 Cast Operation Pushdown
XQuery Statement
SQL Statement (Microsoft SQL Server 2000 Syntax)
for $c in CUSTOMER()
where xs:string($c/ZIP_CODE) eq "95131"
return $c/CUSTOMER_ID
SELECT t1."CUSTOMER_ID" AS c1
FROM "CUSTOMER" t1
WHERE CAST(t1."ZIP_CODE" AS VARCHAR) = '95131'

Path Expressions Pushdown

The XQuery engine maps table columns to XML elements that are children of the corresponding row elements. Simple XQuery path expressions are recognized by the XQuery engine as column accessors. For example, $c/ZIP_CODE and $c/LAST_NAME (see Table 3-10) provide access to ZIP_CODE and LAST_NAME columns.

Constant Pushdown

The Oracle Data Service Integrator XQuery engine translates XQuery constants into SQL constants only if the data source has an equivalent SQL data type. Table 3-9 shows an example of a constant used in a FLWOR expression and how that constant gets translated in the SQL statement.

Table 3-9 SQL Pushdown for Constants
XQuery Statement
SQL Statement
for $c in CUSTOMER()
where $c/ZIP_CODE eq 95131
return $c/LAST_NAME
SELECT t1."LAST_NAME" AS c1
FROM "CUSTOMER" t1
WHERE t1."ZIP_CODE" = 95131

Variable Pushdown

Both external and internal variables in XQuery expressions can be translated into SQL parameters (in generated SQL statements) when the variable’s datatype is supported by the XQuery engine and:

Table 3-10 shows an example of variable pushdown.

Table 3-10 Variable Pushdown
XQuery Statement
SQL Statement
declare variable $extVar
as xs:string external;
for $c in CUSTOMER()
where $c/CUSTOMER_ID eq $extVar
return $c/LAST_NAME
SELECT t1."LAST_NAME" as c1
FROM "CUSTOMER" t1
WHERE t1."CUSTOMER_ID" = ?

Common Query Patterns

For each relational data source, the precise set of expressions pushed down depends on the capabilities of the underlying RDBMS; for details, see XQuery Engine and SQL.

Simple Projection Queries

Each of the example XQueries shown in Table 3-11 returns elements containing values of LAST_NAME columns from a CUSTOMER table. In all cases, the SQL statement generated by the XQuery engine is the same (see Table 3-11).

Table 3-11 Projection Query
XQuery Statements
SQL Statement
for $c in CUSTOMER() return $c/LAST_NAME
SELECT t1."LAST_NAME" AS 
c1 FROM "CUSTOMER" t1
CUSTOMER()/LAST_NAME
for $c in CUSTOMER() return data($c/LAST_NAME)
data(CUSTOMER()/LAST_NAME)

The difference between the first two queries and the last two queries is that the fn:data() function is used in the query to limit the results to values only. Without the fn:data() function, the result is a list of <LAST_NAME> elements containing corresponding column values. If a column value is NULL, the element is skipped. With the fn:data() function, the result is the actual values.

Where Clause Pushdown

An XQuery where clause is usually translated into an SQL WHERE clause. An XQuery where clause gets pushed down as SQL when:

Table 3-12 shows an example of a where clause pushdown.

Table 3-12 Where Clause Pushdown
XQuery Statements
SQL Statements
for $c in CUSTOMER()
where $c/CUSTOMER_ID eq “CUSTOMER01”
return $c/LAST_NAME
SELECT t1.”LAST_NAME” AS c1
FROM “CUSTOMER” t1
WHERE t1.”CUSTOMER_ID” = ‘CUSTOMER01’
for $c in CUSTOMER()
where year-from-dateTime($c/BIRTH_DAY)
eq
year-from-date(current-date())
return
$c/LAST_NAME
(DB2 syntax)
SELECT t1.”LAST_NAME” AS c1
FROM “CUSTOMER” t1
WHERE
YEAR(t1.”BIRTH_DAY”) = ?

However, note that if the WHERE clause follows a group by clause, the WHERE clause is translated into a HAVING clause. See Group-By with a Nested Where Clause Translates to SQL HAVING Clause).

Order By Clause Pushdown

An XQuery order by expression comprises:

The XQuery engine can pushdown SQL for ordering expressions, including properties, only when the ordering expression:

Table 3-13 shows an example of an order by clause pushdown.

Table 3-13 Order By Pushdown
XQuery Statement
SQL Statement
for $c in CUSTOMER()
 order by $c/CUSTOMER_ID  descending
return $c/CUSTOMER_ID
SELECT t1."CUSTOMER_ID" AS c1
FROM "CUSTOMER" t1
ORDER BY t1."CUSTOMER_ID" DESC

Table 3-14 shows an example of the SQL pushdown that occurs when ordering by a NULLable column (ADDRESS2) in the XQuery clause when the RDBMS supports dynamic setting of NULL order.

Table 3-14 Order By Query When Setting NULL Order Dynamically
XQuery Statement
SQL Statement (Oracle Syntax)
for $c in CUSTOMER()
order by $c/ADDRESS2 ascending
empty greatest
return $c/CUSTOMER_ID, $c/ADDRESS2
SELECT t1."CUSTOMER_ID" AS c1,
t1."ADDRESS2" AS c2
FROM "CUSTOMER" t1
ORDER BY t1."ADDRESS2" ASC NULLS LAST

If the data source RDBMS does not support the required empty (NULL) order, the order by will not be pushed down.

As another optimization, the Oracle Data Service Integrator XQuery engine can insert order by clauses into generated SQL statements—even when the original XQuery statement does not include them—to offload expensive sorting operations to the RDBMS. They are automatically inserted by the XQuery optimizer prior to execution. You can see these as well in the Query Plan View.

Inner Join Pushdown

Joining data from multiple sources is a very common data integration task. In SQL terms, an inner join relates each row in one table (or view) to one or more corresponding rows in another table or view. In XQuery, an inner join is expressed as a FLWR expression comprising several for clauses that iterate over the data sources, where clauses that specify the join predicates, and a return clause returning data values.

If two relational sources are located in the same database, the inner join can sometimes be pushed down as a single SQL statement using either SQL-92 or SQL-89 syntax, depending on the RDBMS of the data source.

An inner join can be pushed down when:

Although the example in Figure 3-15 shows a simple inner join between two branches, the XQuery engine also supports n-way joins, with each branch comprising a different for statement. See also Table 3-16.

Table 3-16 Rendering of XQuery Inner-Join as SQL-92 and SQL-89 Syntax
SQL-92 Syntax
SQL-89 Syntax
SELECT t1."LAST_NAME" AS c1, t2."ORDER_ID" AS c2
FROM "CUSTOMER" t1 JOIN "CUST_ORDER" t2
ON t1."CUSTOMER_ID" = t2."CUSTOMER_ID"
SELECT t1."LAST_NAME" AS c1, t2."ORDER_ID" AS c2
FROM "CUSTOMER" t1, "CUST_ORDER" t2
WHERE t1."CUSTOMER_ID" = t2."CUSTOMER_ID"

Outer Join Pushdown

The XQuery engine interprets nested FLWR expressions (see Figure 3-17) as an outer join and can generate SQL for a data source when:

The SQL code generated by the XQuery engine depends on the SQL dialect supported by the source database (see “XQuery-SQL Mapping Reference” for details). Table 3-18 shows example SQL-92 and proprietary syntax for the query shown in Figure 3-17.

Table 3-18 SQL-92 and Proprietary Outer Join Syntax Comparison
SQL-92 Syntax
Oracle 8 Syntax
SELECT t1."LAST_NAME" AS c1, t2."ORDER_ID" AS c2
FROM "CUSTOMER" t1 OUTER JOIN "CUST_ORDER" t2
ON t1."CUSTOMER_ID" = t2."CUSTOMER_ID"
SELECT t1."LAST_NAME" AS c1, t2."ORDER_ID" AS c2
FROM "CUSTOMER" t1, "CUST_ORDER" t2
WHERE t1."CUSTOMER_ID" = t2."CUSTOMER_ID" (+)

Variations of the outer-join pattern are obtained from the original query by using equivalent XQuery expressions. Figure 3-19 is an example of a query equivalent to that shown in Figure 3-17 that will also result in a SQL statement with an outer join.

Figure 3-19 Outer Join Pattern

Outer Join Pattern

Semi-Joins and Anti-Semi-Joins

A semi-join returns data from a single branch of the join condition, when the join condition is satisfied. An anti-semi-join returns data from a single branch when the join condition is false. Although the XQuery language does not have specific constructs for semi-joins and anti-semi-joins, the XQuery engine translates several specific FLWR patterns into SQL semi-join or anti-semi-join patterns, assuming that:

The XQuery interprets a FLWR query containing an inner existential quantified expression as a semi-join, translating the expression into an SQL query with the EXISTS check in the WHERE clause.

Universal quantified expressions are also supported, but their SQL generation is slightly more complicated. The XQuery engine translates FLWRs with exist() or empty() predicates in the where clause into semi-joins. Table 3-20 shows several examples of such patterns.

Table 3-20 Various XQuery Patterns that Can Generate Semi-Join and Anti-Semi-Join SQL
 
XQuery Statement
SQL Statement
FLWR with existential (“some”) quantifier [semi-join]
for $customer in CUSTOMER()
where
some $c_order in CUST_ORDER()
satisfies ($customer/CUSTOMER_ID eq $c_order/ORDER_ID)
and
($c_order/STATUS eq "OPEN")
return
$customer/CUSTOMER_ID
SELECT t1."CUSTOMER_ID" AS c1
FROM "CUSTOMER" t1
WHERE EXISTS(
SELECT 1
FROM "CUST_ORDER" t2
WHERE t1."CUSTOMER_ID" = t2."CUSTOMER_ID" AND t2."STATUS" = 'OPEN'
)
FLWR with negation of existential quantifier [anti-semi join]
for $customer in CUSTOMER()
where not(
some $c_order in CUST_ORDER()
satisfies ($customer/CUSTOMER_ID eq $c_order/ORDER_ID)
and
($c_order/STATUS eq "OPEN")
)
return
$customer/CUSTOMER_ID
SELECT t1."CUSTOMER_ID" AS c1
FROM "CUSTOMER" t1
WHERE NOT EXISTS(
SELECT 1
FROM "CUST_ORDER" t2
WHERE t1."CUSTOMER_ID" = t2."CUSTOMER_ID" AND t2."STATUS" = 'OPEN'
)
FLWR with universal (“every”) quantified expression
for $customer in CUSTOMER()
where
every $c_order in CUST_ORDER()
satisfies ($customer/CUSTOMER_ID eq $c_order/ORDER_ID) and
($c_order/STATUS eq "OPEN")
return
$customer/CUSTOMER_ID
SELECT t1."CUSTOMER_ID" AS c1
FROM "CUSTOMER" t1
WHERE NOT EXISTS(
SELECT 1
FROM "CUST_ORDER" t2
WHERE NOT(t1."CUSTOMER_ID" = t2."CUSTOMER_ID" AND t2."STATUS" = 'OPEN')
)
FLWR with exists() predicate
or $customer in CUSTOMER()
where exists(
for $c_order in CUST_ORDER()
where ($customer/CUSTOMER_ID eq $c_order/ORDER_ID) and
($c_order/STATUS eq "OPEN")
return $c_order
)
return
$customer/CUSTOMER_ID
SELECT t1."CUSTOMER_ID" AS c1
FROM "CUSTOMER" t1
WHERE EXISTS(
SELECT 1
FROM "CUST_ORDER" t2
WHERE t1."CUSTOMER_ID" = t2."CUSTOMER_ID" AND t2."STATUS" = 'OPEN'
)
FLWR with empty() predicate
for $customer in CUSTOMER()
where empty(
for $c_order in CUST_ORDER()
where ($customer/CUSTOMER_ID eq $c_order/ORDER_ID) and
($c_order/STATUS eq "OPEN")
return $c_order
)
return
$customer/CUSTOMER_ID
SELECT t1."CUSTOMER_ID" AS c1
FROM "CUSTOMER" t1
WHERE NOT(EXISTS(
SELECT 1
FROM "CUST_ORDER" t2
WHERE t1."CUSTOMER_ID" = t2."CUSTOMER_ID" AND t2."STATUS" = 'OPEN'
))

Grouping and Aggregation

The XQuery engine supports several patterns for group by pushdown and aggregate function pushdown.

Group By Pushdown

The Group By clause is a Oracle extension to the XQuery language (see “Generalized FLWGOR (group by)” on page 2-52 for more information). The XQuery engine implicitly adds a group by expression to some patterns to enable more efficient pushdown and query execution.

Figure 3-21 XQuery Containing a Group By

XQuery Containing a Group By

The XQuery engine translates group-by clauses into equivalent SQL GROUP BY clauses if:

Since the query shown in Figure 3-21 meets these requirements, the following SQL statement is generated:

SELECT t1."CATEGORY" AS c1, COUNT(*) AS c2
FROM "PRODUCT" t1
GROUP BY t1."CATEGORY"

The group-by pushdown is closely related to the Distinct-by Pushdown. When a group-by clause does not include a partition variable, the XQuery engine generates SQL that includes the DISTINCT keyword, as described in the next section.

Distinct-by Pushdown

An XQuery containing a Group By clause (without a partition definition), can be generated into SQL query that uses SQL’s DISTINCT keyword to eliminate duplicates in the result. For example, the XQuery statement in Table 3-22 uses a group-by clause but has no partition defined, and the SQL statement created by Oracle Data Service Integrator refines the result by using the DISTINCT keyword. 

Table 3-22 Distinct By Pushdown
XQuery Statement
SQL Statement
for $product in PRODUCT()
group by $product/CATEGORY_ID as $category
return $category
SELECT DISTINCT t1."CATEGORY_ID" AS c1
FROM "PRODUCT" t1

Simple Aggregate Pattern

An aggregate function operating on a single column from a data source is one of the simplest aggregate patterns that the XQuery engine supports, although it does so in a slightly non-intuitive way. It uses a constant as a single grouping expression (...GROUP ...BY n). The XQuery engine can pushdown the SQL if the RDBMS supports either a GROUP BY operation on a constant or supports sub-queries in the sub-clause (see Table 3-23). 

Table 3-23 Aggregate Pushdown
XQuery Statement
SQL Statement1
SQL Statement2
for $product in PRODUCT()
group $product
as $price_group
by 1
return min($price_group/
LIST_PRICE)
SELECT MIN(t1."LIST_PRICE") AS c1
FROM "PRODUCT" t1
GROUP BY 1
SELECT MIN(t2.c2) AS c3
FROM (
SELECT 1 AS c1, t1."LIST_PRICE" AS c2
FROM "PRODUCT" t1
) t2
GROUP BY t2.c1

1RDBMS supports GROUP BY constant

2RDBMS does not support GROUP BY, but does support sub-queries in the FROM clause

Group-By with a Nested Where Clause Translates to SQL HAVING Clause

If a relational data source supports nested WHERE clauses, the XQuery engine can translate a where clause after a group-by clause into a SQL HAVING clause (see Table 3-24), provided that the where clause meets other requirements for XQuery-SQL translation.

Table 3-24 Nested WHERE Clauses
XQuery Statement
SQL Statement
for $product in PRODUCT()
group $product/LIST_PRICE as $price_group
by $product/CATEGORY as $category
where max($price_group) gt 1000
return
<t>
{
$category,
min($price_group)
}
</t>
SELECT t1."CATEGORY" AS c1, MIN(t1."LIST_PRICE") AS c2
FROM "PRODUCT" t1
GROUP BY t1."CATEGORY"
HAVING MAX(t1."LIST_PRICE") > 1000

Outer Join with Aggregate Pattern

Another common pattern supported by the XQuery engine is outer join with aggregation of the right branch, which is expressed in XQuery as nested FLWR expressions with aggregate functions in the inner level (Table 3-25).

Table 3-25 Outer Join with Aggregate
XQuery Statement
SQL Statement
for $customer in CUSTOMER()
return
<customer>
<name>{ data($customer/LAST_NAME) }</name>
<order-amount>
{
sum(
for $c_order in CUST_ORDER()
where $customer/CUSTOMER_ID eq $c_order/CUSTOMER_ID
return $c_order/ORDER_AMOUNT
)
}
</order-amount>
</customer>
SELECT t1."LAST_NAME" AS c1, SUM(t2."ORDER_AMOUNT") AS c2
FROM "CUSTOMER" t1
LEFT OUTER JOIN "CUST_ORDER" t2
ON (t2."CUSTOMER_ID" = t1."CUSTOMER_ID")
GROUP BY t1."CUSTOMER_ID"

With this type of query, in order to fully push as much of the query as possible to the data source RDBMS, the XQuery engine evaluates the outer join first and then performs the group-by on the left branch’s primary key column, to compute the aggregate. The XQuery engine can perform this optimization only if the left branch of the query has a key column. As shown in Table 3-26, the CUSTOMER does, so the optimization will be performed.

The net effect is that only the XML creation is performed in the XQuery engine.

If-Then-Else Pattern

The CASE expression, introduced in SQL:1992, provides a way to use if-then-else logic in SQL statements without having to invoke procedures. The CASE expression correlates a list of values and alternatives.

An XQuery if-then-else pattern can be translated into an SQL CASE expression if:

The then and else expressions can contain (or fully consist of) parameters. If the if-then-else expression does not depend on the data source, the entire expression is pushed as a parameter.

An example can be seen in Table 3-26.

Table 3-26 If-Then-Else Pushdown
XQuery Statement
SQL Statement
for $i in CUST_ORDER()
return
if ($i/STATUS eq "SHIPPED")
then data($i/STATUS)
else data($i/CUSTOMER_ID)
SELECT
CASE WHEN (t1."STATUS" = 'SHIPPED')
THEN t1."STATUS"
ELSE t1."CUSTOMER_ID" END AS c1
FROM "CUST_ORDER" t1

Subsequence Pushdown

In the typical RDBMS application, it is quite common to paginate the results — output just 20 customer records per page, for example, for printing or other purposes. XQuery meets this need with its subsequence( ) function. XQuery provides two different subsequence functions, shown in Table 3-27 and in Table 3-28.

Table 3-27 Two- and Three-Argument Variants of XQuery Subsequence Function
Two-argument variant
Three-argument variant
fn:subsequence(
 $sourceSeq as item()*,
 $startingLoc as xs:double
) as item()*
fn:subsequence(
 $sourceSeq as item()*,
 $startingLoc as xs:double,
 $length as xs:double
) as item()*

Table 3-28 Subsequence Pushdown
XQuery Statement
SQL Statement (DB2)
let $s :=
  for $i in t2:PRODUCT()
 order by $i/LIST_PRICE descending
 return $i
for $p in subsequence($s, 1, 10)
return <product>
  <name>
   { data($p/PRODUCT_NAME) } </name>
  <price>
   { data($p/LIST_PRICE) }
  </price>
</product>
};
SELECT t3.c1, t3.c2 FROM(
 SELECT ROW_NUMBER() OVER()
     as c3, t2.c1, t2.c2
  FROM(
   SELECT t1.”LIST_PRICE” as c1,
    t1.“PRODUCT_NAME” as c2
    FROM “RTLALL”.”PRODUCT” t1
    ORDER BY t1.”LIST_PRICE” DESC
   )t2
  )t3
WHERE(t3.c3 <11)

The two-argument variant returns the remaining items of an input sequence, starting from the $startingLoc. The three-argument variant returns $length items of the input sequence starting from the $startingLoc. Table 3-29 shows several different examples of the subsequence function in the context of specific queries.

Table 3-29 Examples of XQuery Expressions using Subsequence Function  
Query statement
XQuery Expression
Return the 10 most expensive products only.
let $s :=
  for $i in PRODUCT()
  order by $i/LIST_PRICE descending
  return $i
for $p in subsequence($s, 1, 10)
return <product>
  <name> { data($p/PRODUCT_NAME) } </name>
  <price> { data($p/LIST_PRICE) } </price>
</product>
Return all service cases opened against each of the 10 most expensive products (outer join).
let $s :=
  for $i in PRODUCT()
  order by $i/LIST_PRICE descending
  return $i
for $p in subsequence($s, 1, 10)
return <product>
<name> { data($p/PRODUCT_NAME) } </name>
{
  for $sc in SERVICE_CASE()
  where $p/PRODUCT_ID eq $sc/PRODUCT_ID and
    $sc/STATUS = ‘Open’
  return <case>{ data($sc/CASE_ID) }</case>
}
</product>
Return the total number of service cases opened against each of the 10 most expensive products (aggregation).
let $s :=
  for $i in PRODUCT()
  order by $i/LIST_PRICE descending
  return $i
for $p in subsequence($s, 1, 10)
return
<product>
<name> { data($p/PRODUCT_NAME) } </name>
{
 let $scs :=
  for $sc in SERVICE_CASE()
  where $p/PRODUCT_ID eq $sc/PRODUCT_ID and $sc/STATUS = ‘Open’
  return $sc
 return <case_count>{ count($scs) }</case_count>
}
</product>

An XQuery subsequence pattern can be translated into an SQL subsequence expression if:

Oracle Data Service Integrator can pushdown the subsequence pattern to the underlying RDBMS, thereby enhancing performance, as long as the underlying RDBMS supports it.

Note: Subsequence pushdown is not supported for PointBase, Sybase, or any base RDBMS (see “XQuery-SQL Mapping Reference” on page 7-1 for other core and base RDBMS information.)

Direct SQL Data Services and Pushdown

Oracle Data Service Integrator lets you create data services not only from relational tables and views, but also from SQL queries. These direct SQL data services, as they are called, can also be composed by the XQuery engine, and pushed down as native SQL to the target RDBMS, if:

If the RDBMS does not support sub-queries (the FROM clause), the pushdown will not occur.

For example, a user-defined SQL query, “recent_order” is configured as a relational source:

SELECT * from RECENT_ORDER

The XQuery that gets created in the data service and the resulting generated SQL that gets pushed down by the XQuery engine are shown in Table 3-30.  

Table 3-30 Direct SQL Data Service Example
XQuery Statement
SQL Statement
declare variable
$external_variable as xs:string external;
for $recent_order in RECENT_ORDER()
where $recent_order/ORDER_ID eq $external_variable
return $recent_order/ORDER_AMOUNT
SELECT t1."ORDER_AMOUNT" AS c1
FROM (
SELECT * FROM RECENT_ORDER
) t1
WHERE t1."ORDER_ID" = ?

SQL pushdown on top of direct SQL is not limited to simple select-project queries. Any operation for which pushdown is supported for table and view sources is also supported for data services created for direct SQL queries. For example, Table 3-31 shows a join query and its generated result. 

Table 3-31 Direct SQL Data Service with Join Condition
XQuery Statement
SQL Statement
for $customer in CUSTOMER()
for $recent_order in RECENT_ORDER()
where $customer/CUSTOMER_ID eq $recent_order/CUSTOMER_ID
return
<t>{ $customer/CUSTOMER_ID, $recent_order/ORDER_ID }</t>
SELECT t1."CUSTOMER_ID" AS c1, t2."ORDER_ID" AS c2
FROM "CUSTOMER" t1
JOIN (
SELECT * FROM RECENT_ORDER
) t2
ON t1."CUSTOMER_ID" = t2."CUSTOMER_ID"

Distributed Query Pushdown

Oracle Data Service Integrator uses SQL pushdown to off-load query processing to the underlying data source RDBMS whenever possible. However, as mentioned in How the XQuery Engine Supports SQL Data Sources, SQL pushdown is not always possible, nor beneficial. For example, when two data sources are running on two different systems, or when a query combines relational data with non-relational data, SQL pushdown may not provide any performance benefit.

In cases such as these, Oracle Data Service Integrator uses special techniques to batch-process the outside portion of a query (the left branch) and send a cluster (or chunk) of data to the right branch as parameters (see Table 3-32).

The XQuery engine chooses this optimization technique (a “clustered parameter passing join,” also known as PPK) for a distributed query when:

Unless all these conditions are met, the XQuery engine cannot use this optimization technique but will instead use the single parameter join instead (PP1 join).

 


Preventing SQL Pushdown

Developers can exercise control over SQL pushdown by using the fn-bea:fence() function (an Oracle extension to XQuery functions and operations) to demarcate sections of XQuery code that the XQuery engine should ignore when it is evaluating query fragments for SQL pushdown.

For the example shown in Table 3-33, even though the upper-case function could be pushed down to the RDBMS, its pushdown is blocked by the fence() function and the upper-case function will be executed by the XQuery engine. Only the fragment comprising the lower-case function is included in the query plan as SQL pushdown. The result of the SQL will be returned to the XQuery engine, which will use the XQuery upper-case function on the result.

Table 3-33 Using the fn-bea:fence() Function
XQuery Statement
SQL Statement
for $c in CUSTOMER()
return
upper-case(
fn-bea:fence(
lower-case( $c/LAST_NAME )
)
)
SELECT LOWER(t1."LAST_NAME") AS c1
FROM "CUSTOMER" t1

Use the fence() function whenever you want SQL to be sent as is, to the RDBMS. For example, if you are accessing an Oracle 8.5.x RDBMS that uses hints and Oracle’s rule-based optimizer, you should send the hinted SQL queries to the data source by wrapping them in the fence() function.

To circumvent SQL pushdown for specific clauses, extract those clauses into separate FLWOR expressions with the fence( ) function at the top of the clause, as shown here:

for $x in 
fn-bea:fence
 (
  for $c in CUSTOMER()
  return $c/LAST_NAME
 )
order by $x
return $x

As you develop data services that use relational data sources, use the Oracle Data Service Integrator Query Plan View to see the results of using the fence( ) function (Figure 3-34). In this example, the order by clause will be executed by the XQuery engine rather than pushed down as SQL.

Figure 3-34 Example of an XQuery Plan without (l) and with (r) the fn-bea:fence() Function

Note that the red triangles displayed in the SQL portions of Figure 3-34 are alerts calling attention to the fact that a where clause is missing from the XQuery statement.


  Back to Top       Previous  Next