Oracle® Warehouse Builder User's Guide
10g Release 1 (10.1)

Part Number B12146-02

8 Using Mapping Operators

This chapter provides details on how to use operators in a mapping to transform data and how to use the Expression Builder to create expressions. Operators are listed alphabetically.

This chapter includes the following topics:


Using the Expression Builder

Some of the data flow operators described in this chapter require that you create expressions. An expression is a statement or clause that transforms data or specifies a restriction. These expressions are portions of SQL that are used inline as part of a SQL statement. Each expression belongs to a type that is determined by the role of the data flow operator. You can create expressions using Expression Builder, or by typing them into the expression field located in the operator or attribute property windows.
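For example, a hypothetical expression for an output attribute might combine input attributes with SQL functions. The INGRP1 attribute names here are illustrative only:

    UPPER(INGRP1.LAST_NAME) || ', ' || INITCAP(INGRP1.FIRST_NAME)

This fragment is used inline in the generated SELECT list, exactly as you would write it in a SQL statement.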

Opening the Expression Builder

You can open the Expression Builder from the operator properties window of operators such as the Filter, Joiner, Splitter, and Aggregator operators.

You can open the Expression Builder from the attribute properties window of operators such as the Expression, Data Generator, and Constant operators.

To open the Expression Builder:

  1. From the operator properties window or the attribute property window, click the ... button in the expression field. Figure 8-1 shows the attribute properties window.

    Figure 8-1 Attribute Property Window


    The Expression Builder displays as shown in Figure 8-2.

    Figure 8-2 Expression Builder Interface

  2. Create an expression by:

    • Typing text into the expression field.

    • Dragging items from the Inputs and Transformations tabs on the left panel and dropping them into the Expression field on the right.

    • Double clicking on items from the Inputs and Transformations tabs on the left panel.

    • Clicking arithmetic operator buttons available under the Expression field.

  3. Click Validate.

    This verifies the accuracy of the expression syntax.

  4. Click OK to save the expression and close the Expression Builder.

The Expression Builder User Interface

The Expression Builder contains the following parts:

  • In the left panel, the navigation tree displays two tabs:

    • Inputs Tab: A list of input parameters.

    • Transformations Tab: A list of predefined functions and procedures located in the Oracle Transformation Library, the Global Shared Library, and a custom Transformation Library. See the Oracle Warehouse Builder Transformation Guide for more information.

  • Expression Field: At the top of the right panel is the Expression field. Use this field to type and edit expressions.

  • Arithmetic Operator Buttons: Below the Expression field are buttons for arithmetic operators. Use these buttons to build an expression without typing. The arithmetic operators available vary by the type of data flow operator that is active.

  • Others: A drop-down list of available SQL clauses that are appropriate for the active expression type.

    Beginning in Oracle9i, the CASE function is recommended over the DECODE function because the CASE function generates both SQL and PL/SQL while DECODE is limited to SQL. If you use the DECODE function in an expression, Warehouse Builder promotes it to CASE where appropriate during code generation. This enables you to deploy the DECODE functionality in all operating modes (such as setbased or rowbased) and transparently across Oracle database releases (8.1, 9.0 and higher).

    For example, Warehouse Builder converts the function

    DECODE (T1.A, 1, 'ABC', 2, 'DEF', 3, 'GHI', 'JKL')
    

    to the following:

    CASE T1.A WHEN 1 THEN 'ABC'
    WHEN 2 THEN 'DEF'
    WHEN 3 THEN 'GHI'
    ELSE 'JKL'
    END
    
  • Validation Results Field: At the bottom of the right panel is the Validation Results field. After you click the Validate button to the right of this field, the validation results are displayed here.

  • Validate Button: Use this button to validate the current expression in the Expression Builder. Validation ensures that all mapping objects referred to by the expression have associated repository objects. The expressions you create with the Expression Builder are limited to the operator inputs and to any transformations available in a project. This limitation protects the expression from becoming invalid because of changes external to the operator. If the deployment database is different from the design repository, it may not accept the expression. In that case, the expression may be valid in the repository but incorrect against the database, and expression errors can be found only at deployment time.

Aggregator Operator

The Aggregator operator calculates data aggregations, such as summations and averages and provides an output row set with the aggregated data.


Because all attributes in the output group of an Aggregator operator share the same GROUP BY and HAVING clauses, each attribute in the output group has the same cardinality. The number of rows in the output row set is less than or equal to the number of input rows.

The Aggregator operator has one input group and one output group. Connecting the source to the input group produces the corresponding aggregated row set in the output group. The resulting output includes the column used by the GROUP BY or the HAVING clause.

The Aggregator operator contains the following properties:

To use an Aggregator operator in a mapping:

  1. Drop an Aggregator operator onto the Mapping Editor canvas.

  2. On the canvas, connect source attributes to the input group of the Aggregator operator.

  3. Right-click the Aggregator and select Edit.

    Warehouse Builder displays the Operator Editor.

  4. On the Output Attributes tab, select Add to add output attributes.

  5. Click OK to close the Operator Editor.

  6. From the attribute properties window, define expressions for each output attribute. For detailed instructions, see "Defining the Output Attributes from Aggregator Operators".

  7. Define the Group By Clause for the operator. For detailed instructions, see "Defining the Output Attributes from Aggregator Operators".

Defining the Output Attributes from Aggregator Operators

To define expressions for output attributes:

  1. In the Aggregator operator on the mapping canvas, right click an output attribute and select Attribute Properties.

  2. Click the button to the right of the Expression property.

    The Expression dialog displays as shown in Figure 8-3.

    Figure 8-3 Expression Dialog

  3. Select a Function, ALL or DISTINCT, and a parameter from the drop-down lists.

  4. Click OK.

To define the Group By Clause:

  1. In the Aggregator operator on the mapping canvas, right click an output attribute and select Attribute Properties.

  2. Click the button to the right of the Group By Clause property.

    The Group By Clause dialog displays as shown in Figure 8-4.

    Figure 8-4 Group By Clause Dialog

  3. Move the attributes from the Available Attributes list to the GROUP BY Attributes list.

  4. Click OK.

  5. Click the ... button to define a HAVING clause using the Expression Builder.

  6. Create an expression such as sum(INGRP1.OR_TOTAL) > 10000 as shown in Figure 8-5.

    Figure 8-5 Expression Builder Showing a Sum Statement

  7. Map the attributes you edited from the output group of the aggregator operator to the attributes in the target.
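The code generated for an Aggregator configured in this way is equivalent to a GROUP BY query. The following sketch is illustrative only; the table and column names are assumed:

    SELECT ORD.REGION, SUM(ORD.OR_TOTAL) AS TOTAL_SALES
    FROM ORDERS ORD
    GROUP BY ORD.REGION
    HAVING SUM(ORD.OR_TOTAL) > 10000;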

Constant Operator

The Constant operator enables you to define constants. Constants are initialized at the beginning of the execution of the mapping. Constant values can be used anywhere in a mapping.


The Constant operator produces a single output group that can contain one or more constant attributes. For any defined constant data type, the output expression must be a valid SQL expression returning a value of the same data type. For VARCHAR, CHAR, or VARCHAR2 data types, you must enclose constant string literals within single quotes. For example, 'my_string'.
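For example, the following are valid constant expressions, assuming the output attributes are declared with matching data types:

    'ACTIVE'                                -- VARCHAR2 constant; note the single quotes
    100                                     -- NUMBER constant
    SYSDATE                                 -- DATE constant
    TO_DATE('01-JAN-2004', 'DD-MON-YYYY')   -- DATE constant from a string literal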

The Constant operator contains the following property:

To use a constant operator in a mapping:

  1. Drop a Constant operator onto the Mapping Editor canvas.

  2. Right-click the operator and select Edit.

    Warehouse Builder displays the Operator Editor.

  3. On the Output Attributes tab, select Add to create an output attribute.

  4. Click OK to close the Operator Editor.

  5. On the operator in the mapping canvas, right-click an attribute and select Attribute Properties.

    The Attributes properties window displays.

  6. Enter an expression in the Expression field or click the ... button to define an expression using the Expression Builder shown in Figure 8-6. The length, precision, and scale properties assigned to the attributes of the Constant operator must match the actual values returned by the expressions defined in the mapping.

    Figure 8-6 Expression Builder Showing A Constant

  7. Select OK.

  8. Connect the output attribute to the appropriate target attribute.

Data Generator Operator

Use a Data Generator operator to provide information such as record number, system date, and sequence values.


Recommendation:

For PL/SQL mappings use a Constant Operator or Mapping Sequence Operator instead of a Data Generator.

For mappings with flat file sources, the Data Generator operator also provides a place to enter constant information. For mappings with Flat File sources and targets, the Data Generator operator connects the mapping to SQL*Loader to generate the data stored in the database record.

The following functions are available:

It is possible for Warehouse Builder to generate data by specifying only sequences, record numbers, system dates, and constants as field specifications. SQL*Loader inserts as many records as are specified by the LOAD keyword.

Setting a Column to the Data File Record Number

Use the RECNUM keyword to set an attribute to the number of the record from which it was loaded. Records are counted sequentially from the beginning of the first data file, starting with record 1. RECNUM increments as each logical record is assembled. It increments for records that are discarded, skipped, rejected, or loaded. For example, if you use the option SKIP=10, the first record loaded has a RECNUM of 11.

Setting a Column to the Current Date

A column specified with SYSDATE gets the current system date, as defined by the SQL language SYSDATE function.

The target column must be of type CHAR or DATE. If the column is of type CHAR, then the date is loaded in the format dd-mon-yy and, after the load, can be accessed only in that format. If the system date is loaded into a DATE column, then you can access it in a variety of formats, including the time and the date. A new system date/time is used for each array of records inserted in a conventional path load and for each block of records loaded during a direct path load.

Setting a Column to a Unique Sequence Number

The SEQUENCE keyword ensures a unique value for a column. SEQUENCE increments for each record that is loaded or rejected. It does not increment for records that are discarded or skipped.

The combination of column name and the SEQUENCE function is a complete column specification. Table 8-1 lists the options available for sequence values.

Table 8-1 Sequence Value Options

Value Description

column_name

The name of the column in the database that the sequence is assigned to.

SEQUENCE

Specifies the value for a column.

integer

Specifies the beginning sequence number.

COUNT

The sequence starts with the number of records already in the table plus the increment.

MAX

The sequence starts with the current maximum value for the column plus the increment.

incr

The amount by which the sequence number increments after a record is loaded or rejected.


If a record is rejected because of a format error or an Oracle error, the generated sequence numbers are not reshuffled to mask this. For example, if four rows are assigned sequence numbers 10, 12, 14, and 16 in a column, and the row with 12 is rejected, the three rows inserted are numbered 10, 14, and 16, not 10, 12, 14. The sequence of inserts is preserved despite data errors. When you correct the rejected data and reinsert it, you can manually set the columns to match the sequence.
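In SQL*Loader control file terms, the RECNUM, SYSDATE, and SEQUENCE specifications correspond to field definitions like the following sketch; the table, file, and column names are assumptions for illustration:

    LOAD DATA
    INFILE 'orders.dat'
    INTO TABLE orders_stage
    (rec_id      RECNUM,
     load_date   SYSDATE,
     order_seq   SEQUENCE(MAX,1),
     order_total POSITION(1:10) INTEGER EXTERNAL)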

Although the Data Generator operator has only one output group, it has predefined attributes corresponding to Record Number, System Date, and a typical Sequence. While modification of these attributes is not recommended, you can create new attributes. The Data Generator operator is only valid for a SQL*Loader mapping.

Note:

There can only be one Data Generator operator in a mapping.

The Data Generator attribute contains the following property:

  • Expression: Expression to use when you map this attribute. Make sure the value entered for the expression is valid SQL*Loader syntax.

To use a data generator in a mapping:

  1. Drop a Data Generator operator onto the Mapping Editor canvas.

  2. Right-click the operator and select Edit.

    Warehouse Builder displays the Operator Editor.

  3. Select the Output Attributes tab.

    Warehouse Builder displays the pre-defined output attributes RECNUM, SYS_DATE, and SEQUENCE.

  4. On the Output Attributes tab, define the properties and type an optional description for the predefined output attributes.

  5. Click OK to close the Operator Editor.

  6. On the operator in the mapping canvas, right-click the RECNUM attribute and select Attribute Properties.

    Warehouse Builder displays the Attribute Properties window.

  7. In the expression field, click the ... to open the Expression Builder and define an expression.

  8. Repeat steps 6 and 7 for the SEQUENCE attribute.

Deduplicator Operator

The Deduplicator enables you to remove duplicate data in a source by placing a DISTINCT clause in the select code represented by the mapping.


To remove duplicates:

  1. Drop the Deduplicator operator onto the Mapping Editor canvas.

  2. Connect the attributes from the source operator to the group input/output of the Deduplicator operator.

  3. Connect the attributes from the Deduplicator operator group to the attributes of the target operator.
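The generated code is equivalent to a SELECT DISTINCT over the connected attributes, as in this illustrative sketch (table and column names are assumed):

    SELECT DISTINCT CUSTOMER_ID, CUSTOMER_NAME
    FROM CUSTOMERS;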

Expression Operator

The Expression operator enables you to write SQL expressions that define non-procedural algorithms for one output parameter of the operator.


The expression text can contain combinations of input parameter names, variable names, and library functions. Use the Expression operator to transform the column value data of rows within a row set using SQL-type expressions, while preserving the cardinality of the input row set. To create these expressions, open the Attribute properties window for the output attribute and then open the Expression Builder.

The Expression operator contains only one input group and one output group. These are created automatically when you drop the operator onto the Mapping Editor canvas.

The output expressions for this operator cannot contain any aggregation functions. To use aggregation functions, use the Aggregator operator. See "Aggregator Operator".

Every output attribute in an Expression operator contains the following properties:

To use an expression operator in a mapping:

  1. Drop an Expression operator onto the Mapping Editor canvas.

  2. Connect the appropriate source attributes to the Expression input group.

    Warehouse Builder copies the input attributes into the operator.

  3. Right-click the operator in the mapping canvas and select Edit.

    Warehouse Builder displays the Operator Editor.

  4. On the Output Attributes tab, select Add and specify the attribute name, data type, and other properties.

  5. Click OK to close the Operator Editor.

  6. From the Expression operator, right-click the output attribute and select the Attribute Properties.

  7. Click the field to the right of the Expression property and enter a filter condition expression or click ... to open the Expression Builder and define an expression.

  8. Close the Attribute Properties window.

  9. Connect the Expression output attribute to the appropriate target attribute.

Filter Operator

You can conditionally filter out rows using the Filter operator.


The Filter operator filters data from a source to a target by placing a WHERE clause in the code represented by the mapping. You connect a source operator to the Filter operator, apply a filter condition, and send a subset of rows to the next operator.

A Filter operator has only one input/output group that can be connected to both a source and target row set. The resulting row set is a filtered subset of the source row set based on a boolean filter condition expression.

The Filter operator contains the following property:

When you generate a mapping that includes a Filter operator, the Code Viewer displays filter condition expressions as a WHERE clause for set-based view mode. The filter input names in the original filter condition are replaced by actual column names from the source table, qualified by the source table alias.
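For example, a filter condition such as INOUTGRP1.ORDER_TOTAL > 500 over a source table ORDERS (names assumed for illustration) appears in the generated set-based code roughly as:

    SELECT ...
    FROM ORDERS ORD
    WHERE ORD.ORDER_TOTAL > 500;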

To use a filter operator in a mapping:

  1. Drag and drop the Filter operator onto the Mapping Editor canvas.

  2. Connect source attributes to the input attribute for the Filter operator.

  3. Right-click the Filter operator header and select Operator Properties.

    The Filter Properties window displays.

  4. Click the field to the right of the Filter Condition property and enter a filter condition expression. Or click ... to define a filter condition using the Expression Builder as shown in Figure 8-7.

    Figure 8-7 Expression Builder Showing a Filter Condition

  5. Click OK in the Expression Builder and close the Filter Properties window.

  6. Connect the Filter operator outputs to the input/output group in the target.

Joiner Operator

You can use the Joiner operator to join multiple row sets from different sources with different cardinalities, and produce a single output row set.


The Joiner operator uses a boolean condition that relates column values in each source row set to at least one other row set.

Note:

Operators placed between data sources and a Joiner can generate complex SQL or PL/SQL.

If the input row sets are related through foreign keys, that relationship is used to form a default join condition. You can use this default condition or you can modify it. If the sources are not related through foreign keys, then you must define a join condition.

If the default foreign keys result in duplicate WHERE clauses, the Joiner operator will remove the duplicate clauses. This can happen if the join condition references several foreign keys. For example, if table T1 has a foreign key FK1 pointing to unique key UK1 in table T2 and table T2 has a foreign key FK2 pointing to unique key UK2 in T1, the resulting join condition

T1.A = T2.A AND T1.B = T2.B /*All instances of FK1 -> UK1 are reduced to one where clause*/ AND
T2.B = T1.B AND T2.C = T1.C /*All instances of FK2 -> UK2 are reduced to one where clause*/

is generated by the Joiner operator as

T1.A = T2.A AND T1.B = T2.B AND T2.C = T1.C

The Joiner operator contains the following properties:

Attributes in the Joiner operator contain the following properties:

To use a Joiner operator in a mapping:

  1. Drag and drop the Joiner operator onto the Mapping Editor canvas.

  2. Connect an output group from the first source to the Joiner input group.

    The output attributes are created with data types matching the corresponding input data types.

  3. Connect a group from the second source operator to the INGRP2 group of the Joiner operator.

    Figure 8-8 Joiner Operator in a Mapping

  4. Right-click the Joiner operator header and select Operator Properties.

    The Joiner properties window displays.

  5. Enter a join condition in the Join Condition field or click the ... button to define an expression using the Expression Builder.

  6. Close the Joiner property window.

Joiner Restrictions

The join condition expression cannot contain aggregation functions, such as SUM; if it does, compile errors result when you deploy the generated code for the mapping. A Joiner can have an unlimited number of input groups but only one output group.

The order of the input groups in a Joiner is used as the join order. The major difference between an ANSI join and an Oracle join is that an ANSI join must clearly specify the join order, while an Oracle join does not require it.

The filter condition is applied after join. For example, consider the following join:

Input1.c --- + 
Input2.c --- +---> Joiner 
Input3.c --- +    

with the following conditions:

  • Condition 1: Input1.c (+) = Input2.c (+)

  • Condition 2: Input2.c = Input3.c

  • Condition 3: Input1.c is null

The first two conditions are true joins while the third is a filter condition. If ANSI code is to be generated, Warehouse Builder interprets the statement as

select ... 
from Input1 full outer join Input2 on (Input1.c = Input2.c) 
join Input3 on (Input2.c = Input3.c) 
where Input1.c is not null; 

Specifying a Full Outer Join

If your target warehouse is based on Oracle9i or a later version, the Warehouse Builder joiner also supports the full outer join. To specify a full outer join condition, you must place the (+) sign on both sides of a relational operator. For example,

T1.A (+) = T2.B (+)

The results of the full outer join are as follows:

  • Rows from sources T1 and T2 that satisfy the condition T1.A = T2.B.

  • Rows from source T1 that do not satisfy the condition. Columns corresponding with T2 are populated with nulls.

  • Rows from source T2 that do not satisfy the condition. Columns corresponding with T1 are populated with nulls.

    Note:

    The relational operator is not restricted to equality. You can also use other operators, such as >, <, !=, >=, and <=.

In the Oracle SQL syntax for a partial outer join, such as T1.A = T2.B (+), placing a (+) sign on both sides of the relational operator is invalid. However, Warehouse Builder translates any condition with the double (+) sign into ANSI SQL syntax. For example,

SELECT ...
FROM T1 FULL OUTER JOIN T2 ON (T1.A = T2.B);
When using a full outer join, keep in mind the following:

  • If you specify a full outer join condition for a non-Oracle9i target system type, you will receive a validation error and the code will not be generated.

  • The ANSI join syntax is generated only if you specify a full outer join condition in the joiner. Otherwise, the following Oracle proprietary join syntax is generated:

    SELECT ...
    FROM T1, T2
    WHERE T1.A = T2.B;
    
  • The input group order is used as the ANSI join order. If you are joining more than two input groups, the order of the input groups determines the join order. Unlike Oracle's proprietary join syntax, the ANSI join syntax requires you to explicitly specify the join order.

  • When you create a joiner, you must order the input groups exactly in the order you want to join them. For example, if you create three input groups in the order T1, T2, T3 and the join condition is

    T1.A (+) = T2.A (+) and T2.A = T3.A
    

    Warehouse Builder generates the following:

SELECT ...
      FROM T1 FULL OUTER JOIN T2 ON (T1.A=T2.A) 
                         JOIN T3 ON (T2.A=T3.A);

If you create the input groups in another order, such as T1, T3, T2, Warehouse Builder generates the following:

SELECT ...
      FROM T1 JOIN T3 ON (1=1) 
              JOIN T2 ON (T1.A=T2.A and T2.A=T3.A);

When T1 and T3 are joined, there is no join condition specified. Warehouse Builder fills in a condition 1=1 (essentially a boolean true) and the two conditions you specified are used to join T2.

  • You can specify both full outer join and join conditions in the same joiner. However, if both conditions are specified for the same sources, the stronger join type is used for generating code. For example, if you specify:

    T1.A(+) = T2.A(+) and T1.B = T2.B
    

    Warehouse Builder will generate a join statement instead of a full outer join because T1.B = T2.B is stronger than the full outer join condition between T1 and T2.

  • You cannot specify a full outer join and a partial outer join condition in the same joiner. If you specify a full outer join, then you cannot specify a partial outer join anywhere in the join condition. For example, T1.A (+) = T2.A (+) and T2.B = T3.B (+) will cause validation errors and code will not be generated.

Creating Full Outer Join Conditions

In an equijoin, key values from the two tables must match. In a full outer join, key values are matched and nulls are created in the resulting table for key values that cannot be matched. A left or a right outer join retains all rows in the specified table.

In Oracle8i, you create an outer join in SQL using the (+) join operator:

SELECT ...
FROM A, B
WHERE A.key = B.key (+); 

This example is a left outer join. Rows from table A are included in the joined result even though no rows from table B match them. To create a full outer join in Oracle8i, you must use multiple SQL statements.
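One common way to emulate a full outer join in Oracle8i is to combine a left outer join and a right outer join with UNION, as in this illustrative sketch:

    SELECT A.key, B.key FROM A, B WHERE A.key = B.key (+)
    UNION
    SELECT A.key, B.key FROM A, B WHERE A.key (+) = B.key;

The UNION removes the duplicate rows in which the keys match on both sides.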

The Expression Builder allows the following syntax for a full outer join:

TABLE1.COL1 (+) = TABLE2.COL2 (+) 

This structure is not supported by Oracle8i. Oracle9i and later releases are ANSI SQL 1999 compliant, and the ANSI SQL 1999 standard includes syntax for performing full outer joins. The code generator translates the preceding expression into an ANSI SQL 1999 full outer join statement, similar to:

SELECT ... 
FROM table1 FULL OUTER JOIN table2 ON (table1.col1 = table2.col2) 

Because the full outer join statement complies with ANSI SQL 1999, it is only valid if the generated code is deployed to an Oracle9i or later database. Specifying a full outer join to an Oracle8i database results in a validation error.

A full outer join and a partial outer join can be used together in a single SQL statement, but they must be combined in an AND or an AND/OR condition. If a full outer join and a partial outer join are combined in an OR condition, an unexpected AND condition results. For example,

SELECT ... 
FROM table1 FULL OUTER JOIN table2 ON (A = B or C = D) 

is evaluated by Oracle Server as A (+) = B (+) AND C = D.

To use a full outer join in a mapping:

  1. Follow steps one through four for adding a Joiner operator.

  2. Enter a full outer join statement in the Join Condition field or click the ... button to define an expression using the Expression Builder.

  3. Close the Joiner operator property window.

Key Lookup Operator

Use the Key Lookup operator to look up data from a table, view, cube, or dimension.


For example, you can use the Key Lookup operator when you define a mapping that loads a cube and when you define surrogate keys on the dimension. In this example, you create a Key Lookup operator that looks up the surrogate key in the dimension table and returns the corresponding original record to form the foreign key relationship.

The table, view, cube, or dimension is bound to the Key Lookup operator. You can have multiple Key Lookup operators in the same mapping.

The key that you look up can be any unique value. It does not need to be a primary or unique key, as defined in an RDBMS. The Key Lookup operator reads data from a lookup table using the key input you supply and finds the matching row. This operator returns a row for each input key.

The output of the Key Lookup operator corresponds to the columns in the lookup object. If multiple rows in the lookup table match the key inputs, the cardinality of the output differs from the input. This produces results inconsistent with the data flowing into the target operator and generates an error at runtime. To ensure that only a single lookup row is found for each key input row, use keys in your match condition.

You can use inbound reconciliation on Key Lookup outputs. Outbound reconciliation is disabled. See "Reconciling Operators and Repository Objects" for more information.

Each output attribute for the key lookup has a property called DEFAULT VALUE. The DEFAULT VALUE property is used instead of NULL in the outgoing row set if no value is found in the lookup table for an input value. The generated code uses the NVL function. The Key Lookup always results in an outer-join statement.
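For example, the code generated for a lookup attribute with a DEFAULT VALUE of 'UNKNOWN' is roughly equivalent to the following sketch (table and column names are assumed for illustration):

    SELECT SRC.ORDER_ID,
           NVL(LKP.CUST_NAME, 'UNKNOWN') AS CUST_NAME
    FROM ORDERS SRC, CUSTOMER_DIM LKP
    WHERE SRC.CUST_ID = LKP.CUST_ID (+);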

When you validate this operator:

To use a Key Lookup operator in a mapping:

  1. Drop a Key Lookup operator onto the Mapping Editor canvas.

    Warehouse Builder displays the Add Mapping Key Lookup dialog.

  2. Use the Add Mapping Key Lookup dialog to select one or more tables.

    Warehouse Builder adds to the mapping canvas one Key Lookup operator for each table you select. For more information on using the Add Mapping Key Lookup dialog, see "Adding Bindable Operators".

  3. Connect the source attributes to the input group of the Key Lookup operator.

  4. Right-click the Key Lookup operator header and select Operator Properties.

    Warehouse Builder displays the Key Lookup Properties window.

  5. Click the field to the right of the Lookup Condition property and then click the ... button.

    The Lookup Condition dialog displays as shown in Figure 8-9.

    Figure 8-9 Lookup Condition Dialog

  6. Select the lookup entity attribute from the Lookup Table Column or Key drop-down list shown in Figure 8-10.

    Choose attributes to compare to the selected lookup table column.

    For a non-composite key, select an input attribute or a key from the drop-down list. Click Add to List. The column or input pairs are added to the table at the bottom of the main dialog.

    For a composite key, click Add to List. The Match Key Columns to Input for Key dialog displays. Select the key input attributes from the Input Attribute drop-down list. Click OK.

    Figure 8-10 Match Key Columns to Input for Key Dialog

  7. Click OK to close the Lookup Condition dialog.

Mapping Advanced Queue Operator

Mapping Advanced Queue (AQ) operators are used to propagate messages in queues from source to target systems.


In Warehouse Builder, you can use AQs as sources or targets. You can map an AQ source object to a target table or a staging table and then deploy the mapping to your target database. When you run the AQ mapping, the changes are propagated from the AQ to the target table in your warehouse. After a mapping runs successfully, subsequent invocations of the same mapping will not see the messages that have already been processed. If the mapping fails, no messages will be dequeued from the source AQs.

Warehouse Builder also enables you to use AQs as targets in your warehouse. Source AQs in your repository may represent a central AQ that integrates data from different messaging systems or applications. Warehouse Builder enables you to map this source AQ to a target AQ and propagate the messages from one type of messaging system to another.

For more information, see:

Creating AQ Mappings

To use a Mapping Advanced Queue operator in a mapping:

  1. Drag and drop the Mapping Advanced Queue operator icon onto the mapping canvas.

    The Add Mapping Advanced Queue dialog displays.

  2. You can choose to import new AQ definitions into the repository or select a previously imported AQ from the repository.

    Each AQ bound to an AQ operator in a mapping must belong to the same warehouse module as the mapping. If not, you will receive an error when you validate the mapping.

    For more information on how to use the Add Mapping Advanced Queue dialog, see "Adding Bindable Operators".

  3. Click OK.

  4. Connect the output attribute of a Mapping AQ operator to the input group of a target operator.

Example of an AQ in a Mapping

A local company assigns a customer service representative to every customer by region. As the customer base increases, the company hires new service representatives to manage new and existing customers. The new service representative assignments to customers are tracked by using queues. You create an AQ with messages capturing the changes made to the Customer table.

You can import this AQ into Warehouse Builder and use it as a data source in a mapping. Connect the AQ source to the Customer target table on the mapping editor. Warehouse Builder generates PL/SQL scripts for this mapping, which you can deploy to your target warehouse. When you run the mapping, the messages are dequeued and the new assignments of the service representatives in the source Customer table are propagated to the Customer table in the target database.

If you register each mapping in Warehouse Builder as a separate subscriber of each AQ, then different mappings referencing the same AQ do not interfere with each other. Each mapping sees all the messages in the AQ published to all its subscribers.

Reconciling Advanced Queues

You can perform an inbound reconciliation on an AQ operator to update it with any changes in the repository definition of the AQ object to which it is bound. For more information, see "Inbound Reconciliation". Warehouse Builder does not currently enable you to outbound reconcile an AQ operator.

Advanced Queue Operator Properties

You can configure the following properties for a Mapping AQ operator used in a mapping:

  • From the Mapping Editor, right-click the AQ operator and select Operator Properties.

    Warehouse Builder displays the Mapping Advanced Queue Properties dialog containing the Bound Name property. This name is used to identify the AQ during code generation.

  • From the Mapping Editor, right-click the AQ operator and select Edit.

    Warehouse Builder displays the Mapping Advanced Queue Editor containing three tabs: General, Groups, Input/Output.

General

Rename the Mapping AQ operator by highlighting the previous name and typing over it. Type an optional description for the Mapping AQ operator.

Groups

Rename the Mapping AQ operator group by highlighting the previous name and typing over it. Each AQ operator has exactly one INOUT group. This is a read-only field. Type an optional description for the Mapping AQ operator group.

Input/Output

You cannot add more attributes to the INOUT group of an AQ operator. You can view and edit the attribute name, datatype, length, precision, scale, and optional description.

AQ Prerequisites for Mapping Execution

AQs deployed from Warehouse Builder are proxy AQs that represent the AQ source on a remote system. Because Oracle does not support remote dequeuing of messages, you must first deploy an agent of the source AQ in your local schema and register the locally deployed proxy AQ as a subscriber to the AQ on the source system. For Warehouse Builder AQ mappings to run correctly, you must follow these steps in your source and warehouse systems:

  • Grant the following privileges to your AQ source system user:

    execute on dbms_aq
    execute on dbms_aqadm
    execute on aq$_agent
     
    
  • Grant the following privileges to your warehouse system user:

    execute on dbms_aq
    execute on dbms_aqadm
    execute on aq$_agent
    
  • From your AQ source system, create a database link to the AQ target system.

    The target user can now deploy the target proxy AQ to the target instance.

  • When you configure AQs for deployment, make sure that both the AQ object and the mapping that has AQ operators bound to this AQ object are deployed in the same location.

  • Before deploying a mapping, you must successfully deploy the temporary tables associated with the AQ object to which the Mapping AQ operator is bound.

  • After deploying the proxy AQ to the target system, you must register the locally deployed proxy AQ as a subscriber to the AQ on the source system. See "Registering AQ as a Subscriber".

  • Concurrent executions of the same mapping using AQ source operators are not supported.

  • If an AQ is a target in a mapping, make sure that this target AQ has at least one subscriber before the mapping is executed. Otherwise, you will receive a runtime error during mapping execution.

  • An AQ cannot have more than 1024 subscribers. For an AQ q1, the number of mappings that have a source AQ staging component bound to q1 must therefore be less than 1024.

  • The upgrade of a mapping which contains an AQ may result in the loss of data contained in the AQ before the upgrade. This is because the subscriber of the proxy AQ will be dropped and re-created during the upgrade.
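
The privilege grants and database link described above can be sketched as follows. All user, password, and link names here are illustrative, not values generated by Warehouse Builder:

-- On each system, grant the AQ packages and agent type to the relevant user
GRANT EXECUTE ON sys.dbms_aq TO aq_src_user;
GRANT EXECUTE ON sys.dbms_aqadm TO aq_src_user;
GRANT EXECUTE ON sys.aq$_agent TO aq_src_user;

-- On the AQ source system: create the link to the target warehouse instance
CREATE DATABASE LINK wh_link
CONNECT TO wh_user IDENTIFIED BY wh_password
USING 'wh_tns_alias';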

Registering AQ as a Subscriber

If an AQ is defined in your source system, you need to follow these manual steps to register the deployed proxy AQ as a subscriber to the source AQ and schedule propagation of the AQ to your target warehouses:

  1. Create a DB link from your source system to your data warehouse.

  2. Register the AQ as a subscriber to all messages.

  3. Schedule propagation of AQ from source to the target system.

The following is an example of how you can register the AQ as a subscriber and schedule its propagation to the target system.

declare
  subscriber sys.aq$_agent;
begin
  -- schema.queue2 is the locally deployed proxy queue that subscribes
  subscriber := sys.aq$_agent('subscribername', 'schema.queue2', 0);
  -- QUEUE1 is the source queue that queue2 subscribes to
  dbms_aqadm.add_subscriber(queue_name => 'QUEUE1',
                            subscriber => subscriber,
                            rule => '',
                            transformation => '');
  -- Start message propagation from queue1 to queue2 with a latency of 60 seconds.
  -- SCHEMA.QUEUE1 is the schema and queue name of queue1.
  dbms_aqadm.schedule_propagation('SCHEMA.QUEUE1', NULL, SYSDATE, NULL, NULL, 60);
end;
/

Mapping Flat File Operator

You can use a Mapping Flat File operator as either a source or target.

Surrounding text describes flatfile.gif.

However, the two are mutually exclusive within the same mapping. Code generation differs for flat file sources and targets. Consequently, mappings can contain a mix of flat files, relational objects, and transformations, but with the restrictions discussed later in this section.

Flat File Source Operators

As a source, the Mapping Flat File operator acts as the row set generator that reads from a flat file using the SQL*Loader utility. You cannot use a flat file source operator if you are mapping to a flat file target or to an external table. When you design a mapping with a flat file source, you can use the following operators:

When you use a flat file as a source in a mapping, remember to create a connector from the flat file source to the relational target for the mapping to deploy successfully.

Flat File Target Operators

A mapping with a flat file target generates a PL/SQL package that loads data into a flat file instead of loading data into rows in a table. You can either use an existing flat file, or identify a new flat file as a target during design and define it using outbound reconciliation. See "Outbound Reconciliation".

Note:

A mapping can contain a maximum of 50 flat file target operators at one time.

When you use a flat file as a target, understanding both flat files and external tables will help you determine which feature to use. If you are loading large volumes of data, loading to a flat file enables you to use the DIRECT PATH SQL*Loader option, which results in better performance. If you are not loading large volumes of data, you can benefit from many of the relational transformations available in the external table feature. Refer to "External Tables versus Flat File Operators" for more information.

If you use a multiple-record-type flat file as a target, you can only map to one of the record types. If you want to load all of the record types in the flat file from the same source, you can either drop the same flat file into the mapping as a target again and map to a different record type, or create a separate mapping for each record type you want to load.

You have the following options for Mapping Flat File operators:

Using Previously Imported Flat Files

This scenario describes using a flat file object that has been previously imported and sampled. You can find instructions for importing and sampling flat file objects in Chapter 4, "Importing Data Definitions".

To use a previously imported flat file as a source or a target:

  1. Drag and drop a Mapping Flat File operator onto your mapping editor canvas.

    The Add File dialog displays.

  2. Click Select from existing repository file and bind.

  3. Highlight the desired flat file object.

  4. Click OK.

  5. Prior to deployment, you must follow the steps for "Configuring Flat File Operators" and for "Configuring Mappings Reference".

Importing and Binding New Flat Files into Your Mapping

In this option, you use a flat file as either a source or target, but you have not yet imported or sampled the flat file. You import the flat file as you design your mapping.

To import and bind a new flat file into your mapping:

  1. Drop a Mapping Flat File onto your mapping canvas.

    The Add File dialog displays.

  2. Click Import file into repository and bind.

  3. Highlight the module into which you want to import the flat file.

  4. Click OK.

    The Import File Wizard Welcome page displays. For more information on importing Flat Files, see "Using the Import Metadata Wizard Flat Files".

  5. At the Welcome page, click Next.

    The Import File Wizard Object Selection page displays.

  6. Navigate to the location and select the flat file.

  7. Click Next.

    The Import File Wizard Summary and Import page displays. A red ball next to the file indicates that there is no metadata information available about the file structure.

    • If the flat file has the same structure as another file you have already imported and sampled, click the Same As field and select the identical file from the drop list.

    • If there are no previously imported or sampled files with this file structure, click Sample and follow the instructions for "Using the Flat File Sample Wizard".

  8. Click Finish.

    The Mapping Flat File operator appears on the mapping canvas.

  9. Prior to deployment, you must follow the steps for "Configuring Flat File Operators" and for "Configuring Mappings Reference".

Defining New Flat File Sources or Targets in Mappings

As you build your mapping, you can create a new flat file object by selecting the Create unbound mapping file with no attributes option. Using this method, the only option is to create a comma-delimited, single-record-type flat file operator. You can leave the flat file unbound, or you can outbound reconcile to the repository.

Note:

You can only perform outbound reconciliation if the flat file is new to that repository. For more information on reconciliation, refer to "Reconciling Operators and Repository Objects".

To use a new unbound Flat File operator in a mapping:

  1. Drag and drop a Mapping Flat File operator onto the mapping canvas.

    The Add File dialog displays.

  2. Click Create unbound mapping file with no attributes.

  3. Enter the new file name.

  4. Click OK.

    The Mapping Flat File operator with no attributes appears in your mapping.

  5. Define attributes for the new flat file operator by right-clicking the operator and selecting Edit. Enter the attributes manually on the Input/Output tab, or auto-map them from another operator.

    For instructions on defining attributes, refer to "Editing Operators".

  6. Right-click the flat file operator and select Reconcile Outbound to create a new repository flat file.

  7. Select the flat file module in which you want to create the new flat file.

  8. Click OK.

    A new comma-delimited flat file is created in your repository.

  9. Prior to deployment, you must follow the steps for "Configuring Flat File Operators" and for "Configuring Mappings Reference".

Extracting a Master-Detail Structure from Flat Files

If you are extracting data from a multiple-record-type file with a master-detail structure and mapping to tables, add a Mapping Sequence operator to the mapping to retain the relationship between the master and detail records through a surrogate primary key or foreign key relationship. A master-detail file structure is one where a master record is followed by its detail records. In Example 8-1, records beginning with "E" are master records with Employee information and records beginning with "P" are detail records with Payroll information for the corresponding employee.

Example 8-1 A Multiple-Record-Type Flat File with a Master-Detail Structure

E 003715 4 153 09061987 014000000 "IRENE HIRSH" 1 08500
P 01152000 01162000 00101 000500000 000700000
P 02152000 02162000 00102 000300000 000800000
E 003941 2 165 03111959 016700000 "ANNE FAHEY" 1 09900 
P 03152000 03162000 00107 000300000 001000000
E 001939 2 265 09281988 021300000 "EMILY WELSH" 1 07700
P 01152000 01162000 00108 000300000 001000000
P 02152000 02162000 00109 000300000 001000000

In Example 8-1, the relationship between the master and detail records is inherent only in the physical record order: payroll records correspond to the employee record they follow. However, if this is the only means of relating detail records to their masters, this relationship is lost when Warehouse Builder loads each record into its target table.

Maintaining Relationships Between Master and Detail Records

You can maintain the relationship between master and detail records if both types of records share a common field. If Example 8-1 contains a field Employee ID in both Employee and Payroll records, you can use it as the primary key for the Employee table and as the foreign key in the Payroll table, thus associating Payroll records to the correct Employee record.

However, if your file does not have a common field that can be used to join master and detail records, you must add a sequence column to both the master and detail targets (see Table 8-2 and Table 8-3) to maintain the relationship between the master and detail records. Use the Mapping Sequence operator to generate this additional value.

Table 8-2 represents the target table containing the master records from the file in Example 8-1. The target table for the master records in this case contains employee information. Columns E1-E10 contain data extracted from the flat file. Column E11 is the additional column added to store the master sequence number. Notice that the number increments by one for each employee.

Table 8-2 Target Table Containing Master Records

E1  E2      E3  E4   E5        E6         E7      E8      E9  E10    E11
E   003715  4   153  09061987  014000000  "IRENE  HIRSH"  1   08500  1
E   003941  2   165  03111959  016700000  "ANNE   FAHEY"  1   09900  2
E   001939  2   265  09281988  021300000  "EMILY  WELSH"  1   07700  3


Table 8-3 represents the target table containing the detail records from the file in Example 8-1. The target table for the detail records in this case contains payroll information, with one or more payroll records for each employee. Columns P1-P6 contain data extracted from the flat file. Column P7 is the additional column added to store the detail sequence number. Notice that the number for each payroll record matches the corresponding employee record in Table 8-2.

Table 8-3 Target Table Containing Detail Records

P1  P2        P3        P4     P5         P6         P7
P   01152000  01162000  00101  000500000  000700000  1
P   02152000  02162000  00102  000300000  000800000  1
P   03152000  03162000  00107  000300000  001000000  2
P   01152000  01162000  00108  000300000  001000000  3
P   02152000  02162000  00109  000300000  001000000  3
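
With the added sequence columns in place (E11 in Table 8-2 and P7 in Table 8-3), detail rows can be joined back to their master rows. The following query is a sketch; the table and column names are illustrative, not names generated by Warehouse Builder:

SELECT m.e2 AS employee_id, d.p2 AS period_start, d.p3 AS period_end
  FROM employee_master m, payroll_detail d
 WHERE d.p7 = m.e11;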


Extracting and Loading Master-Detail Records

This section contains instructions on creating a mapping that extracts records from a master-detail flat file and loads those records into two different tables. One target table stores master records and the other stores detail records from the flat file. The Mapping Sequence operator is used to maintain the master-detail relationship between the two tables.

Note:

These instructions are for conventional path loading. For instructions on using direct path loading for master-detail records, see "Direct Path Loading for Performance".

This procedure outlines general steps for building such a mapping. Additional detailed instructions are available:

To extract from a master-detail flat file and maintain master-detail relationships:

  1. Import and sample a flat file source that consists of master and detail records.

    When naming the record types as you sample the file, assign descriptive names to the master and detail records, as shown in Figure 8-11. This makes it easier to identify those records in the future.

    Figure 8-11 shows the Flat File Sample Wizard for a multiple-record-type flat file containing department and employee information. The master record type (for employee records) is called EmployeeMaster, while the detail record type (for payroll information) is called PayrollDetail.

    Figure 8-11 Naming Flat File Master and Detail Record Types

    Surrounding text describes Figure 8-11 .
  2. Drop a Mapping Flat File operator onto the mapping editor canvas and specify the master-detail file from which you want to extract data.

  3. Drop a Mapping Sequence operator onto the mapping canvas.

  4. Drop a Mapping Table operator for the master records onto the mapping canvas.

    You can either select an existing repository table that you created earlier or create a new unbound mapping table with no attributes. You can then map or copy all required fields from the master record of the file operator to the master table operator (creating columns) and perform an outbound reconciliation to define the table later.

    The table must contain all the columns required for the master fields you want to load plus an additional numeric column for loading sequence values, as shown in Figure 8-12.

    Figure 8-12 Adding a Sequence Column to the Master and Detail Target Tables

    Surrounding text describes Figure 8-12 .
  5. Drop a Mapping Table operator for the detail records onto the mapping canvas.

    You can either select an existing repository table that you created earlier or create a new unbound mapping table with no attributes. You can then map or copy all required fields from the detail record of the file operator to the detail table operator (creating columns) and perform an outbound reconcile to define the table later.

    The table must contain all the columns required for the detail fields you want to load plus an additional numeric column for loading sequence values.

  6. Map all of the necessary flat file master fields to the master table and detail fields to the detail table, as shown in Figure 8-13.

  7. Map the Mapping Sequence NEXTVAL attribute to the additional sequence column in the master table, as shown in Figure 8-13.

  8. Map the Mapping Sequence CURRVAL attribute to the additional sequence column in the detail table, as shown in Figure 8-13.

    Figure 8-13 shows a completed mapping with the flat file master fields mapped to the master target table, the detail fields mapped to the detail target table, and the NEXTVAL and CURRVAL attributes from the Mapping Sequence mapped to the master and detail target tables, respectively.

    Figure 8-13 Completed Mapping from Master-Detail Flat File to Two Target Tables

    Surrounding text describes Figure 8-13 .
  9. Configure the mapping with the following parameters:

    Direct Mode: False

    Errors Allowed: 0

    Row: 1

    Trailing Nullcols: True (for all tables)
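
These settings correspond roughly to the following clauses in the generated SQL*Loader control file (a sketch only; the actual generated file contains additional clauses, and TRAILING NULLCOLS appears in each INTO TABLE clause):

OPTIONS (DIRECT=FALSE, ERRORS=0, ROWS=1)
LOAD DATA
INTO TABLE "MASTER_T"
TRAILING NULLCOLS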

Error Handling Suggestions

This section contains error handling recommendations for files with varying numbers of errors.

If your data file almost never contains errors:

  1. Create a mapping with a Sequence operator (see "Mapping Sequence Operator").

  2. Configure a mapping with the following parameters:

    Direct Mode = FALSE

    ROW = 1

    ERRORS ALLOWED = 0

  3. Generate the code and run the SQL*Loader script.

    If the data file has errors, loading stops at the first error.

  4. Fix the data file and run the control file again with the following configuration values:

    CONTINUE_LOAD=TRUE

    SKIP=number of records already loaded

If your data file is likely to contain a moderate number of errors:

  1. Create a primary key (PK) for the master record based on the seq_nextval column.

  2. Create a foreign key (FK) for the detail record based on the seq_currval column which references the master table PK.

    In this case, master records with errors will be rejected with all their detail records. You can recover these records by following these steps.

  3. Delete all failed detail records that have no master records.

  4. Fix the errors in the bad file and reload only those records.

  5. If there are very few errors, you may choose to load the remaining records and manually update the table with correct sequence numbers.

  6. In the log file, you can identify the records that failed because they violate the integrity constraint. The following is an example of log file records with errors:

    Record 9: Rejected - Error on table "MASTER_T", column "C3". 
    
    ORA-01722: invalid number 
    Record 10: Rejected - Error on table "DETAIL1_T". 
    
    ORA-02291: integrity constraint (SCOTT.FK_SEQ) violated - parent key not found
    Record 11: Rejected - Error on table "DETAIL1_T". 
    
    ORA-02291: integrity constraint (SCOTT.FK_SEQ) violated - parent key not found 
    Record 21: Rejected - Error on table "DETAIL2_T". 
    
    ORA-02291: integrity constraint (SCOTT.FK_SEQ) violated - parent key not found
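
Steps 1 and 2 above can be sketched as follows, using the table and constraint names that appear in the sample log file (the column names are illustrative):

ALTER TABLE master_t
  ADD CONSTRAINT pk_seq PRIMARY KEY (seq_nextval_col);
ALTER TABLE detail1_t
  ADD CONSTRAINT fk_seq FOREIGN KEY (seq_currval_col)
  REFERENCES master_t (seq_nextval_col);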
    

If your data file always contains many errors:

  1. Load all records without using the Mapping Sequence operator.

    Load the records into independent tables. You can load the data in Direct Mode, with the following parameters that increase loading speed:

    ROW>1

    ERRORS ALLOWED=MAX

  2. Correct all rejected records.

  3. Reload the file again with a Sequence operator (see "Mapping Sequence Operator").

Direct Path Loading for Performance

If you are using a master-detail flat file where the master record has a unique field (or if the concatenation of several fields can result in a unique identifier), you can use Direct Path Load as an option for faster loading.

For direct path loading, the record number (RECNUM) of each record is stored in the master and detail tables. A post-load procedure uses the RECNUM to update each detail row with the unique identifier of the corresponding master row.

This procedure outlines general steps for building such a mapping. Additional detailed instructions are available:

To extract from a master-detail flat file using direct path load to maintain master-detail relationships:

  1. Import and sample a flat file source that consists of master and detail records.

    When naming the record types as you sample the file, assign descriptive names to the master and detail records, as shown in Figure 8-11. This will make it easier to identify those records in the future.

  2. Drop a Mapping Flat File operator onto the mapping canvas and specify the master-detail file from which you want to extract data.

  3. Drop a Data Generator and a Constant operator onto the mapping canvas.

  4. Drop a Mapping Table operator for the master records onto the mapping canvas.

    You can either select an existing repository table that you created earlier, or create a new unbound mapping table with no attributes and perform an outbound reconcile to define the table later.

    The table must contain all the columns required for the master fields you plan to load plus an additional numeric column for loading the RECNUM value.

  5. Drop a Mapping Table for the detail records onto the mapping canvas.

    You can either select an existing repository table that you created earlier, or create a new unbound mapping table with no attributes and perform an outbound reconcile to define the table later.

    The table must contain all the columns required for the detail fields you plan to load plus an additional numeric column for loading a RECNUM value, and a column that will be updated with the unique identifier of the corresponding master table row.

  6. Map all of the necessary flat file master fields to the master table and detail fields to the detail table, as shown in Figure 8-15.

  7. Map the Data Generator operator's RECNUM attribute to the RECNUM columns in the master and detail tables, as shown in Figure 8-15.

  8. Add a constant attribute in the Constant operator.

    If the master row unique identifier column is of a CHAR datatype, make the constant attribute a CHAR type with the expression '*'.

    If the master row unique identifier column is a number, make the constant attribute a NUMBER with the expression '0'. Figure 8-14 shows the expression property of the constant attribute set to '0'. This constant marks all data rows as "just loaded."

    Figure 8-14 Constant Operator Properties

    Surrounding text describes Figure 8-14 .
  9. Map the constant attribute from the Constant operator to the detail table column that will later store the unique identifier for the corresponding master table record.

    Figure 8-15 shows a completed mapping with the flat file's master fields mapped to the master target table, the detail fields mapped to the detail target table, the RECNUM attributes from the Data Generator operator mapped to the master and detail target tables, respectively, and the constant attribute mapped to the detail target table.

    Figure 8-15 Completed Mapping from Master-Detail Flat File with a Direct Path Load

    Surrounding text describes Figure 8-15 .
  10. Configure the mapping with the following parameters:

    Direct Mode: True

    Errors Allowed: 0

    Trailing Nullcols: True (for each table)

  11. After you validate the mapping and generate the SQL*Loader script, create a post-update PL/SQL procedure and add it to the Warehouse Builder library.

  12. Run the SQL*Loader script.

  13. Execute an UPDATE SQL statement by running a PL/SQL post-update procedure or manually executing a script.

The following is an example of the generated SQL*Loader control file script:

OPTIONS ( DIRECT=TRUE,PARALLEL=FALSE, ERRORS=0, BINDSIZE=50000, ROWS=200, READSIZE=65536)
LOAD DATA
CHARACTERSET WE8MSWIN1252
                INFILE 'g:\FFAS\DMR2.dat'
                READBUFFERS 4
                INTO TABLE "MASTER_TABLE"
                APPEND
                REENABLE DISABLED_CONSTRAINTS
                WHEN 
                "REC_TYPE"='E'
                FIELDS
                TERMINATED BY ','
                OPTIONALLY ENCLOSED BY '"'
                TRAILING NULLCOLS
    
                (
                "REC_TYPE" POSITION (1) CHAR ,
                "EMP_ID" CHAR ,
                "ENAME" CHAR ,
                "REC_NUM" RECNUM
                )
  
INTO TABLE "DETAIL_TABLE"
                APPEND
                REENABLE DISABLED_CONSTRAINTS
                WHEN 
                "REC_TYPE"='P'
                FIELDS
                TERMINATED BY ','
                OPTIONALLY ENCLOSED BY '"'
                TRAILING NULLCOLS
        (
                "REC_TYPE" POSITION (1) CHAR ,
                "C1" CHAR ,
                "C2" CHAR ,
                "C3" CHAR ,
                "EMP_ID" CONSTANT '*',
                "REC_NUM" RECNUM
                )

The following is an example of the post-update PL/SQL procedure:

create or replace procedure wb_md_post_update( 
     master_table varchar2 
    ,master_recnum_column varchar2 
    ,master_unique_column varchar2 
    ,detail_table varchar2 
    ,detail_recnum_column varchar2 
    ,detail_masterunique_column varchar2 
    ,detail_just_load_condition varchar2) 
  IS 
     v_SqlStmt VARCHAR2(1000); 
  BEGIN 
     v_SqlStmt := 'UPDATE '||detail_table||' l '|| 
                  ' SET l.'||detail_masterunique_column||' = (select i.'||master_unique_column|| 
                  ' from '||master_table||' i '|| 
                  ' where i.'||master_recnum_column||' IN '|| 
                  ' (select max(ii.'||master_recnum_column||') '|| 
                  ' from '||master_table||' ii '|| 
                  ' where ii.'||master_recnum_column||' < l.'||detail_recnum_column||') '|| 
                  ' ) '|| 
                  ' WHERE l.'||detail_masterunique_column||' = '||''''||detail_just_load_condition||''''; 
     dbms_output.put_line(v_sqlStmt); 
     EXECUTE IMMEDIATE  v_SqlStmt; 
  END; 
  /
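
The procedure might then be invoked as follows. The names here are illustrative, chosen to match the control file example above; the '*' argument matches the constant used to mark just-loaded detail rows:

BEGIN
  wb_md_post_update(
     master_table => 'MASTER_TABLE'
    ,master_recnum_column => 'REC_NUM'
    ,master_unique_column => 'EMP_ID'
    ,detail_table => 'DETAIL_TABLE'
    ,detail_recnum_column => 'REC_NUM'
    ,detail_masterunique_column => 'EMP_ID'
    ,detail_just_load_condition => '*');
END;
/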

Subsequent Operations

After the initial loading of the master and detail tables, you can use the loaded sequence values to further transform, update, or merge master table data with detail table data. For example, if your master records have a column that acts as a unique identifier (such as an Employee ID), and you want to use it as the key to join master and detail rows (instead of the sequence field you added for that purpose), you can update the detail table(s) to use this unique column. You can then drop the sequence column you created for the initial load. Operators such as the Aggregator, Filter, or Match and Merge operator can help you with these subsequent transformations.
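
For example, once each detail row carries the master's unique Employee ID, the interim sequence column can be dropped. The following is a sketch with illustrative table and column names:

-- Copy the natural key from each master row to its detail rows
UPDATE payroll_detail d
   SET d.emp_id = (SELECT m.emp_id
                     FROM employee_master m
                    WHERE m.load_seq = d.load_seq);

-- The surrogate sequence column is no longer needed
ALTER TABLE payroll_detail DROP COLUMN load_seq;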

Mapping Input Parameter Operator

You can introduce information external to Warehouse Builder as input into a mapping using a Mapping Input Parameter. For example, you can use an Input Parameter operator to pass SYSDATE to a mapping that loads data to a staging area. Use the same Input Parameter to pass the timestamp to another mapping that loads the data to a target.

When you generate a mapping, Warehouse Builder creates a PL/SQL package. Mapping input parameters become part of the signature of the main procedure in the package.

The Mapping Input Parameter has a cardinality of one. It creates a single row set that can be combined with another row set as input to the next operator.

The names of the output attributes become the names of the mapping input parameters. Use the parameters by connecting the attributes of the Mapping Input Parameter operator within the mapping editor. You can have only one Mapping Input Parameter in a mapping.

The Mapping Input Parameter contains the following properties:

param1 IN VARCHAR2 DEFAULT 'HELLO'

To use a Mapping Input Parameter operator in a mapping:

  1. Drag and drop a Mapping Input operator onto the Mapping Editor canvas.

  2. Right-click the Mapping Input operator and select Edit.

  3. Select the Output tab and click Add to add output attributes.

    You can rename the attributes and define the data type and other attribute properties.

  4. Connect the Input Parameter operator MAP_INPUT attribute to a group in the target operator as shown in Figure 8-16.

    Figure 8-16 Mapping Editor Showing A Mapping Input Parameter

    Surrounding text describes Figure 8-16 .

Mapping Output Parameter Operator

Use the Mapping Output Parameter operator to send values out of a mapping to applications external to Warehouse Builder. When you generate a mapping, Warehouse Builder creates a PL/SQL package. Mapping Output Parameters become part of the signature of the main procedure in the package.

The Mapping Output Parameter has only one input group. You can have only one Mapping Output Parameter in a mapping. Only attributes that are not associated with a row set can be mapped into a Mapping Output Parameter. For example, a constant, an input parameter, or output from a pre-mapping or post-mapping process can provide attributes that are not associated with a row set. A Mapping Output Parameter is not valid for a SQL*Loader mapping.

The default value for the Mapping Output Parameter appears in the DEFAULT clause following the function parameter declarations in the generated PL/SQL package. For example, if a mapping parameter named param1 with data type VARCHAR2 is defined with a default value of 'HELLO', the generated main function in the PL/SQL package appears as:

param1 OUT VARCHAR2 DEFAULT 'HELLO'

If a Mapping Output Parameter named param1 has data type VARCHAR2, the generated main function in the PL/SQL package appears as:

param1 OUT VARCHAR2

The Mapping Output Parameter contains the following properties:

To use a Mapping Output Parameter operator in a mapping:

  1. Drop a Mapping Output operator onto the Mapping Editor canvas.

  2. Right-click the Mapping Output operator and select Edit.

  3. Select the Input Attributes tab and click Add to add input attributes.

    You can rename the attributes and define the data type and other attribute properties.

See Figure 8-17 for an example of a Mapping Output Parameter in a mapping.

    Figure 8-17 Mapping Editor Showing An Output Parameter Operator

    Surrounding text describes Figure 8-17 .

Mapping Sequence Operator

A Mapping Sequence operator generates sequential numbers that increment for each row. For example, you can use the Sequence operator to create surrogate keys while loading data into a dimension table.

You can connect a Mapping Sequence to a target operator input or to the inputs of other types of operators. You can combine the sequence outputs with outputs from other operators.

This operator contains an output group containing the output attributes CURRVAL and NEXTVAL.

You can bind and reconcile Mapping Sequences to a repository sequence in one of the modules. The repository sequence must be generated and deployed before the mapping containing the Mapping Sequence is deployed to avoid errors in the generated code package. See "Adding Bindable Operators" for more information.

Generate mappings with sequences using Row Based mode. Sequences are incremented even if rows are not selected. If you want a sequence to start from the last number, then do not run your SQL package in Set Based or in Set Based With Failover operating modes. See "Runtime Parameters Reference" for more information on configuring mode settings.

The Mapping Sequence contains the following property:

To use a Mapping Sequence operator in a mapping:

  1. Drag and drop the Mapping Sequence operator onto the Mapping Editor canvas.

    Warehouse Builder displays the Add Mapping Sequence dialog.

  2. Use the Add Mapping Sequence dialog to create or select a sequence. For more information on these options, see "Adding Bindable Operators".

  3. Connect the sequence to a target attribute.

Match-Merge Operator

The Match-Merge operator is a data quality operator that you can use to first match and then merge data.

When you match records, you determine through business rules which records in a table refer to the same data. When you merge records, you consolidate into a single record the data from the matched records.

This section includes information and examples on how to use the Match-Merge operator in a mapping. The Match-Merge operator, together with the Name-Address operator, supports householding, the process of identifying unique households in name and address data.

Example: Matching and Merging Customer Data

Consider how you can use the Match-Merge operator to manage a customer mailing list. Use matching to find records that refer to the same person in a table of customer data containing 10,000 rows. For example, you can define a match rule that screens records with similar first and last names. Through matching, you may discover that 5 rows refer to the same person. You can then merge those records into one new record, for example, by creating a merge rule that retains the values from the matched record with the longest address. The newly merged table contains one record for each customer.

Table 8-4 shows records that refer to the same person prior to using the Match-Merge operator.

Table 8-4 Sample Records

Row  FirstName  LastName   SSN        Address          Unit   Zip
1    Jane       Doe        NULL       123 Main Street  NULL   22222
2    Jane       Doe        111111111  NULL             NULL   22222
3    J.         Doe        NULL       123 Main Street  Apt 4  22222
4    NULL       Smith      111111111  123 Main Street  Apt 4  22222
5    Jane       Smith-Doe  111111111  NULL             NULL   22222


Table 8-5 shows the single record for Jane Doe after using the Match-Merge operator. Notice that the new record retrieves data from different rows in the sample.

Table 8-5 Match-Merge Results

FirstName  LastName  SSN        Address          Unit   Zip
Jane       Doe       111111111  123 Main Street  Apt 4  22222
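The consolidation above can be reproduced with a very small sketch. The following Python fragment is illustrative only (Warehouse Builder itself generates PL/SQL, and this is not its code): it applies a first-non-blank policy per attribute, in the spirit of the Any merge rule described later, to the five matched rows of Table 8-4:

```python
# Matched rows from Table 8-4 (None stands for NULL).
rows = [
    {"FirstName": "Jane", "LastName": "Doe", "SSN": None,
     "Address": "123 Main Street", "Unit": None, "Zip": "22222"},
    {"FirstName": "Jane", "LastName": "Doe", "SSN": "111111111",
     "Address": None, "Unit": None, "Zip": "22222"},
    {"FirstName": "J.", "LastName": "Doe", "SSN": None,
     "Address": "123 Main Street", "Unit": "Apt 4", "Zip": "22222"},
    {"FirstName": None, "LastName": "Smith", "SSN": "111111111",
     "Address": "123 Main Street", "Unit": "Apt 4", "Zip": "22222"},
    {"FirstName": "Jane", "LastName": "Smith-Doe", "SSN": "111111111",
     "Address": None, "Unit": None, "Zip": "22222"},
]

def merge_first_non_blank(rows):
    """For each attribute, keep the first non-blank value across the matches."""
    return {attr: next((r[attr] for r in rows if r[attr] is not None), None)
            for attr in rows[0]}

merged = merge_first_non_blank(rows)
print(merged["FirstName"], merged["LastName"], merged["SSN"],
      merged["Address"], merged["Unit"], merged["Zip"])
# Jane Doe 111111111 123 Main Street Apt 4 22222
```

This reproduces the merged record of Table 8-5; a real mapping can combine several merge rules (for example, taking the whole address from the row with the longest address) rather than one blanket policy.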


Designing Mappings with a Match-Merge Operator

Figure 8-18 shows a mapping you can design using a Match-Merge operator. Notice that the Match-Merge operator is preceded by a Name-Address operator, NAMEADDR, and a staging table, CLN_CUSTOMERS. You can design your mapping with or without a Name-Address operator. Preceding the Match-Merge operator with a Name-Address operator is desirable when you want to ensure your data is clean and standardized before launching time-consuming match and merge operations.

Figure 8-18 Match-Merge Operator in a Mapping

Surrounding text describes Figure 8-18 .

Whether you include a Name-Address operator or not, be aware of the following considerations as you design your mapping:

  • PL/SQL output: The Match-Merge operator can generate two outputs, both of which are PL/SQL only. The MERGE group contains the merged data. The XREF group is an optional group you can design to document the merge process.

  • Row based operating mode: When the Match-Merge operator matches records, it compares each row with the subsequent row in the source and generates row based code only. These mappings, therefore, can only run in row based mode.

  • SQL based operators before Match-Merge: The Match-Merge operator generates only PL/SQL outputs. If you want to include operators that generate SQL code only, you must design the mapping such that they precede the Match-Merge operator. For example, operators such as the Join, Key Lookup, and Set operators must precede the Match-Merge operator. A mapping designed with operators that generate set based code after a Match-Merge operator is invalid and Warehouse Builder does not generate code for such mappings.

  • SQL input: With one specific exception, the Match-Merge operator requires SQL input. If you want to precede a Match-Merge with an operator that generates only PL/SQL output such as the Name-Address operator, you must first load the data to a staging table.

  • Refining Data from Match-Merge operators: To achieve greater data refinement, map the XREF output from one Match-Merge operator into another Match-Merge operator. This scenario is the one exception to the SQL input rule for Match-Merge operators. With additional design elements, the second Match-Merge operator accepts PL/SQL. For more information, see "Refining Data from Match-Merge Operators".

Using the Match-Merge Operator

You have the following options for using a match-merge operator:

  • Define a new match-merge operator: Drag the Match-Merge operator from the Toolbox onto the mapping. The Mapping Editor launches a wizard.

  • Edit an existing match-merge operator: Right-click the operator and select Edit. The Mapping Editor opens the Match-Merge Editor.

Whether you are using the operator wizard or the Operator Editor, complete the following pages:

  • General

  • Groups

  • Input Connections

  • Input Attributes

  • Merge Output

  • Cross Reference Output

  • Match Bins

  • Match Rules

  • Merge Rules

General

Use the General page to specify a name and optional description for the operator. By default, the wizard names the match-merge operator "MM".

Groups

By definition, the Match-Merge operator has one input group and two output groups. You can rename the groups and add optional descriptions, but you cannot add or delete groups in the Match-Merge operator. The default name for the input group is INGRP1. The default names for the output groups are MERGE and XREF.

You assign attributes to the INGRP1 on the Input Connections page and then further edit those attributes on the Input Attributes page. You define attributes for the MERGE group on the Merge Output page. You can optionally assign attributes to the XREF group on the Cross Reference Output page.

Input Connections

Use the Input Connections page to select attributes to copy and map into the operator.

To complete the Input connections page for an operator:

  1. Select complete groups or individual attributes from the left panel.

    To search for a specific attribute or group by name, type the text in Search for and click Go. To find the next match, click Go again.

    Hold the Shift key down to select multiple groups or attributes. If you want to select attributes from different groups, you must first combine the groups with a Joiner or Set operator.

  2. Use the left to right arrow button in the middle of the page to move your selections to the right side of the wizard page.

    You can use the right to left arrow to move groups or attributes from the input connections list. Warehouse Builder removes the selection from the input group and removes the mapping line between the source operator and the current operator.

Input Attributes

Use the Input Attributes page to modify the attributes you selected on the Input Connections tab or wizard page.

You can perform the following tasks from the Match-Merge Input Attributes page:

  • Change attribute properties: You can change the attribute name, data type, length, precision and scale.

  • Add an optional description: Type a description for the input attributes.

  • Add attributes: Use the Add button to add input attributes that you did not previously select in the Input Connections tab or wizard page.

Merge Output

Use the Merge Output tab to specify the attributes for the output MERGE group. The MERGE group produces a consolidated record with the attributes you select.

Cross Reference Output

Use the Cross Reference Output page to optionally select attributes for the XREF group. Although the Match-Merge operator creates the XREF group by default, you have the option of adding attributes to the group or leaving it empty.

The XREF group is an optional group you can define to document the merge process. It enables you to create a foreign key relationship between the original data set and the new merged data set. You can send the attributes from the XREF group to a table that records the corresponding source row for each merged row.

Every row from INGRP1 corresponds to a row in the XREF group. To design the XREF group, select original attribute values and merged attributes from Source Attributes on the left. Warehouse Builder displays the merged attributes in Source Attributes with a default prefix of "MM_". Use Set Prefix at the bottom left corner of the page to change the prefix.

Match Bins

Use the Match Bins page to limit the set of possible matches to a manageable number. When Warehouse Builder matches rows, it compares each row only with the subsequent rows within the same grouping. This can greatly enhance performance, because Warehouse Builder searches for matches within groupings rather than throughout the entire data set.

While you want to define Match Bins that separate records into manageable groupings, you also want to avoid separating records that should be matched. The attributes you select for grouping like data depend on your data. For example, if you have a table of customer addresses with a million rows, you might group the data by partial street name, city name, and zip code.

Ideally, you should attempt to keep the number of records in each grouping under 2000. The number of comparisons Warehouse Builder must perform is based on the following formula:

n=(b*(b-1))/2 

where n is number of comparisons and b is number of records in a bin.

To match 5 records, Warehouse Builder must perform 10 comparisons. To match 50 records, Warehouse Builder must perform 1,225 comparisons. To match 500 records, Warehouse Builder must perform 124,750 comparisons.
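The quadratic growth is easy to verify. This short Python sketch (illustrative only, not code that Warehouse Builder generates) evaluates the formula above for several bin sizes:

```python
def comparisons(b):
    """Pairwise comparisons for a bin of b records: n = b*(b-1)/2."""
    return b * (b - 1) // 2

# Comparison counts grow quadratically with bin size.
for b in (5, 50, 500, 2000):
    print(b, comparisons(b))
# 5 10
# 50 1225
# 500 124750
# 2000 1999000
```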

Match Rules

You can define match rules for a single attribute or multiple attributes in the operator. On the Match Rules tab, create match rules at the top of the page. In the lower portion of Match Rules tab, specify the details for each match rule.

If you create more than one match rule, Warehouse Builder determines that two rows match if those rows satisfy any of the match rules. In other words, Warehouse Builder evaluates multiple match rules using OR logic. This is indicated on the Match Rules tab by the OR icon in the leftmost column. For more information, see "Understanding Matching Concepts".

Surrounding text describes OR_rule.gif.

Warehouse Builder assigns a position number and creates a default name such as MM_MA_0 for each match rule you add. You can edit a rule and drag it to a new position in the rule list. You can designate each match rule as active or passive; Warehouse Builder does not execute a passive rule unless you call it through a custom match rule. Assign one of the rule types listed in Table 8-7. When you select a rule type, the lower portion of the Match Rules tab activates and you can enter details for the match rule.

When you add details for conditional rule types, Warehouse Builder prompts you to add details for one or more attributes. When you add details for multiple attributes, Warehouse Builder displays the AND icon in the leftmost column. This indicates that Warehouse Builder matches rows only when all the condition details are satisfied.

Surrounding text describes AND_rule.gif.

Active Match Rules

Warehouse Builder executes a match rule if you designate it as active. If more than one match rule is active, each is evaluated in turn until a match is found or all rules have been evaluated.

Passive Match Rules

Warehouse Builder does not directly execute passive match rules. Instead, you can create custom match rules that are active and call passive match rules. For each match rule that you define, Warehouse Builder creates a corresponding function with the same name as the match rule. Custom rules can call the passive match rule using this function.

Custom Match Rule

Use this editor to create a custom comparison algorithm for the match rule. Select Edit to launch the Custom Match Rule Editor. Double-click or drag and drop the desired functions and parameters from the navigation tree on the left to the Implementation editor on the right. You can write a custom match rule that references active and passive match rules, functions, and parameters such as THIS_ and THAT_, which represent the two records from INGRP1 being compared.

You can also validate your custom rule by selecting Test and Validation from the Custom Match Rule Editor.

Merge Rules

Use the Merge Rules tab to select values for the attributes in the merged record.

On the Merge Rules tab, create merge rules at the top of the page. In the lower portion of the Merge Rules tab, specify the details for each merge rule.

Warehouse Builder assigns a position number and creates a default name such as MM_ME_0 for each merge rule you add. Warehouse Builder executes merge rules in the order of their position numbers. You can edit and drag the rule to a new position in the rule list. Assign one of the rule types listed in Table 8-6.

Table 8-6 Merge Rule Types

  • Any (single attribute): Select an attribute and Warehouse Builder assigns the first non-blank value for that attribute.

  • Any Record (multiple attributes): In the lower portion of the tab, select multiple attributes and Warehouse Builder assigns those attributes based on any matched row.

  • Copy (single attribute): Select an attribute and Warehouse Builder assigns the value of another merged attribute.

  • Custom (single attribute): Select an attribute. Warehouse Builder assigns a value based on PL/SQL code that you write. See Custom Merge Rule for details.

  • Custom Record (multiple attributes): In the lower portion of the tab, select multiple attributes. Warehouse Builder assigns values based on PL/SQL code that you write. See Custom Merge Rule for details.

  • Match Id (single attribute): Use this rule when you map the XREF group from one Match-Merge operator into the input of another Match-Merge operator. Select a sequence from the lower portion of the tab to generate the Match Id.

  • Min Max Record (multiple attributes): In the lower portion of the tab, select multiple attributes. Warehouse Builder assigns the merge attribute values from the matched record that holds the minimum or maximum value of the selected ordering attribute.

  • Min Max (single attribute): Select the first value out of the possible matches based on the order of another attribute.

  • Rank (single attribute): Warehouse Builder assigns values to records based on rank expressions you define. Select a record to use to populate a group of attributes. When you select the record, you have access to all attributes of all records in the matched record set.

  • Rank Record (multiple attributes): In the lower portion of the tab, select multiple attributes. Warehouse Builder assigns values based on rank expressions you define.

  • Sequence (single attribute): In the lower portion of the tab, select the sequence from the sequences defined in the project.


When you select a rule type, the lower portion of the Merge Rules tab activates and you can enter details for the merge rule.

When you define a merge rule, you can define one rule for all the attributes in the merged record or define a rule for each attribute. For instance, if the merged record is a customer record, it may have attributes such as ADDRESS1, ADDRESS2, CITY, STATE, and ZIP. You can write five rules, one for each attribute. Or, you can write one rule such that all five attributes come from the same matched row.

When you write a rule for a single attribute, specify the attribute in the Attribute list box at the top of the page. Then, if necessary, specify the details for the rule in the lower portion of the tab.

When you write a rule for multiple attributes, Warehouse Builder disables the Attribute list box at the top of the page and you must define the details in the lower portion of the tab.

Custom Merge Rule

Use this editor to create a custom merge rule that assigns a value based on PL/SQL code that you write and returns the attribute type you previously selected.

Select Edit to launch the Custom Merge Rule Editor. Double-click or drag and drop the desired functions and attributes from the navigation tree on the left to the Implementation editor on the right. You can write a custom merge rule that references source attributes, other merge attributes, and functions.

You can also validate your custom rule by selecting Test and Validation from the Custom Merge Rule Editor.

The following is an example of a custom merge rule for an attribute:

BEGIN
  RETURN M_MATCHES(1)."TAXID";
END;

The following is an example of a custom merge rule for a record:

BEGIN
  RETURN M_MATCHES(1);
END;

Understanding Matching Concepts

When you use Warehouse Builder to match records, you can define a single match rule or multiple match rules. If you create more than one match rule, Warehouse Builder determines two rows match if those rows satisfy any of the match rules. In other words, Warehouse Builder evaluates multiple match rules using OR logic. Table 8-7 lists the match rules you can specify.

Table 8-7 Match Rule Types

  • Address: Matches records based on postal addresses. Indicate the attributes that compose the address and assign input roles to each attribute. For details, see "Address Match Rule".

  • All Match: Matches all the rows within the Match Bin.

  • Conditional: Matches rows based on an algorithm you select. Warehouse Builder matches rows only when all the condition details are satisfied. For details, see "Conditional Match Rule".

  • Custom: Create a custom comparison algorithm. Select Edit to launch the Custom Match Rule Editor. For more information, see "Custom Match Rule".

  • Firm: Matches records based on business names. Indicate the attributes that compose the firm name and assign input roles to each attribute. For details, see "Firm Match Rule".

  • None Match: Specifies that no rows match within the Match Bin.

  • Person: Matches records based on people's names. Indicate the attributes that compose the person name and assign input roles to each attribute. For details, see "Person Match Rule".

  • Weight: Matches rows based on scores that you assign to attributes. Warehouse Builder compares each attribute using a similarity algorithm that returns a score between 0 and 100 to represent the similarity between the rows. For two rows to be considered a match, the total of the weighted scores must be greater than the overall score you designate. For details, see "Weight Match Rule".
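The Weight rule can be pictured with a short sketch. The following Python fragment is illustrative only, not Warehouse Builder internals; the exact-equality stand-in for the similarity algorithm and all names here are assumptions:

```python
def weighted_match(rec_a, rec_b, weights, required_score, similarity):
    """Records match when the weighted similarity total exceeds the
    overall score designated for the rule."""
    total = sum(w * similarity(rec_a[attr], rec_b[attr])
                for attr, w in weights.items())
    return total > required_score

# Crude stand-in for the similarity algorithm: 100 if equal, else 0.
exact = lambda x, y: 100 if x == y else 0

a = {"LastName": "Doe", "Zip": "22222"}
b = {"LastName": "Doe", "Zip": "22223"}

# LastName contributes 2 * 100; Zip contributes 1 * 0.
# The total of 200 exceeds the required score of 150, so the rows match.
print(weighted_match(a, b, {"LastName": 2, "Zip": 1}, 150, exact))  # True
```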


Example of Multiple Match Rules

The following example illustrates how Warehouse Builder evaluates multiple match rules using OR logic.

In the top portion of the Match Rules tab, create two match rules as described in Table 8-8:

Table 8-8 Two Match Rules

Name    Position  Rule Type    Usage   Description
Rule_1  1         Conditional  Active  Match SSN
Rule_2  2         Conditional  Active  Match LastName and PHN


In the lower portion of the tab, assign the details to Rule_1 as described in Table 8-9:

Table 8-9 Details for Rule_1

Attribute  Position  Algorithm  Similarity Score  Blank Matching
SSN        1         Exact      0                 Do not match if either is blank


For Rule_2, assign the details as described in Table 8-10:

Table 8-10 Details for Rule_2

Attribute  Position  Algorithm  Similarity Score  Blank Matching
LastName   1         Exact      0                 Do not match if either is blank
PHN        2         Exact      0                 Do not match if either is blank


Assume you have the data listed in Table 8-11:

Table 8-11 Example Data

Row  FirstName  LastName  PHN           SSN
A    John       Doe       650-123-1111  NULL
B    Jonathan   Doe       650-123-1111  555-55-5555
C    John       Dough     650-123-1111  555-55-5555


According to Rule_1, rows B and C match. According to Rule_2, rows A and B match. Therefore, because Warehouse Builder evaluates match rules using OR logic, all three records match.
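The OR evaluation above can be sketched in a few lines of illustrative Python (Warehouse Builder itself generates PL/SQL; this is not its code). The rows follow Table 8-11:

```python
# Rows from Table 8-11 (None stands for NULL).
a = {"LastName": "Doe",   "PHN": "650-123-1111", "SSN": None}
b = {"LastName": "Doe",   "PHN": "650-123-1111", "SSN": "555-55-5555"}
c = {"LastName": "Dough", "PHN": "650-123-1111", "SSN": "555-55-5555"}

def exact(x, y):
    """Exact algorithm with 'Do not match if either is blank'."""
    return x is not None and y is not None and x == y

def rule_1(r, s):                      # Match SSN
    return exact(r["SSN"], s["SSN"])

def rule_2(r, s):                      # Match LastName and PHN
    return exact(r["LastName"], s["LastName"]) and exact(r["PHN"], s["PHN"])

def matches(r, s):
    # Two rows match if ANY active rule is satisfied (OR logic).
    return any(rule(r, s) for rule in (rule_1, rule_2))

print(matches(b, c), matches(a, b), matches(a, c))  # True True False
```

Rows A and C satisfy neither rule directly; they end up in the same merged set only because each matches row B, which is the transitive behavior illustrated in the next example.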

Example of Transitive Matching

The general rule is: if A matches B, and B matches C, then A matches C. Assign a conditional match rule based on similarity, such as the one described in Table 8-12:

Table 8-12 Conditional Match Rule

Attribute  Position  Algorithm   Similarity Score  Blank Matching
LastName   1         Similarity  80                Do not match if either is blank


Assume you have the data listed in Table 8-13:

Table 8-13 Sample Data

Row  FirstName  LastName  PHN           SSN
A    John       Jones     650-123-1111  NULL
B    Jonathan   James     650-123-1111  555-55-5555
C    John       Jamos     650-123-1111  555-55-5555


Jones matches James with a similarity of 80, and James matches Jamos with a similarity of 80. Jones does not match Jamos directly, because their similarity is 60, which is less than 80. However, because Jones matches James, and James matches Jamos, all three records match.
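This transitive grouping behaves like a standard union-find (disjoint set) structure. The following Python fragment is illustrative only, not Warehouse Builder internals:

```python
# Transitive matching: if A matches B and B matches C, all three merge.
parent = {}

def find(x):
    """Return the representative of x's group, with path halving."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def union(x, y):
    """Record that x and y matched."""
    parent[find(x)] = find(y)

# Pairwise results of the similarity rule above: Jones~James and
# James~Jamos matched (similarity 80); Jones~Jamos (60) did not.
union("Jones", "James")
union("James", "Jamos")

groups = {name: find(name) for name in ("Jones", "James", "Jamos")}
print(len(set(groups.values())))  # 1 -- all three names merge into one group
```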

Address Match Rule

Use the Address match rule to match records based on postal addresses. Matching by address is most effective when you first correct the address data using the Name-Address operator before the Match-Merge operator. The Name-Address operator identifies addresses as existing in a postal matching database and designates the records with the Is Found flag. The Match-Merge operator processes addresses with the Is Found role faster because the data is known to be syntactically correct, legal, and existing.

To define an Address match rule, complete the following steps:

  1. On the Match Rules tab, select Address as the Rule Type.

    The Address Attributes tab and Details tab display at the bottom of the page.

  2. In the left panel of the Address Attributes tab, select the attribute that represents the primary address and click the left to right arrow button.

  3. Right-click Roles and designate that attribute as the Primary Address.

    You must perform this step. If you do not assign the Primary Address role, the match rule is ineffective and you cannot access the Details tab.

  4. Add other attributes and designate their roles as necessary. If you used the Name-Address operator to cleanse the address data, assign the Is Found role to the appropriate attribute. See Table 8-14 for the types of roles you can assign.

  5. Select the Details tab and select the applicable options as listed in Table 8-15.

Table 8-14 describes the Address Roles you can assign for the Address match rule.

Table 8-14 Address Roles

  • Primary Address: For the match rule to be valid, you must assign this role to one attribute. The primary address can be a street address, such as 100 Main Street, or a post office box, such as PO Box 100.

  • Unit Number: For addresses with matching primary addresses, the operator compares unit numbers such as suite, floor, or apartment numbers. When both unit numbers are blank, the operator considers them a match. If only one unit number is blank, it is not considered a match unless you select the Match on blank secondary address option.

  • PO_Box: The operator compares the post office box number, the number portion of the primary address when it represents a PO Box. When the primary address represents a street address, the PO Box number is blank.

  • Dual_primary_address: For addresses with matching primary addresses, the operator compares the Dual_primary_address values. A dual primary address contains both a street address and a post office box.

  • Dual_unit_number: Assign this role only if you also assign the Dual_primary_address role. The operator compares the Dual_unit_number in one record with the Unit_number and Dual_unit_number of another record. Unit numbers are considered a match if one or both are blank.

  • Dual_PO_Box: Assign this role only if you also assign the Dual_primary_address role. The operator compares the Dual_PO_Box in one record with the PO_Box and Dual_PO_Box of another record.

  • City: Assign this role only if you also assign the State role. The matching behavior for this role depends on whether you previously used the Name-Address operator to cleanse the address data. For uncorrected address data, the operator compares each City. For corrected addresses, the operator compares Cities only when the postal codes do not match. If both City and State match, the operator compares the address roles. Cities are considered a match if both are blank, but not if only one is blank.

  • State: Assign this role only if you also assign the City role. The matching behavior for this role depends on whether you previously used the Name-Address operator to cleanse the address data. For uncorrected address data, the operator compares each State. For corrected addresses, the operator compares States only when the postal codes do not match. If both City and State match, the operator compares the address roles. States are considered a match if both are blank, but not if only one is blank.

  • Postal_code: The matching behavior for this role depends on whether you previously used the Name-Address operator to cleanse the address data. For uncorrected address data, the operator does not use the Postal_code. For corrected addresses, the operator compares each Postal_code first. If the Postal_codes match, the operator compares the address roles. If the Postal_codes do not match, the operator compares City and State to determine whether it should compare address roles such as the Primary Address.

  • Is_found: Assign this role for address data that you previously cleansed and standardized with the Name-Address operator. The Name-Address operator marks records with the Is_found flag when it identifies the address as part of a country postal matching database.


Table 8-15 describes the options you can assign to Address Roles.

Table 8-15 Options for Address Roles

  • Allow differing secondary address: Addresses match despite different unit numbers.

  • Match on blank secondary address: Addresses match despite one blank unit number.

  • Match on either street or post office box: The operator matches records if either the street address or the post office box matches.

  • Address line similarity: The operator ignores all spaces and non-alphanumeric characters in the address lines and calculates the similarity. Records with a similarity greater than or equal to the score you assign are considered a match.

  • Last line similarity: The operator ignores all spaces and non-alphanumeric characters in City and State and calculates the similarity. Records with a similarity greater than or equal to the score you assign are considered a match.


Conditional Match Rule

Use the Conditional Match Rule to combine multiple attribute comparisons into one composite rule. When you assign multiple attributes for comparison, all the comparisons must be true for the records to be considered a match.

To define a Conditional match rule, complete the following steps:

  1. On the top portion of the Match Rules tab, select Conditional for the rule type.

    The operator displays a Details section at the bottom of the tab.

  2. Click Add to add and select an attribute.

  3. For Algorithm, select one of the options as listed in Table 8-16.

  4. If you select Similarity or Standardized Similarity, specify a similarity score.

  5. In Blank Matching, specify how the operator should handle blank values.

Table 8-16 Algorithms for Match Rules

Algorithm Description

Exact

The attributes match if their values are exactly the same. For example, "Dog" and "dog!" would not match because the second string is not capitalized and contains an extra character. For data types other than String, this is the only type of comparison allowed.

Standardized Exact

The operator standardizes the values of the attribute before comparing for an exact match. With standardization, the comparison ignores case, spaces, and non-alphanumeric characters. Using this algorithm, "Dog" and "dog!" do match.

Similarity

Enter a similarity score between 0-100. If the similarity of the two attributes is equal or greater to the score, then the attribute values are considered matched. The similarity algorithm computes the edit distance between two strings. Edit distance is the number of deletions, insertions, or substitutions required to transform one string into another. A similarity value of 100 indicates that the two values are identical. A value similarity of zero indicates no similarity whatsoever. For example, if the string "tootle" is compared with the string "tootles", then the edit distance is 1. The length of the string "tootles" is 7. The similarity value is therefore 6/7*100 or 85.

Standardized Similarity

The operator standardizes the values of the attribute before using the Similarity algorithm to determine a match. With standardization, the comparison ignores case, spaces, and non-alphanumeric characters.

Soundex

The operator converts the data to a Soundex representation and compares the text strings. If the Soundex representations match, then the two attribute values match.

Partial Name

The values of a string attribute are considered a match if the value of one attribute is contained within the other, starting with the first word. For example, "Midtown Power" would match "Midtown Power and Light," but would not match "Northern Midtown Power". The comparison ignores case and non-alphanumeric characters. Before attempting to match a partial name, this algorithm performs a Standardized Exact comparison on the entire string.

Abbreviation

The values of a string attribute are considered a match if one string contains words that are abbreviations of corresponding words in the other. Before attempting to find an abbreviation, the operator performs a Standardized Exact comparison on the entire string. The comparison ignores case and non-alphanumeric characters.

For each word, the operator looks for abbreviations as follows: if the longer of the two words contains all of the letters of the shorter word, and those letters appear in the same order as in the shorter word, then the words are considered a match. For example, "Intl. Business Products" would match "International Bus Prd".
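The letters-in-order rule is a subsequence check, which can be sketched as follows (illustrative names; the operator's actual matching rules are vendor-specific):

```python
import re

def is_abbreviation(short: str, long: str) -> bool:
    # True if every letter of the shorter word appears in the
    # longer word in the same order (a subsequence check).
    it = iter(long)
    return all(c in it for c in short)

def abbreviation_match(a: str, b: str) -> bool:
    # Standardize both strings into lists of alphanumeric words.
    wa = re.findall(r"[a-z0-9]+", a.lower())
    wb = re.findall(r"[a-z0-9]+", b.lower())
    if len(wa) != len(wb):
        return False
    # For each word pair, check the shorter against the longer.
    return all(is_abbreviation(*sorted(pair, key=len))
               for pair in zip(wa, wb))
```

For the example above, "intl" is a subsequence of "international", "bus" of "business", and "prd" of "products", so the two strings match.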

Acronym

The values of a string attribute are considered a match if one string is an acronym for the other. Before attempting to identify an acronym, the operator performs a Standardized Exact comparison on the entire string. If no match is found, then each word of one string is compared to the corresponding word in the other string. If the entire word does not match, each character of the word in one string is compared to the first character of each remaining word in the other string. If the characters are the same, the names are considered a match. For example, "Chase Manhattan Bank NA" matches "CMB North America". The comparison ignores case and non-alphanumeric characters.
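One way to sketch the word-by-word acronym expansion described above (illustrative only; the operator's exact rules are vendor-specific and this simplified version is permissive when one string ends early):

```python
import re

def words(value: str):
    return re.findall(r"[a-z0-9]+", value.lower())

def acronym_match(a: str, b: str) -> bool:
    # Walk both word lists; when words differ, try to expand a word in
    # one string as the first letters of the following words in the other.
    wa, wb = words(a), words(b)
    i = j = 0
    while i < len(wa) and j < len(wb):
        if wa[i] == wb[j]:
            i += 1; j += 1
            continue
        if wa[i] == "".join(w[0] for w in wb[j:j + len(wa[i])]):
            j += len(wa[i]); i += 1        # wa[i] is an acronym in wb
            continue
        if wb[j] == "".join(w[0] for w in wa[i:i + len(wb[j])]):
            i += len(wb[j]); j += 1        # wb[j] is an acronym in wa
            continue
        return False
    return i == len(wa) or j == len(wb)
```

For the example above, "cmb" expands to "Chase Manhattan Bank" and "na" to "North America", so the two strings match.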


Firm Match Rule

Use the Firm match rule to match records by business name. Matching by business name is most effective when you first correct the address data using the Name-Address operator before the Match-Merge operator.

To define a Firm match rule, complete the following steps:

  1. On the Match Rules tab, select Firm as the Rule Type.

    The Firm Attributes tab and Details tab display at the bottom of the page.

  2. In the left panel of the Firm Attributes tab, select the attribute that represents the firm name and click the left to right arrow button.

  3. Right-click Roles and designate that attribute as Firm 1.

    By default, the operator compares the values in Firm 1 for exact matches. You can change this default behavior by making selections on the Details tab.

  4. If necessary, add another attribute and designate its role as Firm 2.

    For the match rule to be valid, you must assign at least one attribute either as Firm 1 or Firm 2.

  5. Select the Details tab and select the applicable options.

    If you select Strip noise words, the operator ignores words in the business names such as "the" and "and". If you select Cross match firm 1 and firm 2, the operator compares business names in firm 1 with business names in firm 2.

    See Table 8-16 for descriptions of the remaining options on the Details tab.
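The effect of Strip noise words can be sketched as follows (the noise-word list here is illustrative only; the actual list is determined by the operator):

```python
NOISE_WORDS = {"the", "and", "of", "a", "an"}  # illustrative list only

def strip_noise(name: str) -> str:
    # Remove common noise words before the firm names are compared.
    return " ".join(w for w in name.lower().split() if w not in NOISE_WORDS)
```

For example, stripping noise words reduces "Midtown Power and Light" to "midtown power light" before the comparison runs.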

Person Match Rule

Use the Person match rule to match records based on names. Matching by names is most effective when you first correct the address data using the Name-Address operator before the Match-Merge operator.

To define a Person match rule, complete the following steps:

  1. On the Match Rules tab, select Name as the Rule Type.

    The Person Attributes tab and Details tab display at the bottom of the page.

  2. In the left panel of the Person Attributes tab, select the attribute that represents the last name and click the left to right arrow button.

  3. Right-click Roles and designate that attribute as the Last Name.

  4. Add other attributes and designate their roles as necessary. You must define either the Last Name or First Name Standardized for the match rule to be effective. See Table 8-17 for the types of roles you can assign.

  5. Select the Details tab and select the applicable options as listed in Table 8-18.

Table 8-17 describes the Name Roles you can assign for the Name match rule.

Table 8-17 Name Roles

Role Description

Prename

The operator compares prenames only if First Name Standardized is blank for one of the records, the "Mrs. Match" option is selected, and the Last Name and any Middle Name roles match. Given those criteria, the operator would match, for example, the record "Mrs. William Webster" with "Mrs. Webster".

First Name Standardized

First names match if both are blank. A blank first name will not match a non-blank first name unless the Prename role has been assigned and the "Mrs. Match" option is set.

Middle Name Standardized, Middle Name 2 Standardized, Middle Name 3 Standardized

The operator compares and cross compares any of the middle names assigned. By default, the middle names must match exactly. Middle names match if either or both are blank. To assign any of the middle name roles, you must also assign the First Name Standardized role.

Last Name

The operator assigns last names as matching if both are blank and not matching if only one is blank.

Maturity Post Name

The operator compares post names such as "Jr." and "Sr.". The values are assigned as matching if they are exactly the same or if either is blank.


Table 8-18 lists the options you can select from the Details tab in the Name match rule.

Table 8-18 Options for the Name Role

Option Description

Match on initials

The operator matches initials to names such as "R." and "Robert". You can select this option for first name and middle name roles.

Match on substrings

The operator matches substrings to names such as "Rob" to "Robert". You can select this option for first name and middle name roles.

Similarity

See Table 8-16.

Soundex

See Table 8-16.

Detect compound name

The operator matches compound names to names such as "De Anne" to "Deanne". You can select this option for the first name role.

"Mrs." Match

The operator matches prenames to first and last names such as "Mrs. Washington" to "George Washington". You can select this option for the prename role.

Match hyphenated names

The operator matches hyphenated last names to unhyphenated last names such as "Reese-Jones" to "Reese". You can select this option for the last name role.

Detect missing hyphen

The operator detects missing hyphens such as matching "Hillary Rodham Clinton" to "Hillary Rodham-Clinton". You can select this option for the last name role.

Detect switched name order

The operator detects switched name orders such as matching "Elmer Fudd" to "Fudd Elmer". You can select this option if you selected First Name and Last Name roles for attributes on the Person Attributes tab.


Weight Match Rule

Use this rule to match rows based on a weight value that you assign. A weighted match rule is most useful when you compare a large number of attributes and do not want a mismatch in any single attribute to prevent a match, as can happen with conditional rules.

To use the Weight match rule, complete the following steps:

  1. On the Match Rules tab, select Weight as the Rule Type.

    The Details tab displays at the bottom of the page.

  2. Select Add at the bottom of the page to add an attribute to the rule.

    In Maximum Score, assign a weight to each attribute you want to include in the comparison. Warehouse Builder compares each attribute using a similarity algorithm that returns a score between 0 and 100 to represent the similarity between the rows. A value of 100 indicates that the two values are identical. A value of zero indicates there is no similarity.

  3. In Required score to match, assign an overall score for the match.

    For two rows to be considered a match, the total score must equal or exceed the Required score you designate.

Example of a Weight Match Rule

Assume you want to apply the Weight match rule to the data in Table 8-19:

Table 8-19 Example Records for Matching

Record Number Attr_1 Attr_2

Rec_1

CA

QQ

Rec_2

CA

QQ

Rec_3

CA

QR


For Maximum score, you assign a value of 50 to both Attr_1 and Attr_2. You assign a value of 80 for the Required score to match. You can expect the following results:

  • Rec_1 is the driver record. The operator reads it first.

  • In Rec_2, the value for Attr_1 is CA. That value has a similarity of 100 with the value in the driver record, Rec_1. Since the weight value for Attr_1 is 50, its score is 50 (100% of 50).

  • In Rec_2, the value for Attr_2 is QQ and has a similarity of 100. The weight value for Attr_2 is also 50, so its score is 50 (100% of 50). The total weighted score is 100 (50 + 50). This equals or exceeds the Required score to match, so Rec_2 and Rec_1 are considered matched.

  • In Rec_3, Attr_1 is CA and has a similarity of 100 with Rec_1. Since the weight value for Attr_1 is 50, its weighted score is 50 (100% of 50).

  • In Rec_3, the value for Attr_2 is QR, which has a similarity of 50. The weight value for Attr_2 is 50, so its score is 25 (50% of 50). The total weighted score is 75 (50 + 25). This is less than the value of the Required score to match. Therefore, Rec_3 and Rec_1 do not match.
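The arithmetic in this example can be reproduced with a short sketch (the similarity function mirrors the edit-distance scoring described under "Similarity"; all names are illustrative, not Warehouse Builder internals):

```python
def similarity(a: str, b: str) -> float:
    # Edit-distance similarity on a 0-100 scale.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[-1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    longest = max(len(a), len(b))
    return 100.0 if longest == 0 else (1 - prev[-1] / longest) * 100

def weighted_score(driver, candidate, weights):
    # Each attribute contributes similarity% of its maximum score.
    return sum(w * similarity(d, c) / 100
               for d, c, w in zip(driver, candidate, weights))
```

Here weighted_score(("CA", "QQ"), ("CA", "QQ"), (50, 50)) gives 100, which meets the required score of 80, while weighted_score(("CA", "QQ"), ("CA", "QR"), (50, 50)) gives 75, which does not.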

Refining Data from Match-Merge Operators

Once you pass data through a Match-Merge operator, you may want to further refine the data. For example, when householding name and address data, you may need to merge the data first for addresses and then again for names. Assuming you map the MERGE output to a target table, you can map the XREF group to either a staging table or to another Match-Merge operator. Although mapping to a staging table is relatively easy to design, it can lead to significant loss of performance. Map the XREF group directly to another Match-Merge operator to avoid loss of performance.

Figure 8-19 shows a mapping that relies on a staging table between two Match-Merge operators MM and MM_1.

Figure 8-19 Householding Data: Mapping with Staging Table

Surrounding text describes Figure 8-19 .

Figure 8-20 shows a mapping that achieves the same results with better performance. The XREF group from MM is mapped directly to MM_1. For this mapping to be valid, you must assign the Match Id generated for the first XREF group as the Match Bin rule on the second Match-Merge operator.

Figure 8-20 Householding Data: XREF Group Mapped to Second Match-Merge Operator

Surrounding text describes Figure 8-20 .

Name and Address Operator

This section includes the following topics:

This section contains introductory material followed by instructions for using the operator. For advanced information on the Name and Address operator, see Chapter 20, "Data Quality: Name and Address Cleansing".

About the Name and Address Operator

Oracle Warehouse Builder enables you to perform name and address cleansing on your data with the Name and Address operator. The Name and Address operator identifies and corrects errors and inconsistencies in name and address source data by comparing input data to the data libraries supplied by the third-party name and address cleansing software vendors. You can purchase the data libraries directly from these vendors.

Note:

Taking advantage of the Name and Address operator requires separate licensing and additional installation steps. Refer to the Oracle Warehouse Builder Installation and Configuration Guide for more information.

The errors and inconsistencies corrected by the Name and Address operator include variations in address formats, use of abbreviations, misspellings, outdated information, inconsistent data, or transposed names. The operator fixes these errors and inconsistencies by:

  • Parsing, or separating, name and address input data into individual elements.

  • Standardizing name and address data, such as standardized versions of nicknames and business names or standard abbreviations of address components, as approved by the Postal Service of the appropriate country. Standardized versions of names and addresses facilitate matching, householding, and ultimately help you obtain a single view of your customer.

  • Correcting address information such as street names and city names. Filtering out incorrect or undeliverable addresses can lead to savings on marketing campaigns.

  • Augmenting names and addresses with additional data such as gender, ZIP+4, country code, apartment identification, or business and consumer identification. You can use this and other augmented address information, such as census geocoding, for marketing campaigns that are based on geographical location.

    Augmenting addresses with geographic information facilitates geography-specific marketing initiatives, such as marketing only to customers in large metropolitan areas (for example, within an n-mile radius of large cities) or only to customers served by a company's stores (within an x-mile radius of these stores). Oracle Spatial, an option with Oracle Database, and Oracle Locator, packaged with Oracle Database, are two products that you can use in conjunction with this feature.

Finally, the Name and Address operator enables you to generate postal reports for countries that support address correction and postal matching. Using postal reports in countries that support this feature often qualifies you for mailing discounts. For more information, see "Postal Reporting".

Example: Following a Record Through the Name and Address Operator

This example follows a record through a mapping using the Name and Address operator. This mapping also uses a Splitter operator to demonstrate a highly recommended data quality error handling technique. For more details about how data is processed by the Name and Address operator, see Chapter 20, "Data Quality: Name and Address Cleansing".

Example Input

In this example, your source data contains a Customer table with the row of data shown in Table 8-20.

Table 8-20 Sample Input to Name and Address Operator

Address Column Address Component

Name

Joe Smith

Street Address

8500 Normandale Lake Suite 710

City

Bloomington

ZIP Code

55437


This data contains a nickname, a last name, and part of a mailing address, but it lacks the customer's full name, complete street address, and the state in which he lives. The data also lacks geographic information such as latitude and longitude, which can be used for distance calculations for truckload shipping. In order to complete the name and address data, you can use the Name and Address operator.

Example Steps

This example uses a mapping with a Name and Address operator followed by a Splitter operator to cleanse name and address records and then load them into separate targets depending on whether they were successfully parsed. This section explains the general steps required to design such a mapping. For detailed information on each type of operator, refer to that operator's description in this chapter.

To make the listed changes to the sample record:

  1. In the Mapping Editor, begin by adding the following operators to the canvas:

    A CUSTOMERS table from which you extract the records. This is your data source. It contains the data in "Example Input".

    A Name and Address operator. This action launches the Name and Address Wizard. Complete it following the instructions in "Using the Name and Address Operator in a Mapping".

    A Splitter operator. For information on using this operator, see "Splitter Operator".

    Three target operators to which you load (respectively):

    The successfully parsed records

    The records with parsing errors

    The records whose addresses are parsed but not found in the postal matching software

  2. Map the attributes from the CUSTOMERS table to the Name and Address operator ingroup. Map the attributes from the Name and Address operator outgroup to the Splitter operator ingroup.

    You are not required to use the Splitter operator in conjunction with the Name and Address operator, but the combination highlights the Name and Address error handling capabilities.

  3. Define the split conditions for each of the outgroups in the Splitter operator and map the outgroups to the targets.

Figure 8-21 shows a mapping designed for this example. The data is mapped from the source table to the Name and Address operator, and then to the Splitter operator. The Splitter operator separates the successfully parsed records from those that have errors. The output from OUTGRP1 is mapped to the CUSTOMERS_GOOD target. The Split Condition for OUTGRP2 is shown at the bottom of the screen: records whose Is Parsed flag is False are loaded to the NOT_PARSED target. Records in the REMAINING_RECORDS group are successfully parsed, but their addresses are not found by the postal matching software. These records are loaded to the PARSED_NOT_FOUND target.

Figure 8-21 Name and Address Operator Used with a Splitter Operator in a Mapping

Surrounding text describes Figure 8-21 .

Example Output

If you run the mapping designed in this example, the Name and Address operator standardizes, corrects, and completes the address data from the source table. In this example, the target table contains the address data as shown in Table 8-21 (compare it with the input record from Table 8-20).

Table 8-21 Sample Output from Name and Address Operator

Address Column Address Component

First Name Standardized

JOSEPH

Last Name

SMITH

Primary Address

8500 NORMANDALE LAKE BLVD

Secondary Address

STE 710

City

BLOOMINGTON

State

MN

Postal Code

55437-3813

Latitude

44.849194

Longitude

-093.356352

Is Parsed

This field, indicating whether or not the record is parsed successfully, is added for error handling. Based on the value, the record will be loaded into the target operator for successfully parsed records, or the target operator for records with errors.

Is Good Name

This field is added for more detail on error handling.

Name Warning

This field is added to facilitate manual error correction.

Is Good Address

This field is added for more detail on error handling.

Is Found

This field, indicating whether or not the postal matching software found the address, can still be False for a successfully parsed address. It is added here to move those records to a separate target.

Street Warning

This field is added to facilitate manual error correction.

City Warning

This field is added to facilitate manual error correction.


In this example, the following changes were made to the input data:

  • Joe Smith was separated into two columns: First_Name_Standardized and Last_Name.

  • Joe was standardized into JOSEPH and Suite was standardized into STE.

  • Normandale Lake was corrected to Normandale Lake BLVD.

  • The first portion of the postal code, 55437, was augmented with the ZIP+4 code to read 55437-3813.

  • Latitude and longitude locations were added.

  • Flags were added to distinguish records that parsed successfully from those that did not, and a separate target was loaded with records that had errors.

Using the Name and Address Operator in a Mapping

You have the following options for using the Name and Address operator:

  • Define a new Name and Address operator: Drag the Name and Address operator from the Toolbox onto the mapping. The Mapping Editor launches a wizard.

  • Edit an existing Name and Address operator: Right-click the operator and select Edit. The Mapping Editor opens the Name and Address Editor.

Whether you are using the operator wizard or the Operator Editor, complete the following pages:

The pages are listed in a columnar table that reads down the columns left to right to conserve space.

General

Use the General page to specify a name and optional description for the operator. By default, the wizard names the Name and Address operator "NAMEADDR".

Definitions

Characterize the nature of your input data by assigning general definitions to this Name and Address operator.

Figure 8-22 shows the Definitions page containing sample values for the data described in "Example: Following a Record Through the Name and Address Operator". Here, the Parsing Type is set to 'Name and Address', Primary Country is set to 'United States', and Dual Address Assignment is set to 'P.O. Box'.

Figure 8-22 Name and Address Operator Definitions Page

Surrounding text describes Figure 8-22 .
Parsing Type

Select one of the following parsing types from the drop-down list:

Note:

You can only specify the Parsing Type when you first add the Name and Address operator to your mapping. When you edit a Name and Address operator in the operator editor, you cannot modify the Parsing Type.
  • Name Only: Select this when the input data contains only name data. Names can include both personal and business names. Selecting this option instead of the more generic Name and Address option increases performance and accuracy.

  • Address Only: Select this when the input data contains only address data and no name data. Selecting this option instead of the more generic Name and Address option increases performance and accuracy.

  • Name and Address: Select this when the input data contains both name and address data. If your input data contains only one or the other, select one of the other options for optimal performance and accuracy.

Primary Country

Select the country which best represents the country distribution of your data. The primary country is used by some providers of name and address cleansing software as a hint for the appropriate parser or parsing rules to use on the initial parse of the record. For other name and address service providers, external configuration of their installation controls this behavior.

Dual Address Assignment

A dual address contains both a Post Office (PO) box and a street address for the same address record. For records that have dual addresses, your selection determines which address becomes the normal address and which address becomes the dual address. A sample dual address is:

PO Box 2589 4439 Mormon Coulee Rd La Crosse WI 54601-8231

Note that your choice for Dual Address Assignment affects which postal codes are assigned during postal code correction, because the street address and PO box address may correspond to different postal codes.

  • Street Assignment: The street address is considered the normal address and the PO Box address is considered the dual address. This means that the address component is assigned the street address. In the preceding example, it is assigned 4439 MORMON COULEE RD. This will cause the postal code to be corrected to 54601-8220.

  • PO Box Assignment: The PO Box address is considered the normal address and the street address is considered the dual address. This means that the address component is assigned the Post Office (PO) box address. In the preceding example, it is assigned PO BOX 2589. This will cause the postal code to be corrected to 54602-2589.

  • Closest to Last Line: Whichever address occurs closest to the last line is considered the normal address; the other is considered the dual address. This means that the address component is assigned the address line closest to the last line. In the preceding example, it is assigned the street address, 4439 MORMON COULEE RD. This will cause the postal code to be corrected to 54601-8220.

This option has no effect for records having a single street or PO box address. Note that this option may not be supported by all name and address cleansing software providers.

Groups

By definition, the Name and Address operator has exactly one input group and one output group. You cannot add or delete groups in the Name and Address operator. By default, the input group is named INGRP1 and the output group OUTGRP1; you can edit these names. If your input data requires multiple groups, create a separate Name and Address operator for each group.

You assign attributes to the INGRP1 group on the Input Connections page and then edit those attributes on the Input Attributes page. You assign attributes to the OUTGRP1 group on the Output Attributes page.

Input Connections

Use the Input Connections page to select attributes from any operator in your mapping that you want to copy and map into the operator. The Available Attributes panel enables you to select attributes from any operator in your mapping. The Mapped Attributes panel represents the Name and Address operator. When you move attributes from the left panel to the right, you map them to the operator.

Figure 8-23 shows the Input Connections page containing sample values for the example described in "Example: Following a Record Through the Name and Address Operator". Notice that the CUSTOMERS table columns are mapped as the input attributes.

Figure 8-23 Name and Address Operator Input Connections Page

Surrounding text describes Figure 8-23 .

To complete the Input Connections page for an operator:

  1. Select complete groups or individual attributes from the Available Attributes panel. The Available Attributes panel enables you to select attributes from any operator in your mapping.

    To search for a specific attribute or group by name, type the text in Search for and click Go. To find the next match, click Go again.

    Hold the Shift key down to select multiple groups or attributes. If you want to select attributes from different groups, you must first combine the groups with a Joiner or Set operator.

  2. Use the left to right arrow button between the two panels to move your selections to the Mapped Attributes panel.

Input Attributes

Use the Input Attributes page to further modify the attributes you selected in the Input Connections page, and to assign input roles to each input attribute.

You can perform the following tasks from the Name and Address Input Attributes page:

  • Add attributes: Use the Add button to add input attributes.

  • Change attribute properties: You can change the following properties: attribute name, input role, length. Leave data type as VARCHAR2. You cannot change Precision or Scale. You must assign an Input Role for every input attribute.

    Input roles indicate what kind of name or address information resides in a line of data. For each attribute, select the Input Role that most closely matches the data contained in the source attribute. Refer to Table 20-1 for a complete list of input roles and their descriptions.

    You can select either non-discrete (line oriented) input roles for free-form data, or discrete roles (such as first name, primary address, or city) for specific input attributes. Whenever possible, choose discrete input roles (such as 'Person'), rather than non-discrete roles (such as 'Line1'). Discrete roles give the Name and Address operator more information about the content of the source attribute.

  • Add an optional description: Type a description for the input attributes.

Figure 8-24 shows the Input Attributes page containing sample values for the data described in "Example: Following a Record Through the Name and Address Operator". In this example, the source table has only one NAME column containing both the first and last names, such as Joe Smith, so this column is assigned the 'Person' input role. In the example, the source table also contains the entire street address portion of the mailing address in the STREETADDRESS column, so that column is assigned the 'Address' input role.

Figure 8-24 Name and Address Operator Input Attributes Page

Surrounding text describes Figure 8-24 .

Output Attributes

Use the Output Attributes page to define output attributes that determine how the Name and Address operator handles parsed data. Specifically, the output attribute properties characterize the data that is extracted from the parser output.

The output attribute collection is initially empty. You can create and edit attributes.

  • To create output attributes, select Add.

  • To edit an attribute name, click the appropriate cell and overwrite the default name.

  • You must specify an output component for every output attribute. To do this, click the ... button to the right of each attribute to open the Output Components dialog, where you can assign a component. See Table 20-2 for a complete list of the output components and descriptions of their functions.

  • You can also adjust the field length to match the length of the target attribute to which you intend to map the output attribute. Adjusting the length to match the target attribute helps avoid data truncation warnings during code generation, or errors during execution.

  • You cannot change the data type.

Figure 8-25 shows the Output Attributes page containing sample values for the data described in "Example: Following a Record Through the Name and Address Operator".

Figure 8-25 Name and Address Operator Output Attributes Page

Surrounding text describes Figure 8-25 .

Notice that every output attribute is assigned an output component. For example, the FirstName attribute is assigned the 'First Name Standardized' component. Notice also that 'Latitude' and 'Longitude' attributes are added to augment the address information.

Finally, notice that several error handling flags are added, such as Is Parsed, Is Good Name, and Is Good Address. These flags can be used with the Splitter operator to separate successfully parsed records from records with errors and load them into different targets.

Output Components

Use the Output Attributes components dialog to define components for the output attributes you create. Each output component represents a discrete name or address entity, such as a title, a standardized first name, a street number, a street name, or a street type, and indicates which component of a name or address an attribute constitutes. The component you select is assigned to the output attribute.

  • Output Component: Use the component tree on the left side of the Output Components dialog to expand the category housing your component, and then select the name or address component you want to assign to the attribute in question. You can select any node on the tree that is denoted by the envelope icon with a green border, even if that component expands to reveal other nodes. See Table 20-2, "Name and Address Operator Output Components" for a description of these components.

  • Address Type: Like Dual Address Assignment, this option may not be supported by all name and address cleansing software providers. It applies only to dual addresses, which contain both a street address and a Post Office (PO) box or route-box address. The Dual Address Assignment option you specified on the Definitions page determines which address (the street address or the PO box address) is used as the dual address. Select either the normal or dual address. For more information on dual addresses, see "Dual Address Assignment".

  • Instance: Specify which instance of an output component to use when there are multiple occurrences of the same attribute in a single record. The instance control applies to all name components and several address components, such as Miscellaneous Address and Complex. Instance enables you to extract numerous attributes of the same nature.

    For example, an input record containing John and Jane Doe would have two name occurrences: John Doe and Jane Doe. You can extract the first person with any name component by assigning Instance 1 to that component. Similarly, you can also extract the second person using any name component by assigning Instance 2. The number of instances allowed for various components depends on the vendor of the name and address cleansing software you use. Miscellaneous address may also have multiple instances, for example, if both an email address and phone number are present.

Figure 8-26 shows the Output Components page selecting the component for the first sample output attribute used in "Example: Following a Record Through the Name and Address Operator".

Figure 8-26 Name and Address Operator Output Attributes Components Dialog

Surrounding text describes Figure 8-26 .

In "Example: Following a Record Through the Name and Address Operator", the address type is Normal. The example calls for the following output components: First Name Standardized, Last Name, Primary Address, Secondary Address, City, State, Postal Code, Latitude, Longitude, Is Parsed, Is Good Name, Name Warning, Is Good Address, Is Found, Street Warning, and City Warning.

Postal Report

Postal reporting applies only to countries that support address correction and postal matching. Country certification varies with different vendors of name and address cleansing software. The most common country certifications are United States, Canada, and Australia. The process provides mailers with a common platform to measure the quality of address-matching software, focusing on the accuracy of postal codes (in the case of the United States, five-digit ZIP Codes and ZIP+4 Codes), delivery point codes, and carrier route codes applied to all mail. Some vendors of name and address cleansing software may ignore these parameters and require external setup for generating postal reports. For more information, see "Postal Reporting".

  • Postal Report: If you select Yes for Postal Report, the Primary Country you chose in the Definitions page determines the country for which the postal report is generated. Only one postal report can be active.

  • Processor Name: The use of this field varies with vendors of name and address cleansing software. Typically, the value supplied here appears on the United States CASS report.

  • List Name: The list name is an optional reference field that appears on the United States and United Kingdom reports under the List Name section, but is not output on other reports. The list name provides a reference for tracking multiple postal reports; for example, 'July 2003 Promotional Campaign'.

  • Processor Address Lines: The four address lines may appear on various postal reports. Various name and address cleansing software vendors use these fields differently. These lines often contain the full address of your company.

Figure 8-27 shows the Postal Report page containing sample values.

Figure 8-27 Name and Address Operator Postal Report Page

Surrounding text describes Figure 8-27 .

To exit the Name and Address Wizard, click Finish. To exit the Name and Address Editor, click OK.

Postal Reporting

You can specify a postal report using the Postal Report page in the Name and Address operator editor or wizard. The postal report is generated when the mapping containing the Name and Address operator is executed.

Postal reporting applies to only those countries that support address correction and postal matching. Those countries vary among vendors of name and address cleansing software. The most common certifications are for United States, Canada, and Australia.

The process provides mailers with a common platform to measure the quality of address-matching software, focusing on the accuracy of postal codes (in the case of the United States, five-digit ZIP Codes and ZIP+4 Codes), delivery point codes, and carrier route codes applied to all mail.

All address lists used to produce mailings for automation rates must be matched by postal report-certified software. Oracle Warehouse Builder Name and Address is built on name and address software and data supplied by third-party software vendors specializing in name and address cleansing. Certifications therefore depend on the vendor, and may include the following:

  • United States: Coding Accuracy Support System (CASS) certification with the United States Postal Service. The CASS report is a text file specified by the USPS and produced by Oracle Warehouse Builder Name and Address. To meet USPS requirements, the mailer must submit a CASS report in its original form to the USPS.

  • Canada: Software Evaluation and Recognition Program (SERP) certification with Canada Post. Customers who utilize Incentive Lettermail, Addressed Admail, or Publications Mail must meet the Address Accuracy Program requirements. Customers can obtain a Statement of Accuracy by comparing their database to Canada Post's address data.

  • Australia: Address Matching Approval System (AMAS®) certification with Australia Post. PreSort Letter Service prices require that customers use AMAS Approved Software and that unique Delivery Point Identifiers (DPIDs) be current against the latest version of the Postal Address File (PAF).

Accessing the Postal Report Files

To access the postal report, you must have access to the file system where the Name and Address Server resides. The reports are processed by the Name and Address Server and are written to the owb/bin/admin/reports folder, located in the Oracle home path specified during the Warehouse Builder Server-Side installation. For more information on installation parameters, see the Oracle Warehouse Builder Installation and Configuration Guide.

For each report, Warehouse Builder creates a unique file name using the Country Code, Group Name, and the date and time the file is created, for example: p_CAN_TESTGROUP1_20021219_0130.txt. This naming convention may not apply to all vendors of postal matching software, because in some cases, the file naming is controlled through external configuration of the vendor installation.

Postal Report Restrictions for International Data

A postal report-enabled mapping should process only data from the country specified as the Primary Country on the Definitions page of the Name and Address wizard. If your source contains international data and the source records contain country codes, map the Country Code source column to an input group attribute of the Name and Address operator, and assign the 'Country Code' input role to that attribute.

Pivot Operator

The pivot operator enables you to transform a single row of attributes into multiple rows. Use this operator in a mapping when you want to transform data that is contained across attributes instead of rows. This situation can arise when you extract data from non-relational data sources such as data in a crosstab format.

Example: Pivoting Sales Data

The external table SALES_DAT, shown in Figure 8-28, contains data from a flat file. There is a row for each sales representative and separate columns for each month. For more information on external tables, see "Using External Tables".

Figure 8-28 SALES_DAT

Surrounding text describes Figure 8-28 .

Table 8-22 shows a sample of the data after Warehouse Builder performs a pivot operation. The data that was formerly contained across multiple columns (M1, M2, M3...) is now contained in a single attribute (Monthly_Sales). A single ID row in SALES_DAT corresponds to 12 rows in pivoted data.

Table 8-22 Pivoted Data

REP    MONTH    MONTHLY_SALES    REGION
0675   Jan      10.5             4
0675   Feb      11.4             4
0675   Mar      9.5              4
0675   Apr      8.7              4
0675   May      7.4              4
0675   Jun      7.5              4
0675   Jul      7.8              4
0675   Aug      9.7              4
0675   Sep      NULL             4
0675   Oct      NULL             4
0675   Nov      NULL             4
0675   Dec      NULL             4


To perform the pivot transformation in this example, create a mapping like the one shown in Figure 8-29.

Figure 8-29 Pivot Operator in a Mapping

Surrounding text describes Figure 8-29 .

In this mapping, Warehouse Builder reads the data from the external table once, pivots the data, aggregates the data, and writes it to a target in set based mode. It is not necessary to load the data to a target directly after pivoting it. You can use the pivot operator in a series of operators before and after directing data into the target operator. You can place operators such as filter, joiner, and set operation before the pivot operator. Because the pivot operation in Warehouse Builder is not performed row by row, you can also execute the mapping in set based mode.
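Conceptually, a set-based pivot of SALES_DAT can be expressed as a UNION ALL of one SELECT per month. The following is only an illustrative sketch of the transformation, assuming SALES_DAT has monthly columns M1 through M12; it is not necessarily the exact code Warehouse Builder generates:

```sql
-- Illustrative only: pivot monthly columns into rows.
-- Assumes SALES_DAT has columns REP, REGION, M1..M12.
SELECT rep, 'Jan' AS month, m1 AS monthly_sales, region FROM sales_dat
UNION ALL
SELECT rep, 'Feb', m2, region FROM sales_dat
UNION ALL
SELECT rep, 'Mar', m3, region FROM sales_dat
-- ... one branch per remaining month ...
UNION ALL
SELECT rep, 'Dec', m12, region FROM sales_dat;
```

Each branch supplies one value of the row locator ('Jan', 'Feb', and so on), which is why a single source row yields twelve output rows.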

The Row Locator

In the pivot operator, the row locator is an output attribute that you create to correspond to the repeated set of data from the source. When you use the pivot operator, Warehouse Builder transforms a single input attribute into multiple rows and generates values for a row locator. In this example, since the source contains attributes for each month, you can create an output attribute named 'MONTH' and designate it as the row locator. Each row from SALES_DAT then yields 12 rows of pivoted data in the output.

Table 8-23 shows the data from the first row of SALES_DAT after Warehouse Builder pivots the data with 'MONTH' as the row locator.

Table 8-23 Pivoted Data

REP    MONTH    MONTHLY_SALES    REGION
0675   Jan      10.5             4
0675   Feb      11.4             4
0675   Mar      9.5              4
0675   Apr      8.7              4
0675   May      7.4              4
0675   Jun      7.5              4
0675   Jul      7.8              4
0675   Aug      9.7              4
0675   Sep      NULL             4
0675   Oct      NULL             4
0675   Nov      NULL             4
0675   Dec      NULL             4


Using the Pivot Operator

You have the following options for using a pivot operator:

  • Define a new pivot operator: Use the Pivot Wizard to add a new pivot operator to a mapping. Drag a pivot operator from the Toolbox onto the mapping. The Mapping Editor launches the Pivot Wizard.

  • Edit an existing pivot operator: Use the Pivot Editor to edit a pivot operator you previously created. Right click the operator and select Edit. The Mapping Editor opens the Pivot Editor.

Whether you are using the Pivot operator wizard or the Pivot Editor, complete the following pages:


General

Use the General page to specify a name and optional description for the pivot operator. By default, the wizard names the operator "Pivot".

Groups

Use the Groups page to specify one input and one output group.

In a pivot operator, the input group represents the data from the source that is contained across multiple attributes. The output group represents that data transformed into rows.

You can rename and add descriptions to the input and output groups. Since each pivot operator must have exactly one input and one output group, the wizard prevents you from adding or removing groups or changing group direction.

Input Connections

Use the Input Connections page to copy and map attributes into the pivot operator. The attributes you select become mapped to the pivot input group. The left side of the page displays a list of all the operators in the mapping. Figure 8-30 shows a group from the external table SALES_DAT selected as input for the pivot operator.

Figure 8-30 Pivot Operator Input Connections Page

Surrounding text describes Figure 8-30 .

To complete the Input Connections page for a pivot operator:

  1. Select complete groups or individual attributes from the left panel.

    To search for a specific attribute or group by name, type the text in Search for and select Go. To find the next match, select Go again.

    Press the Shift key to select multiple attributes. If you want to select attributes from different groups, you must first combine the groups with a Joiner or Set operator.

  2. Use the left to right arrow button in the middle of the page to move your selections to the right side of the wizard page.

    Use the right to left arrow to remove groups or attributes from the input connections list. Warehouse Builder removes the selection from the input group and removes the mapping line between the source operator and the pivot operator. Figure 8-31 shows a group from SALES_DAT copied and mapped into the PIVOTSALES operator.

    Figure 8-31 Attributes Copied and Mapped into Pivot In Group

    Surrounding text describes Figure 8-31 .

Input Attributes

Use the Input Attributes page to modify the attributes you selected in the Input connections tab or wizard page.

You can perform the following tasks from the pivot Input Attributes page:

  • Add attributes: Use the Add button to add input attributes.

  • Change attribute properties: You can change the attribute name, data type, length, precision, and scale.

  • Add an optional description: Type a description for the input attributes.

  • Designate attribute keys: As an option, use the Key check box to indicate an attribute that uniquely identifies the input group.

Output Attributes

Use the Output Attributes page to create the output attributes for the pivot operator. If you designated any input attributes as keys on the Input Attributes tab or wizard page, Warehouse Builder displays those input attributes as output attributes that you cannot edit or delete. Figure 8-32 displays the output attributes with MONTH selected as the row locator.

Figure 8-32 Pivot Output Attributes Page

Surrounding text describes Figure 8-32 .

You can perform the following tasks from the pivot Output Attributes Page:

  • Change attribute properties: Except for attributes you designated as keys on the previous tab or wizard page, you can change the attribute name, data type, length, precision, and scale.

  • Add an optional description: Type a description for the output attributes.

  • Designate a row locator: Although you are not required to designate a row locator for the pivot operator, it is recommended. When you identify the row locator on the Output Attributes page or tab, it is easier for you to match your output data to the input data.

    In the pivot operator, the row locator is an output attribute that corresponds to the repeated set of data from the source. For example, if the source data contains separate attributes for each month, create an output attribute 'MONTH' and designate it as the row locator.

Pivot Transform

Use the Pivot Transform page to write expressions for each output attribute.

By default, Warehouse Builder displays two rows. Use Add to specify how many rows of output you want from a single row in the source. For example, if your source contains an attribute for each quarter in a year, you can specify 4 rows of output for each row in the source. If the source data contains an attribute for each month in the year, you can specify 12 rows of output for each row in the source.

Figure 8-33 shows the Pivot Transform tab with the pivot expressions defined for a source with an attribute for each month.

Figure 8-33 Pivot Transform Page

Surrounding text describes Figure 8-33 .

Write pivot expressions based on the following types of output:

  • Row locator: Specify a name for each row where the name is a value you want to load into the table. For example, if the row locator is 'MONTH', type 'Jan' for the first row.

  • Pivoted output data: Select the appropriate expression from the list box. For example, for the row you define as 'Jan', select the expression that returns the set of values for January.

  • Attributes previously specified as keys: Warehouse Builder defines the expression for you.

  • Unnecessary data: If the Pivot Transform page contains data that you do not want as output, use the expression 'NULL'. Warehouse Builder outputs a repeated set of rows with no data for attributes you define as 'NULL'.

When using the wizard to create a new pivot operator, click Finish when you want to close the wizard. The Mapping Editor displays the operator you defined.

When using the Pivot Editor to edit an existing pivot operator, click OK when you have finished editing the operator. The Mapping Editor updates the operator with the changes you made.

Post-Mapping Process Operator

Use a Post-Mapping Process operator to define a procedure to be executed after running a mapping. For example, you can use a Post-Mapping Process operator to reenable and build indexes after a mapping completes successfully and loads data into the target.

The Post-Mapping Process operator calls a function or procedure that is defined in Warehouse Builder after the mapping is executed. The output parameter group provides the connection point for the returned value (if implemented through a function) and the output parameters of the function or procedure. There are no restrictions on the connections of these output attributes.

The Post-Mapping Process operator contains groups corresponding to the number and direction of the parameters associated with the selected PL/SQL procedure or function. This list of groups and attributes can only be modified through reconciliation.

You can only define one Post-Mapping Process operator for a mapping. If you want to run more than one procedure after a mapping, you must wrap the procedures into one procedure.

You can map constants, data generators, mapping input parameters, and output from a Pre-Mapping Process into a Post-Mapping Process operator. The Post-Mapping Process operator is not valid for an SQL*Loader mapping.
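For example, a procedure suitable for binding to a Post-Mapping Process operator might rebuild the indexes on a target table after loading. The procedure and table names below are hypothetical; this is a sketch, not code produced by Warehouse Builder:

```sql
-- Hypothetical post-mapping procedure: rebuild all indexes
-- on an assumed target table SALES_TARGET after the load.
CREATE OR REPLACE PROCEDURE rebuild_target_indexes IS
BEGIN
  FOR idx IN (SELECT index_name
                FROM user_indexes
               WHERE table_name = 'SALES_TARGET') LOOP
    EXECUTE IMMEDIATE
      'ALTER INDEX ' || idx.index_name || ' REBUILD';
  END LOOP;
END;
/
```

Because only one Post-Mapping Process operator is allowed per mapping, a wrapper procedure like this is also the natural place to chain several post-load actions.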

After you add a Post-Mapping Process operator to the Mapping Editor, use the operator properties dialog to specify run conditions in which to execute the process.

To use a Post-Mapping Process operator in a mapping:

  1. Drop a Post-Mapping Process operator onto the Mapping Editor canvas.

    Warehouse Builder displays the Add Mapping Transformation dialog.

  2. Use the Add Mapping Transformation dialog to select or create a transformation. For more information on how to use the Add Mapping Transformation dialog, see "Adding Bindable Operators".

  3. Connect the output attribute of a source operator to the input/output group of the Post-Mapping Process operator.

  4. Set the run conditions for the operator.

To set run conditions for a Post-Mapping Process operator:

  1. From the mapping canvas, right-click a Post-Mapping Process operator and select Operator Properties.

  2. Click Post-Mapping Process Run Condition and select one of the following run conditions:

    Always: The process runs regardless of errors from the mapping.

    On Success: The process runs only if the mapping completes without errors.

    On Error: The process runs only if the mapping completes with errors exceeding the number of allowed errors set for the mapping.

    On Warning: The process runs only if the mapping completes with errors, but with fewer errors than the number of allowed errors set for the mapping.

    If you select On Error or On Warning and the mapping runs in row based mode, you must verify the Maximum Number of Errors set for the mapping. To view the number of allowed errors, right-click the mapping in the navigation tree, select Configure, and expand Runtime Parameters.

Pre-Mapping Process Operator

Use a Pre-Mapping Process operator to define a procedure to be executed before running a mapping. For example, you can use a Pre-Mapping Process operator to truncate tables in a staging area before running a mapping that loads tables to that staging area. You can also use a Pre-Mapping Process operator to disable indexes before running a mapping that loads data to a target. You can then use a Post-Mapping Process operator to reenable and build the indexes after running the mapping that loads data to the target.
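A procedure suitable for binding to a Pre-Mapping Process operator could look like the following sketch, which truncates staging tables before the mapping reloads them. The procedure and table names are hypothetical:

```sql
-- Hypothetical pre-mapping procedure: empty the staging tables
-- before the mapping loads fresh data into them.
CREATE OR REPLACE PROCEDURE truncate_staging IS
BEGIN
  EXECUTE IMMEDIATE 'TRUNCATE TABLE stage_customers';
  EXECUTE IMMEDIATE 'TRUNCATE TABLE stage_orders';
END;
/
```

TRUNCATE is DDL and must be issued through EXECUTE IMMEDIATE from PL/SQL, which is why the dynamic form is used here.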

The Pre-Mapping Process operator calls a function or procedure whose metadata is defined in Warehouse Builder prior to executing a mapping. The output parameter group provides the connection point for the returned value (if implemented with a function) and the output parameters of the function or procedure. There are no restrictions on the connections of these output attributes.

When you drop a Pre-Mapping Process operator onto the Mapping Editor canvas, a dialog opens displaying the available libraries, categories, functions, and procedures. After you select a function or procedure from the tree, the operator displays with predefined input and output parameters.

The Pre-Mapping Process operator contains groups corresponding to the number and direction of the parameters associated with the selected PL/SQL procedure or function.

A mapping can only contain one Pre-Mapping Process operator. Only constants, mapping input parameters, and output from a Pre-Mapping Process can be mapped into a Post-Mapping Process operator.

After you add a Pre-Mapping Process operator to the Mapping Editor, use the operator property dialog to specify conditions in which to execute the mapping.

To use a Pre-Mapping Process operator in a mapping:

  1. Drop a Pre-Mapping Process operator onto the Mapping Editor canvas.

    The Add Mapping Transformation dialog displays.

  2. Use the Add Mapping Transformation dialog to select or create a transformation. For more information on how to use the Add Mapping Transformation dialog, see "Adding Bindable Operators".

  3. Connect the output attribute of the Pre-Mapping Process operator to the input group of a target operator.

  4. Set the run conditions for the operator.

To set run conditions for a mapping with a Pre-Mapping Process operator:

  1. In the mapping canvas, right-click a Pre-Mapping Process operator and select Operator Properties.

  2. Click Mapping Run Condition and select one of the following run conditions:

    Always: Warehouse Builder runs the mapping after the process completes, regardless of the errors.

    On Success: Warehouse Builder runs the mapping only if the process completes without errors.

    On Error: Warehouse Builder runs the mapping only if the process completes with errors.

Set Operation Operator

The Set Operation operator enables you to use the following set operations in a mapping:

By default, the Set Operation operator contains two input groups and one output group. You can add input groups by using the operator editor. Mapping attributes to a Set Operation input group creates corresponding attributes with the same name and data type in the Set Operation output group. The number of attributes in the output group matches the number of attributes in the input group containing the most number of attributes.

To use the Set Operation operator:

Corresponding attributes are determined by the order of the attributes within an input group. For example, attribute 1 in input group 1 corresponds to attribute 1 in input group 2.

You must apply the set operation in top-down order. The order of the input groups determines the execution order of the set operation. This order only affects the minus operation. For example, A minus B is not the same as B minus A. The order of the attributes within the first input group determines the structure of a set. For example, {empno, ename} is not the same as {ename, empno}.
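The positional matching and the significance of input group order are easiest to see in the equivalent SQL. The table names below are hypothetical; attributes are matched by position in the select list, not by name, and reversing the inputs of a MINUS asks a different question:

```sql
-- Rows in the current table that are not in the archive.
-- EMPNO matches EMPNO and ENAME matches ENAME by position.
SELECT empno, ename FROM emp_current
MINUS
SELECT empno, ename FROM emp_archive;

-- Reversing the input groups: rows in the archive
-- that are not in the current table.
SELECT empno, ename FROM emp_archive
MINUS
SELECT empno, ename FROM emp_current;
```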

To use the Set Operation operator in a mapping:

  1. Drag and drop a Set Operation operator onto the Mapping Editor canvas.

  2. Connect source attributes to the Set Operation operator groups.

  3. Right-click the operator header and select Operator Properties...

    The Set Operation properties window displays.

  4. Click the field to the right of the Set Operation property and select an operation from the drop-down list.

  5. Close the Set Operation properties window.

  6. Connect the Set Operation output group to a target input group.

Sorter Operator

You can produce a sorted row set using the Sorter operator. The Sorter operator enables you to specify which input attributes are sorted and whether the sorting is performed in ascending or descending order. Warehouse Builder sorts data by placing an ORDER BY clause in the code generated by the mapping.

The Sorter operator has one input/output group. You can use the Sorter operator to sort data from any relational database source. You can place any operator after the Sorter operator.
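The generated code simply appends an ORDER BY clause to the query. For example, sorting by region ascending and then by monthly sales descending corresponds to SQL of this general form (table and column names are illustrative):

```sql
SELECT rep, month, monthly_sales, region
  FROM pivoted_sales
 ORDER BY region ASC, monthly_sales DESC;
```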

The Sorter operator contains the following property:

To use the Sorter operator in a mapping:

  1. Drop the Sorter operator onto the Mapping Editor canvas.

  2. Connect a source operator group to the Sorter input/output group as shown in Figure 8-34.

    Figure 8-34 Mapping Editor with a Sorter Operator

    Surrounding text describes Figure 8-34 .
  3. Right-click the Sorter operator header and select Operator Properties.

    The Sorter properties window displays.

  4. Click the ... button in the Order By Clause field.

    The Order By Clause dialog displays.

  5. Select the attributes you want to sort.

    Select an attribute from the Available Attributes list and click the right arrow button. Or, click the double right arrow button to select all of the Available Attributes.

  6. Apply an ORDER BY clause to the attribute.

    Select the attribute in the ORDER BY Attributes list and select ASC (ascending) or DESC (descending) from the drop-down list.

  7. Click OK.

Splitter Operator

You can use the Splitter operator to split data from one source to several targets. The operator splits a single input row set into several output row sets using a Boolean split condition. Each output row set has a cardinality less than or equal to the input cardinality.

You can configure Warehouse Builder mappings that split data from one source to multiple targets to take advantage of Oracle9i functionality that optimizes SQL code and improves performance. For more information, see "Example: Creating Mappings with Multiple Targets".

The Splitter operator creates an output group called REMAINING_ROWS containing all input rows not included in any of the other output groups. You can delete this output group, but you cannot edit it.

The Splitter Operator contains the following properties:

To use the Splitter operator in a mapping:

  1. Drag and drop the Splitter operator onto the Mapping Editor canvas.

  2. Connect a group from a source operator to the input group of the Splitter operator.

    The output attributes are created with data types matching the corresponding input data types.

  3. Right-click the Splitter operator header and select Operator Properties.

    The Splitter properties window displays as shown in Figure 8-35.

    Figure 8-35 Group Properties Window for a Split Condition

    Surrounding text describes Figure 8-35 .
  4. Enter an expression in the Split Condition field. Or click ... to define an expression using the Expression Builder as shown in Figure 8-36.

    Figure 8-36 Expression Builder Showing A Split Condition

    Surrounding text describes Figure 8-36 .
  5. Close the Splitter Properties window.

  6. Define expressions for each of the output groups except for the REMAINING ROWS group.

  7. Connect the output groups to the targets.

    Figure 8-37 Mapping with a Single Source and Multiple Targets

    Surrounding text describes Figure 8-37 .

Example: Creating Mappings with Multiple Targets

When you design and configure a mapping with the Splitter operator, Warehouse Builder generates a multi_table_insert statement. This SQL statement takes advantage of parallel query and parallel DML services in the Oracle9i database server.
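A mapping with a Splitter and multiple targets generates code of the following general shape, where each WHEN clause corresponds to one output group's split condition. The table, column, and condition names here are illustrative, not the exact generated output:

```sql
-- Illustrative multi-table insert for a Splitter mapping:
-- each WHEN clause mirrors one output group's split condition,
-- and ELSE plays the role of the REMAINING_ROWS group.
INSERT FIRST
  WHEN region = 'EAST' THEN
    INTO east_sales (rep, month, monthly_sales)
    VALUES (rep, month, monthly_sales)
  WHEN region = 'WEST' THEN
    INTO west_sales (rep, month, monthly_sales)
    VALUES (rep, month, monthly_sales)
  ELSE
    INTO other_sales (rep, month, monthly_sales)
    VALUES (rep, month, monthly_sales)
SELECT rep, month, monthly_sales, region FROM sales_source;
```

Because the source is scanned once for all targets, this form can exploit the parallel query and parallel DML services mentioned above.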

To create a mapping with multiple targets:

  1. Configure an Oracle target module that contains the mapping to validate and generate Oracle9i SQL.

    From Warehouse Builder, right-click the target module on the navigation tree and select Configuration Properties. Under Target Database Type, select Oracle9i.

  2. In the Mapping Editor, design a mapping with a single source, a Splitter operator, and multiple targets.

    The targets must be tables, not views or materialized views. Each target table must have fewer than 999 columns. Between the Splitter operator and the targets, do not include any operators that change the cardinality. For example, you can place a Filter between the Splitter and the targets as shown in Figure 8-38, but not a Joiner or Aggregator operator.

    Figure 8-38 Example Mapping with Multiple Targets

    Surrounding text describes Figure 8-38 .
  3. From the Warehouse Builder console, select the mapping from the navigation tree, select Object from the menu bar, and select Configure. You can also right-click the mapping you want to configure and select Configure.

    Warehouse Builder displays the configuration properties dialog for a mapping as shown in Figure 8-39.

    Figure 8-39 Configuration Properties Window for Mapping Tables

    Surrounding text describes Figure 8-39 .
  4. Expand Runtime Parameters Reference and set Default Operating Mode to set based.

  5. Expand Code Generation Options Reference and set Optimize Code to true.

    When you run this mapping and view the generation results, Warehouse Builder returns one total SELECT and INSERT count for all targets.

Table Function Operator

Table function operators enable you to manipulate a set of input rows and return another set of rows, possibly of a different cardinality. Unlike conventional functions, table functions can return a set of output rows that can be queried like a physical table.

Using table functions can greatly improve performance when loading your data warehouse.

Table Functions have the following characteristics:

Prerequisites for Using the Table Function Operator

Before you can use the Mapping Table Function operator in a mapping, you need to create the table function in your target, external to Warehouse Builder. The table functions in the database that are supported by the unbound table function operator must meet the following requirements:

Input

  • Ref Cursor returning PLS Record: the fields of the PLS Record must be scalar data types supported by Warehouse Builder (0..n).

  • There must be at least one input parameter.

Output

TABLE OF:

  • PLS Record (the fields of the PLS Record should be scalar data types supported by Warehouse Builder).

  • Object Type (the attributes of the Object Type should be scalar data types supported by Warehouse Builder).

  • Scalar data types supported by Warehouse Builder.

  • ROWTYPE

For an unbound Mapping Table Function operator in a mapping:

  • You must add one parameter group for each ref cursor type parameter.

  • Multiple scalar parameters can be part of a single scalar type parameter group.

  • The parameter groups and the parameters in a group can be entered in any order.

  • The positioning of the parameters in the mapping table function operator must be the same as the positioning of the parameters in the table function created in your target warehouse.
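A table function meeting these requirements might look like the following sketch: a ref cursor input parameter, a scalar input parameter, and a return of TABLE OF an object type whose attributes are scalar. All type and function names are hypothetical, and the PIPELINED clause is one common way to implement such a function, not a Warehouse Builder requirement:

```sql
-- Hypothetical table function conforming to the requirements above.
CREATE TYPE sales_row_t AS OBJECT (rep VARCHAR2(10), amount NUMBER);
/
CREATE TYPE sales_tab_t AS TABLE OF sales_row_t;
/
CREATE OR REPLACE FUNCTION scale_sales (
  src    SYS_REFCURSOR,   -- ref cursor input parameter
  factor NUMBER           -- scalar input parameter
) RETURN sales_tab_t PIPELINED IS
  v_rep    VARCHAR2(10);
  v_amount NUMBER;
BEGIN
  LOOP
    FETCH src INTO v_rep, v_amount;
    EXIT WHEN src%NOTFOUND;
    PIPE ROW (sales_row_t(v_rep, v_amount * factor));
  END LOOP;
  CLOSE src;
  RETURN;
END;
/
```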

Table Function Operator Properties

The Mapping Table Function operator contains the following properties:

General

General properties include the name and description. Specify the name of the table function located in the target database. The description is optional.

Input Parameter Group

The table function operator accepts the following types of input parameters:

  • Input Parameter Type: Valid input parameter types are REF_CURSOR_TYPE or SCALAR_TYPE.

  • REF_CURSOR_TYPE: Returns a PLS Record {0...N}. The fields of the PLS Record must be a scalar data type supported by Warehouse Builder.

  • SCALAR_TYPE: Scalar data types supported by Warehouse Builder.

  • Parameter Position: Indicates the position of the parameter in the table function signature corresponding to this parameter group.

Input Parameter

  • Parameter Position: The position of the parameter in the table function signature. This property is only applicable to scalar parameters.

Output Parameter Group

  • Return Table of Scalar: This property specifies whether the return of the table function is a TABLE of SCALAR or not. This information is required because the select list item for a TABLE of SCALAR return must be COLUMN_VALUE, while in other cases it should be an appropriate name.

Output Parameter

  • Type Attribute Name: The name of the field of the PLS Record, attribute of the Object Type, or column of the ROWTYPE. This property is not applicable if the return type is TABLE of SCALAR. This name is used to invoke the table function.

To use a Table Function operator in a mapping:

  1. Drag and drop a Mapping Table Function operator onto the Mapping Editor canvas.

  2. Connect the appropriate source attributes to the input group of the mapping table function operator.

    This automatically creates the input attributes.

  3. Right-click the Table Function operator and select Edit.

  4. From the Groups tab, select Add to add an output group.

Before you deploy the mapping containing the mapping table function operator, you must manually create the table function in the target warehouse. The mapping table function operator is bound to the actual table function object through the code generated by the mapping.

Transformation Operator

You use the Mapping Transformation operator to transform the column value data of rows within a row set using a PL/SQL function, while preserving the cardinality of the input row set.

The Mapping Transformation operator must be bound to a function or procedure contained by one of the modules in the repository. The inputs and outputs of the Mapping Transformation operator correspond to the input and output parameters of the bound repository function or procedure. If the Mapping Transformation operator is bound to a function, a result output is added to the operator that corresponds to the result of the function. The bound function or procedure must be generated and deployed before the mapping can be deployed, unless the function or procedure already exists in the target system.

Warehouse Builder provides pre-defined PL/SQL library functions in the runtime schema that can be selected as a bound function when adding a Mapping Transformation operator onto a mapping. In addition, you can choose a function or procedure from the Global Shared Library.
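As a hedged illustration of the kind of repository function the operator can be bound to, consider the following sketch. The function name and table are hypothetical; the point is that the function's parameters and return value become the operator's input and output attributes:

```sql
-- Hypothetical transformation function; the operator's input attribute
-- maps to p_text and the result output maps to the return value.
CREATE OR REPLACE FUNCTION trim_upper (p_text IN VARCHAR2)
  RETURN VARCHAR2
IS
BEGIN
  RETURN UPPER(TRIM(p_text));
END;
/
-- In generated mapping code the function is applied row by row,
-- preserving the cardinality of the input row set:
SELECT trim_upper(cust_name) AS cust_name_clean
FROM   customers;
```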

The Mapping Transformation operator contains the following properties:

To use a Mapping Transformation operator in a mapping:

  1. Drag and drop a Mapping Transformation operator onto the Mapping Editor canvas.

    The Add Mapping Transformation dialog displays.

  2. Use the Add Mapping Transformation dialog to create a new transformation or select one or more existing transformations. Warehouse Builder adds a transformation operator for each transformation you select. For more information on these options, see "Adding Bindable Operators".

  3. Connect the source attributes to the inputs of the Mapping Transformation operator.

  4. (Optional step) Right-click one of the inputs and select Attribute Properties.

    The Mapping Transformation properties window displays.

  5. Select an input attribute. If the Procedure property is set to True, then do not connect the input parameter.

  6. Close the attribute property window.

  7. Connect the Transformation operator output attributes to the target attributes.

Unpivot Operator

The unpivot operator converts multiple input rows into one output row. The unpivot operator enables you to extract from a source once and produce one row from a set of source rows that are grouped by attributes in the source data. Like the pivot operator, the unpivot operator can be placed anywhere in a mapping.

Example: Unpivoting Sales Data

Table 8-24 shows a sample of data from the SALES relational table. In the crosstab format, the 'MONTH' column has 12 possible character values, one for each month of the year. All sales figures are contained in one column, 'MONTHLY_SALES'.

Table 8-24 Data in a Crosstab Format

REP    MONTH   MONTHLY_SALES   REGION
0675   Jan     10.5            4
0676   Jan     9.5             3
0679   Jan     8.7             3
0675   Feb     11.4            4
0676   Feb     10.5            3
0679   Feb     7.4             3
0675   Mar     9.5             4
0676   Mar     10.3            3
0679   Mar     7.5             3
0675   Apr     8.7             4
0676   Apr     7.6             3
0679   Apr     7.8             3


Figure 8-40 depicts data from the relational table 'SALES' after Warehouse Builder unpivoted the table. The data formerly contained in the 'MONTH' column (Jan, Feb, Mar...) corresponds to 12 separate attributes (M1, M2, M3...). The sales figures formerly contained in 'MONTHLY_SALES' are now distributed across the 12 attributes, one for each month.

Figure 8-40 Data Unpivoted from Crosstab Format

Surrounding text describes Figure 8-40 .

The Row Locator

When you use the unpivot operator, Warehouse Builder transforms multiple input rows into a single row based on the row locator. In the unpivot operator, the row locator is an attribute that you must select from the source to correspond with a set of output attributes that you define. A row locator is required in an unpivot operator. In this example, the row locator is 'MONTH' from the 'SALES' table and it corresponds to attributes M1, M2, M3... M12 in the unpivoted output.

Using the Unpivot Operator

You have the following options for using an unpivot operator:

  • Define a new unpivot operator: Drag an unpivot operator from the Toolbox onto the mapping. The Mapping Editor launches the Unpivot operator wizard.

  • Edit an existing unpivot operator: Right-click the operator and select Edit. The Mapping Editor opens the Unpivot Editor.

Whether you are using the Unpivot operator wizard or the Unpivot Editor, complete the following pages:

The pages are listed in a columnar table that reads down the columns left to right to conserve space.

General

Use the General page to specify a name and optional description for the unpivot operator. By default, the wizard names the operator "Unpivot."

Groups

Use the Groups page to specify one input and one output group.

In an unpivot operator, the input group represents the source data in crosstab format. The output group represents the target data distributed across multiple attributes.

You can rename and add descriptions to the input and output groups. Since each unpivot operator must have exactly one input and one output group, the wizard prevents you from adding or removing groups or changing group direction.

Input Connections

Use the Input Connections page to select attributes to copy and map into the unpivot operator.

To complete the Input connections page for an unpivot operator:

  1. Select complete groups or individual attributes from the left panel.

    To search for a specific attribute or group by name, type the text in Search for and click Go. To find the next match, click Go again.

    Hold the Shift key down to select multiple groups or attributes. If you want to select attributes from different groups, you must first combine the groups with a Joiner or Set operator.

  2. Use the left to right arrow button in the middle of the page to move your selections to the right side of the wizard page.

    You can use the right to left arrow to remove groups or attributes from the input connections list. Warehouse Builder removes the selection from the input group and removes the mapping line between the source operator and the unpivot operator.

Input Attributes

Use the Input Attributes page to modify the attributes you selected in the Input connections tab or wizard page.

You can perform the following tasks from the Unpivot Input Attributes Page:

  • Add attributes: Use the Add button to add input attributes.

  • Change attribute properties: You can change the attribute name, data type, length, precision and scale.

  • Add an optional description: Type a description for the input attributes.

  • Designate key attribute(s): You must designate one or more key attributes for unpivot operators. Use the Key check box to indicate the attribute(s) that uniquely identifies the input group. Input rows with the same value in their key attribute(s) produce one unpivoted output row.

Row Locator

Use the Row locator page to select a row locator and assign values to the distinct values contained in the row locator. Figure 8-41 shows the attribute MONTH selected as the row locator with values such as 'Jan', 'Feb', or 'Mar'.

Figure 8-41 Unpivot Row Indicator Page

Surrounding text describes Figure 8-41 .

To complete the Unpivot Row Locator page:

  1. Select an attribute from the Row locator list box.

    In the unpivot operator, the row locator is the attribute from the source data that corresponds to a set of output attributes.

  2. Use Add to specify the number of distinct values that exist in the row locator.

  3. For each row locator value, type in the value as it appears in your source dataset.

    For string values, enclose the text in single quotes. Type each value exactly as it appears in the source dataset. For example, if the row locator is 'MONTH', there are 12 distinct values for that attribute; click Add to add a row for each one. The row locator values shown in Table 8-24 are 'Jan', 'Feb', 'Mar', and 'Apr'.

Output Attributes

Use the Output Attributes page shown in Figure 8-42 to create the output attributes for the unpivot operator.

Figure 8-42 Unpivot Output Attributes Page

Surrounding text describes Figure 8-42 .

If you designated any input attributes as keys on the Input Attributes tab or wizard page, Warehouse Builder displays those input attributes as output attributes that you cannot edit or remove.

You can perform the following tasks from the Unpivot Output Attributes page:

  • Add attributes: Use Add to increase the number of output attributes to accommodate the rows you specified on the Row locator tab or wizard page. If you specified 12 rows, specify 12 output attributes, plus one output attribute for each input attribute that you did not designate as a key.

  • Change attribute properties: Except for attributes you designated as keys on the Input Attributes tab or wizard page, you can change the attribute name, data type, length, precision, and scale.

  • Add an optional description: Type a description for the output attributes.

Unpivot Transform

Use the Unpivot Transform page shown in Figure 8-43 to write expressions for each output attribute.

Figure 8-43 Unpivot Transform Page

Surrounding text describes Figure 8-43 .

For attributes you designated as keys, Warehouse Builder defines the matching row and expression for you. Warehouse Builder displays the first row as the match for a key attribute. For all other output attributes, specify the matching row and the expression.

  • Matching row: Select the appropriate option from the list box. For example, for the attribute you define as the first month of the year, 'M1', select 'Jan' from the list box.

  • Expression: Select the appropriate expression from the list box. For all the new attributes you created to unpivot the data, select the same input attribute that contains the corresponding data. For example, the unpivot attributes M1, M2, M3... M12 would all share the same expression, INGRP1.MONTHLY_SALES. For all other output attributes, select the corresponding attribute from the list of input attributes.
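Conceptually, the matching rows and expressions you specify combine into a grouped query. The following is a hedged sketch of that logic using the SALES data from Table 8-24, not the exact code Warehouse Builder generates; REP is the key attribute, MONTH is the row locator, and each CASE arm pairs a matching row value ('Jan', 'Feb', ...) with the shared expression (MONTHLY_SALES):

```sql
-- Input rows sharing the same key (REP) collapse into one output row;
-- each output attribute picks the MONTHLY_SALES value whose MONTH
-- matches its designated row locator value.
SELECT rep,
       MAX(CASE WHEN month = 'Jan' THEN monthly_sales END) AS m1,
       MAX(CASE WHEN month = 'Feb' THEN monthly_sales END) AS m2,
       MAX(CASE WHEN month = 'Mar' THEN monthly_sales END) AS m3
       -- ... M4 through M12 follow the same pattern
FROM   sales
GROUP  BY rep;
```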