EPL Reference

     Previous  Next    Open TOC in new window    View as PDF - New Window  Get Adobe Reader - New Window
Content starts here

EPL Reference: Clauses

This section provides information on the following topics:

 


Overview of the Clauses You Can Use in an EPL Statement

The top-level BNF for the event processing language (EPL) is as follows:

  [ INSERT INTO insert_into_def ]
SELECT select_list
{ FROM stream_source_list / MATCHING pattern_expression }
[ WHERE search_conditions ]
[ GROUP BY grouping_expression_list ]
[ HAVING grouping_search_conditions ]
[ ORDER BY order_by_expression_list ]
[ OUTPUT output_specification ]

Literal keywords are not case sensitive. Each clause is detailed in the following sections. For information on the built-in operators and functions, see EPL Reference: Operators, and EPL Reference: Functions.

 


SELECT

The SELECT clause is required in all EPL statements. The SELECT clause can be used to select all properties using the wildcard *, or to specify a list of event properties and expressions. The SELECT clause defines the event type (event property names and types) of the resulting events published by the statement, or pulled from the statement.

The SELECT clause also offers optional ISTREAM and RSTREAM keywords to control how events are posted to update listeners attached to the statement.

The syntax for the SELECT clause is summarized below.

  SELECT [RSTREAM | ISTREAM] ( expression_list | * )

The following examples use the FROM clause which defines the sources of the event data. The FROM clause is described in FROM.

Choosing Specific Event Properties

To choose the particular event properties to return:

  SELECT event_property [, event_property] [, ...] 
FROM stream_def

The following statement selects the count and standard deviation of the volume for the last 100 stock tick events.

  SELECT COUNT, STDDEV(volume) 
FROM StockTick RETAIN 100 EVENTS

Using Expressions

The SELECT clause can contain one or more expressions.

  SELECT expression [, expression] [, ...] 
FROM stream_def

The following statement selects the volume multiplied by price for a time batch of the last 30 seconds of stock tick events.

  SELECT volume * price 
FROM StockTick RETAIN BATCH OF 30 SECONDS

Aliasing Event Properties

Event properties and expressions can be aliased using below syntax.

  SELECT [event_property | expression] AS identifier [,…]

The following statement selects volume multiplied by price and specifies the name volPrice for the event property.

  SELECT volume * price AS volPrice 
FROM StockTick RETAIN 100 EVENTS

Choosing All Event Properties

The syntax for selecting all event properties in a stream is:

  SELECT * 
FROM stream_def

The following statement selects all of the StockTick event properties for the last 30 seconds:

  SELECT * 
FROM StockTick RETAIN 30 SECONDS

In a join statement, using the SELECT * syntax selects event properties that contain the events representing the joined streams themselves.

The * wildcard and expressions can also be combined in a SELECT clause. The combination selects all event properties and in addition the computed values as specified by any additional expressions that are part of the SELECT clause. Here is an example that selects all properties of stock tick events plus a computed product of price and volume that the statement names pricevolume:

  SELECT *, price * volume AS pricevolume 
FROM StockTick RETAIN ALL

Selecting New and Old Events With ISTREAM and RSTREAM Keywords

The optional ISTREAM and RSTREAM keywords in the SELECT clause define the event stream posted to update listeners to the statement. If neither keyword is specified, the engine posts both insert and remove stream events to statement listeners. The insert stream consists of the events entering the respective window(s) or stream(s) or aggregations, while the remove stream consists of the events leaving the respective window(s) or the changed aggregation result. Insert and remove events are explained in more detail in Event Sinks.

By specifying the ISTREAM keyword you can instruct the engine to only post insert stream events to update listeners. The engine will then not post any remove stream events. By specifying the RSTREAM keyword you can instruct the engine to only post remove stream events to update listeners. The engine will then not post any insert stream events.

The following statement selects only the events that are leaving the 30 second time window.

  SELECT RSTREAM * 
FROM StockTick RETAIN 30 SECONDS

The ISTREAM and RSTREAM keywords in the SELECT clause are matched by same-name keywords available in the INSERT INTO clause as explained in INSERT INTO. While the keywords in the SELECT clause control the event stream posted to update listeners to the statement, the same keywords in the insert into clause specify the event stream that the engine makes available to other statements.

 


FROM

Either the FROM or the MATCHING clause is required in all EPL statements. The FROM clause specifies one or more event streams as the source of the event data. The MATCHING clause is discussed in MATCHING.

  FROM stream_expression [ inner_join | outer_join ] 

with inner_join specified as a comma separated list of stream expressions:

  (, stream_expression )*

and outer_join defined as:

((LEFT|RIGHT|FULL) OUTER JOIN stream_expression ON prop_name = prop_name)*

Inner joins are discussed in detail in Inner Joins while outer joins are discussed in Outer Joins.

A stream_expression may simply define the name of the event type used as the source of the stream data, or in more complex scenarios define either a subquery expression as a nested EPL statement or a parameterized SQL query to access JDBC data. In all of these cases, the stream_expression may optionally include an alias as an identifier to qualify any ambiguous property name references in other expressions and a RETAIN clause to define the window of stream data seen by the rest of the query:

  (stream_name | subquery_expr | param_sql_query) [[AS] alias]] [RETAIN retain_expr]
subquery_expr: ( epl_statement )
param_sql_query: database_name ('parameterized_sql_query')

The subquery_expr defines a subquery or nested EPL statement in parenthesis. A subquery is used to pre-filter event stream data seen by the outer EPL statement. For example, the following query would restrict the data seen by the outer EPL statement to only StockTick events coming from a Reuters feed.

  SELECT stockSymbol, AVG(price)
FROM (SELECT * FROM StockTick WHERE feedName = 'Reuters' )
RETAIN 1 MINUTE PARTITION BY stockSymbol
GROUP BY stockSymbol

Subqueries may be arbitrarily nested, but may not contain an INSERT INTO or an OUTPUT clause. Unlike with a top level EPL statement, a RETAIN clause is optional within a subquery. Subquery expressions are discussed in more detail in Subquery Expressions.

The param_sql_query specifies a parameterized SQL query in quotes surrounded by parenthesis that enables reference and historical data accessible through JDBC to be retrieved. The database_name identifies the name of the database over which the query will be executed. Configuration information is associated with this database name to establish a database connection, control connection creation and removal, and to setup caching policies for query results. Parameterized SQL queries are discussed in more detail in Parameterized SQL Queries.

The RETAIN clause defines the quantity of event data read from the streams listed in the FROM clause prior to query processing. Each stream may have its own RETAIN clause if each require different retain policies. Otherwise, the RETAIN clause may appear at the end of the FROM clause for it to apply to all streams. Essentially the RETAIN clause applies to all streams that appear before it in the FROM clause.

For example, in the following EPL statement, five StockTick events will be retained while three News events will be retained:

  SELECT t.stockSymbol, t.price, n.summary
FROM StockTick t RETAIN 5 EVENTS, News n RETAIN 3 EVENTS
WHERE t.stockSymbol = n.stockSymbol

However, in the following statement, four StockTick and four News events will be retained:

  SELECT t.stockSymbol, t.price, n.summary
FROM StockTick t, News n RETAIN 4 EVENTS
WHERE t.stockSymbol = n.stockSymbol

With the exception of subquery expressions, all stream sources must be constrained by a RETAIN clause. Thus at a minimum the FROM clause must contain at least one RETAIN clause at the end for top level EPL statements. The external data from parameterized SQL queries is not affected by the RETAIN clause. The RETAIN clause is discussed in more detail in RETAIN.

Inner Joins

Two or more event streams can be part of the FROM clause and thus both streams determine the resulting events. The WHERE clause lists the join conditions that EPL uses to relate events in two or more streams. If the condition is failed to be met, for example if no event data occurs for either of the joined stream source, no output will be produced.

Each point in time that an event arrives to one of the event streams, the two event streams are joined and output events are produced according to the where-clause.

This example joins two event streams. The first event stream consists of fraud warning events for which we keep the last 30 minutes. The second stream is withdrawal events for which we consider the last 30 seconds. The streams are joined on account number.

  SELECT fraud.accountNumber AS accntNum, 
fraud.warning AS warn, withdraw.amount AS amount,
MAX(fraud.timestamp, withdraw.timestamp) AS timestamp,
'withdrawlFraud' AS desc
FROM FraudWarningEvent AS fraud RETAIN 30 MIN,
WithdrawalEvent AS withdraw RETAIN 30 SEC
WHERE fraud.accountNumber = withdraw.accountNumber

Outer Joins

Left outer joins, right outer joins and full outer joins between an unlimited number of event streams are supported by EPL. Depending on the LEFT, RIGHT, or FULL qualifier, in the absence of event data from either stream source, output may still occur.

If the outer join is a left outer join, there will be an output event for each event of the stream on the left-hand side of the clause. For example, in the left outer join shown below we will get output for each event in the stream RfidEvent, even if the event does not match any event in the event stream OrderList.

  SELECT * 
FROM RfidEvent AS rfid
LEFT OUTER JOIN
OrderList AS orderlist
ON rfid.itemId = orderList.itemId
RETAIN 30 SECONDS

Similarly, if the join is a Right Outer Join, then there will be an output event for each event of the stream on the right-hand side of the clause. For example, in the right outer join shown below we will get output for each event in the stream OrderList, even if the event does not match any event in the event stream RfidEvent.

  SELECT * 
FROM RfidEvent AS rfid
RIGHT OUTER JOIN
OrderList AS orderlist
ON rfid.itemId = orderList.itemId
RETAIN 30 SECONDS

For all types of outer joins, if the join condition is not met, the select list is computed with the event properties of the arrived event while all other event properties are considered to be null.

  SELECT * 
FROM RfidEvent AS rfid
FULL OUTER JOIN
OrderList AS orderlist
ON rfid.itemId = orderList.itemId
RETAIN 30 SECONDS

The last type of outer join is a full outer join. In a full outer join, each point in time that an event arrives to one of the event streams, one or more output events are produced. In the example below, when either an RfidEvent or an OrderList event arrive, one or more output event is produced.

Subquery Expressions

A subquery expression is a nested EPL statement that appears in parenthesis in the FROM clause. A subquery may not contain an INSERT INTO clause or an OUTPUT clause, and unlike top level EPL statements, a RETAIN clause is optional.

Subquery expressions execute prior to their containing EPL statement and thus are useful to pre-filter event data seen by the outer statement. For example, the following query would calculate the moving average of a particular stock over the last 100 StockTick events:

  SELECT AVG(price)
FROM (SELECT * FROM StockTick WHERE stockSymbol = 'ACME' )
RETAIN 100 EVENTS

If the WHERE clause had been placed in the outer query, StockTick events for other stock symbols would enter into the window, reducing the number of events used to calculate the average price.

In addition, a subquery may be used to a) transform the structure of the inner event source to the structure required by the outer EPL statement or b) merge multiple event streams to form a single stream of events. This allows a single EPL statement to be used instead of multiple EPL statements with an INSERT INTO clause connecting them. For example, the following query merges transaction data from EventA and EventB and then uses the combined data in the outer query:

  SELECT custId, SUM(latency)
FROM (SELECT A.customerId AS custId, A.timestamp -B.timestamp AS latency
FROM EventA A, EventB B
WHERE A.txnId = B.txnId)
RETAIN 30 MIN
GROUP BY custId

A subquery itself may contain subqueries thus allowing arbitrary levels of nesting.

Parameterized SQL Queries

Parameterized SQL queries enable reference and historical data accessible through JDBC to be queried via SQL within EPL statements. In order for such data sources to become accessible to EPL, some configuration is required.

The following restrictions currently apply:

The query string is single or double quoted and surrounded by parentheses. The query may contain one or more substitution parameters. The query string is passed to your database software unchanged, allowing you to write any SQL query syntax that your database understands, including stored procedure calls.

Substitution parameters in the SQL query string take the form ${event_property_name}. The engine resolves event_property_name at statement execution time to the actual event property value supplied by the events in the joined event stream.

The engine determines the type of the SQL query output columns by means of the result set metadata that your database software returns for the statement. The actual query results are obtained via the getObject on java.sql.ResultSet.

The sample EPL statement below joins an event stream consisting of CustomerCallEvent events with the results of an SQL query against the database named MyCustomerDB and table Customer:

  SELECT custId, cust_name 
FROM CustomerCallEvent,
MyCustomerDB (' SELECT cust_name FROM Customer WHERE cust_id = ${custId} ')
RETAIN 10 MINUTES

The example above assumes that CustomerCallEvent supplies an event property named custId. The SQL query selects the customer name from the Customer table. The WHERE clause in the SQL matches the Customer table column cust_id with the value of custId in each CustomerCallEvent event. The engine executes the SQL query for each new CustomerCallEvent encountered.

If the SQL query returns no rows for a given customer id, the engine generates no output event. Else the engine generates one output event for each row returned by the SQL query. An outer join as described in the next section can be used to control whether the engine should generate output events even when the SQL query returns no rows.

The next example adds a time window of 30 seconds to the event stream CustomerCallEvent. It also renames the selected properties to customerName and customerId to demonstrate how the naming of columns in an SQL query can be used in the SELECT clause in the EQL query. The example uses explicit stream names via the AS keyword.

  SELECT customerId, customerName 
FROM CustomerCallEvent AS cce RETAIN 30 SECONDS,
MyCustomerDB
("SELECT cust_id AS customerId, cust_name AS customerName
FROM Customer WHERE cust_id = ${cce.custId}") AS cq

Any window, such as the time window, generates insert events as events enter the window, and remove events as events leave the window. The engine executes the given SQL query for each CustomerCallEvent in both the insert stream and the remove stream cases. As a performance optimization, the ISTREAM or RSTREAM keywords in the SELECT clause can be used to instruct the engine to only join insert or remove events, reducing the number of SQL query executions.

Parameterized SQL queries may be used in outer joins as well. Use a left outer join, such as in the next statement, if you need an output event for each event regardless of whether or not the SQL query returns rows. If the SQL query returns no rows, the join result populates null values into the selected properties.

  SELECT custId, custName 
FROM CustomerCallEvent AS cce
LEFT OUTER JOIN
MyCustomerDB
("SELECT cust_id, cust_name AS custName
FROM Customer WHERE cust_id = ${cce.custId}") AS cq
ON cce.custId = cq.cust_id
RETAIN 10 MINUTES

The statement above always generates at least one output event for each CustomerCallEvent, containing all columns selected by the SQL query, even if the SQL query does not return any rows. Note the ON expression that is required for outer joins. The ON acts as an additional filter to rows returned by the SQL query.

 


RETAIN

At least one RETAIN clause is a required in the FROM clause. The RETAIN clause applies to all stream sources listed in the FROM clause that precedes it. Conceptually it defines a window of event data for each stream source over which the query will be executed. The RETAIN clause has the following syntax:

  RETAIN
( ALL [EVENTS] ) |
( [BATCH OF]
( integer (EVENT|EVENTS) ) | ( time_interval (BASED ON prop_name)* )
( WITH [n] (LARGEST | SMALLEST | UNIQUE) prop_name )*
( PARTITION BY prop_name )* )

Each aspect of the RETAIN clause is discussed in detail below.

Keeping All Events

To keep all events for a stream source, specify the ALL [EVENTS] in the RETAIN clause.

  SELECT AVG(price) 
FROM StockTick RETAIN ALL EVENTS

In this case, the average price will be calculated based on all StockTick events that occur. Care must be taken with this option, however, since memory may run out when making calculations that require all or part of each event object to be retained under high volume scenarios. One such example would be in calculating a weighted average.

Specifying Window Size

The amount of event data to keep when running the query may be determined in two ways. The first option is to specify the maximum number of events kept. For example, the query below would keep a maximum of 100 StockTick events on which the average price would be computed:

  SELECT AVG(price) 
FROM StockTick RETAIN 100 EVENTS

As each new StockTick event comes in, the average price would be computed, with a maximum of 100 events being used for the calculation.

The second option is to specify the time interval in which to collect event data. For example, the query below would keep 1 minute's worth of StockTick events and compute the average price for this data:

  SELECT AVG(price) 
FROM StockTick RETAIN 1 MINUTE

In this case, as each new StockTick event comes in, again the average price would be computed. However, events that arrived more than one minute ago would be removed from the window with the average price being recalculated based on the remaining events in the window.

Specifying Batched Versus Sliding Windows

By default, the windows holding event data are sliding. With sliding windows, as a new event enters the window, an old events fall off the end of the window once the window is at capacity. Sliding windows cause the query to be re-executed as each new event enters and/or old event leaves the window. An alternative is to specify that the event data should be batched prior to query execution. Only when the window is full, is the query is executed. After this, new event data will again be collected until the window is once again full at which time the query will be re-executed.

For example, the query below would batch together 100 events prior to executing the query to compute the average price:

  SELECT AVG(price) 
FROM StockTick RETAIN BATCH OF 100 EVENTS

Once executed, it would batch the next 100 events together prior to re-executing the query.

For more detail on sliding versus batched windows, see Processing Model.

Specifying Time Interval

The time interval for the RETAIN clause may be specified in days, hours, minutes, seconds, and/or milliseconds:

  time_interval: [day-part][hour-part][minute-part][seconds-part][milliseconds-part]
  day-part: number ("days" | "day")
hour-part: number ("hours" | "hour" | "hr")
minute-part: number ("minutes" | "minute" | "min")
seconds-part: number ("seconds" | "second" | "sec")
milliseconds-part: number ("milliseconds" | "millisecond" | "msec" | "ms")

Some examples of time intervals are:

  10 seconds
10 minutes 30 seconds
20 sec 100 msec
0.5 minutes
1 day 2 hours 20 minutes 15 seconds 110 milliseconds

BASED ON Clause

By default, the elapse of a time interval is based on the internal system clock. However, in some cases, the time needs to be based on a timestamp value appearing as an event property. In this case, the BASED ON clause may be used to specify the property name containing a long-typed timestamp value. In this example, the StockTick events would be expected to have a timestamp property of type long whose value would control inclusion into and removal from the window:

  SELECT AVG(price) 
FROM StockTick RETAIN 1 MINUTE BASED ON timestamp

When using the BASED ON clause, each stream source listed in the FROM clause must have an associated timestamp property listed or WebLogic Event Server will throw an exception.

Specifying Property Name

A property may be referred to by simply using its property name within the RETAIN clause. However, if ambiguities exist because the same property name exists in more than one stream source in the FROM clause, it must be prefixed with its alias name followed by a period (similar to the behavior of properties referenced in the SELECT clause).

Using PARTION BY Clause to Partition Window

The PARTITION BY clause allows a window to be further subdivided into multiple windows based on the unique values contained in the properties listed. For example, the following query would keep 3 events for each unique stock symbol:

  SELECT stockSymbol, price 
FROM StockTick RETAIN 3 EVENTS PARTITION BY stockSymbol

Conceptually this is similar to the GROUP BY functionality in SQL or EPL. However, the PARTITION BY clause only controls the size and subdivision of the window and does not cause event data to be aggregated as with the GROUP BY clause. However, in most cases, the PARTITION BY clause is used in conjunction with the GROUP BY clause with same properties specified in both.

The following examples illustrate the interaction between PARTITION BY and GROUP BY. In the first example, with the absence of the PARTITION BY clause, a total of 10 events will be kept across all stock symbols.

  SELECT stockSymbol, AVG(price) 
FROM StockTick RETAIN 10 EVENTS
GROUP BY stockSymbol

The average price for each unique set of stock symbol will be computed based on these 10 events. If a stock symbol of AAA comes into the window, it may cause a different stock symbol such as BBB to leave the window. This would cause the average price for both the AAA group as well as the BBB group to change.

The second example includes the PARTITION BY clause and the GROUP BY clause.

  SELECT stockSymbol, AVG(price) 
FROM StockTick RETAIN 10 EVENTS PARTITION BY stockSymbol
GROUP BY stockSymbol

In this case, 10 events will be kept for each unique stock symbol. If a stock symbol of AAA comes into the window, it would only affect the sub-window associated with that symbol and not other windows for different stock symbols. Thus, in this case, only the average price of AAA would be affected.

Using WITH Clause to Keep Largest/Smallest/Unique Values

The WITH clause allows the largest, smallest, and unique property values to be kept in the window. For example, to keep the two highest priced stocks, the following statement would be used:

  SELECT stockSymbol, price 
FROM StockTick RETAIN 2 EVENTS WITH LARGEST price

In the case of time-based windows, the [n] qualifier before the LARGEST or SMALLEST keyword determines how many values are kept. For example, the following statement would keep the two smallest prices seen over one minute:

  SELECT stockSymbol, price 
FROM StockTick RETAIN 1 MINUTE WITH 2 SMALLEST price

In the absence of this qualifier, the single largest or smallest value is kept.

The UNIQUE qualifier causes the window to include only the most recent among events having the same value for the specified property. For example, the following query would keep only the last stock tick for each unique stock symbol:

  SELECT * 
FROM StockTick RETAIN 1 DAY WITH UNIQUE stockSymbol

Prior events of the same property value would be posted as old events by the engine.

 


MATCHING

Either a MATCHING or a FROM clause must appear in an EPL statement. The MATCHING clause is an alternate mechanism for determining which events are used by the EPL statement. It allows for the detection of a series of one or more events occurring that satisfies a specified pattern. Pattern expressions consist of references to streams separated by logical operators such as AND, OR, and FOLLOWED BY to define the sequence of events that compose the pattern. You may include an optional RETAIN clause, as specified in RETAIN, to define the characteristics of the window containing the matched events. The MATCHING clause executes prior to the WHERE or HAVING clauses.

The MATCHING clause syntax is as follows:

  MATCHING pattern_expression [RETAIN retain_clause]

with pattern_expression having the following syntax:

  [NOT|EVERY] stream_expression 
( ( AND | OR | [NOT] FOLLOWED BY ) stream_expression )*
[WITHIN time_interval]

You can use the NOT operator to detect the absence of an event and the EVERY operator to control how pattern matching continues after a match. The stream_expression is a stream source name optionally bound to a variable and filtered by a parenthesized expression:

  stream_expression: [var_name:=]stream_name [( filter_expression )]

Alternatively, a stream_expression may itself be a pattern_expression allowing for arbitrarily complex nesting of expressions:

The var_name is bound to the event object occurring that triggers the match. It may be referenced as any other event property in filter expressions that follow as well as in other clauses such as the SELECT and WHERE clauses. The stream_name may optionally be followed by a parenthesized expression to filter the matching events of that type. The expression act as a precondition for events to enter the corresponding window and has the same syntax as a WHERE clause expression. Previously bound variables may be used within the expression to correlate with already matched events.

The time_interval is a time interval as specified in Specifying Time Interval that follows the optional WITHIN keyword to determine how long to wait before giving up on the preceding expression to be met.

In the example below we look for RFIDEvent event with a category of "Perishable" followed by an RFIDError within 10 seconds whose id matches the ID of the matched RFIDEvent object.

  SELECT * 
MATCHING a:=RFIDEvent(category="Perishable")
FOLLOWED BY RFIDError(id=a.id) WITHIN 10 seconds
RETAIN 1 MINUTE

The following sections discuss the syntax, semantics, and additional operators available in the MATCHING clause to express temporal constraints for pattern matching.

FOLLOWED BY Operator

The FOLLOWED BY temporal operator matches on the occurrence of several event conditions in a particular order. It specifies that first the left hand expression must turn true and only then will the right hand expression be evaluated for matching events.

For example, the following pattern looks for event A and if encountered, looks for event B:

  A FOLLOWED BY B

This does not mean that event A must immediately be followed by event B. Other events may occur between the event A and the event B and this expression would still evaluate to true. If this is not the desired behavior, used the NOT operator as described in NOT Operator.

AND Operator

The AND logical operator requires both nested pattern expressions to turn true before the whole expression returns true. In the context of the MATCHING clause, the operator matches on the occurrence of several event conditions but not necessarily in a particular order.

For example, the following pattern matches when both event A and event B are found:

  A AND B

The pattern matches on any sequence of A followed by B in either order. In addition, it is not required that a B event immediately follow an A event - other events may appear in between the occurrence of an A event and a B event for this expression to return true.

OR Operator

The OR logical operator requires either one of the expressions to turn true before the whole expression returns true. In the context of the MATCHING clause, the operator matches on the occurrence of either of several event conditions but not necessarily in a particular order.

For example, the following pattern matches for either event A or event B:

  A OR B

The following would detect all stock ticks that are either above a certain price or above a certain volume.

  StockTick(price > 1.0) OR StockTick(volume > 1000)

NOT Operator

The NOT operator negates the truth value of an expression. In the context of the MATCHING clause, the operator allows the absence of an event condition to be detected.

The following pattern matches only when an event A is encountered followed by event B but only if no event C was encountered before event B.

  ( A FOLLOWED BY B ) AND NOT C

EVERY Operator

The EVERY operator indicates that the pattern sub-expression should restart when the sub-expression qualified by the EVERY keyword evaluates to true or false. In the absence of the EVERY operator, an implicit EVERY operator is inserted as a qualifier to the first event stream source found in the pattern not occurring within a NOT expression.

The EVERY operator works like a factory for the pattern sub-expression contained within. When the pattern sub-expression within it fires and thus quits checking for events, the EVERY causes the start of a new pattern sub-expression listening for more occurrences of the same event or set of events.

Every time a pattern sub-expression within an EVERY operator turns true the engine starts a new active sub-expression looking for more event(s) or timing conditions that match the pattern sub-expression.

Let's consider an example event sequence as follows:

A1 B1 C1 B2 A2 D1 A3 B3 E1 A4 F1 B4

Example
Description
EVERY ( A FOLLOWED BY B )
Detect event A followed by event B. At the time when B occurs the pattern matches, then the pattern matcher restarts and looks for event A again.
  1. Matches on B1 for combination {A1, B1}.
  2. Matches on B3 for combination {A2, B3}.
  3. Matches on B4 for combination {A4, B4}
EVERY A FOLLOWED BY B
The pattern fires for every event A followed by an event B.
  1. Matches on B1 for combination {A1, B1}
  2. Matches on B3 for combination {A2, B3} and {A3, B3}.
  3. Matches on B4 for combination {A4, B4}
EVERY A FOLLOWED BY EVERY B
The pattern fires for every event A followed by every event B, in other words, all combinations of A followed by B.
  1. Matches on B1 for combination {A1, B1}.
  2. Matches on B2 for combination {A1, B2}.
  3. Matches on B3 for combination {A1, B3}, {A2, B3} and {A3, B3}.
  4. Matches on B4 for combination {A1, B4}, {A2, B4}, {A3, B4}, and {A4, B4}

The examples show that it is possible that a pattern fires for multiple combinations of events that match a pattern expression.

Let's consider the EVERY operator in conjunction with a sub-expression that matches three events that follow each other:

  EVERY (A FOLLOWED BY B FOLLOWED BY C)

The pattern first looks for event A. When event A arrives, it looks for event B. After event B arrives, the pattern looks for event C. Finally, when event C arrives the pattern matches. The engine then starts looking for event A again.

Assume that between event B and event C a second event A2 arrives. The pattern would ignore the A2 entirely since it's then looking for event C. As observed in the prior example, the EVERY operator restarts the sub-expression A FOLLOWED BY B FOLLOWED BY C only when the sub-expression fires.

In the next statement the every operator applies only to the A event, not the whole sub-expression:

  EVERY A FOLLOWED BY B FOLLOWED BY C

This pattern now matches for any event A that is followed by an event B and then event C, regardless of when the event A arrives. This can often be impractical unless used in combination with the AND NOT syntax or the RETAIN syntax to constrain how long an event remains in the window.

WITHIN Operator

The WITHIN qualifier acts like a stopwatch. If the associated pattern expression does not become true within the specified time period it is evaluated by the engine as false. The WITHIN qualifier takes a time period as a parameter as specified in Specifying Time Interval.

This pattern fires if an A event arrives within 5 seconds after statement creation.

  A WITHIN 5 SECONDS

This pattern fires for all A events that arrive within 5 second intervals.

This pattern matches for any one A or B event in the next 5 seconds.

  (A or B) WITHIN 5 SECONDS

This pattern matches for any two errors that happen 10 seconds within each other.

  A(status='ERROR') FOLLOWED BY B(status='ERROR') WITHIN 10 SECONDS

This pattern matches when a Status event does not occur within 10 seconds:

  NOT Status WITHIN 10 SECONDS

Event Structure for Matched Pattern

The structure of the events produced when a pattern matches is determined by the structure of the union of the variables bound within the MATCHING clause. Thus variable bindings must be present in order to retrieve data from the matched events.

For example, given the following pattern:

  tick:=StockTick FOLLOWED BY news:=News(stockSymbol = tick.stockSymbol)

Events that match would have a composite event type with two properties: a tick property with a type of StockTick and a news property with a type of News.

 


WHERE

The WHERE clause is an optional clause in EPL statements. Using the WHERE clause event streams can be joined and events can be filtered. Aggregate functions may not appear in a WHERE clause. To filter using aggregate functions, the HAVING clause should be used.

  WHERE aggregate_free_expression

Comparison operators =, <, >, >=, <=, !=, <>, IS NULL, IS NOT NULL and logical combinations using AND and OR are supported in the WHERE clause. Some examples are listed below.

  ...WHERE fraud.severity = 5 AND amount > 500
  ... WHERE (orderItem.orderId IS NULL) OR (orderItem.class != 10)
  ... WHERE (orderItem.orderId = NULL) OR (orderItem.class <> 10)
  ... WHERE itemCount / packageCount > 10

 


GROUP BY

The GROUP BY clause is optional in EPL statements. The GROUP BY clause divides the output of an EPL statement into groups. You can group by one or more event property names, or by the result of computed expressions. When used with aggregate functions, GROUP BY retrieves the calculations in each subgroup. You can use GROUP BY without aggregate functions, but generally this can produce confusing results.

For example, the below statement returns the total price per symbol for all StockTickEvents in the last 30 seconds:

  SELECT symbol, SUM(price) 
FROM StockTickEvent RETAIN 30 SEC
GROUP BY symbol

The syntax of the GROUP BY clause is:

  GROUP BY arregate_free_expression [, arregate_free_expression] [, …]

EPL places the following restrictions on expressions in the GROUP BY clause:

You can list more then one expression in the GROUP BY clause to nest groups. Once the sets are established with GROUP BY, the aggregation functions are applied. This statement posts the median volume for all stock tick events in the last 30 seconds grouped by symbol and tick data feed. EPL posts one event for each group to statement update listeners:

  SELECT symbol, tickDataFeed, MEDIAN(volume)
FROM StockTickEvent RETAIN 30 SECONDS
GROUP BY symbol, tickDataFeed

In the statement above the event properties in the select list (symbol and tickDataFeed) are also listed in the GROUP BY clause. The statement thus follows the SQL standard which prescribes that non-aggregated event properties in the select list must match the GROUP BY columns.

EPL also supports statements in which one or more event properties in the select list are not listed in the GROUP BY clause. The statement below demonstrates this case. It calculates the standard deviation for the last 30 seconds of stock ticks aggregating by symbol and posting for each event the symbol, tickDataFeed and the standard deviation on price.

  SELECT symbol, tickDataFeed, STDDEV(price)
FROM StockTickEvent RETAIN 30 SECONDS
GROUP BY symbol

The above example still aggregates the price event property based on the symbol, but produces one event per incoming event, not one event per group.

Additionally, EPL supports statements in which one or more event properties in the GROUP BY clause are not listed in the select list. This is an example that calculates the mean deviation per symbol and tickDataFeed and posts one event per group with symbol and mean deviation of price in the generated events. Since tickDataFeed is not in the posted results, this can potentially be confusing.

  SELECT symbol, AVEDEV(price)
FROM StockTickEvent RETAIN 30 SECONDS
GROUP BY symbol, tickDataFeed

Expressions are also allowed in the GROUP BY list:

  SELECT symbol * price, count(*)
FROM StockTickEvent RETAIN 30 SECONDS
GROUP BY symbol * price

If the GROUP BY expression results in a null value, the null value becomes its own group. All null values are aggregated into the same group. The COUNT(expression) aggregate function does not count null values and the COUNT returns zero if only null values are encountered.

You can use a WHERE clause in a statement with GROUP BY. Events that do not satisfy the conditions in the WHERE clause are eliminated before any grouping is done. For example, the statement below posts the number of stock ticks in the last 30 seconds with a volume larger then 100, posting one event per group (symbol).

  SELECT symbol, count(*)
FROM StockTickEvent RETAIN 30 SECONDS
WHERE volume > 100
GROUP BY symbol

 


HAVING

The HAVING clause is optional in EPL statements. Use the HAVING clause to pass or reject events defined by the GROUP BY clause. The HAVING clause sets conditions for the GROUP BY clause in the same way WHERE sets conditions for the SELECT clause, except the WHERE clause cannot include aggregate functions, while HAVING often does.

  HAVING expression

This statement is an example of a HAVING clause with an aggregate function. It posts the total price per symbol for the last 30 seconds of stock tick events for only those symbols in which the total price exceeds 1000. The HAVING clause eliminates all symbols where the total price is equal or less then 1000.

  SELECT symbol, SUM(price)
FROM StockTickEvent RETAIN 30 SEC
GROUP BY symbol
HAVING SUM(price) > 1000

To include more than one condition in the HAVING clause combine the conditions with AND, OR or NOT. This is shown in the statement below which selects only groups with a total price greater then 1000 and an average volume less then 500.

  SELECT symbol, SUM(price), AVG(volume)
FROM StockTickEvent RETAIN 30 SEC
GROUP BY symbol
HAVING SUM(price) > 1000 AND AVG(volume) < 500

EPL places the following restrictions on expressions in the HAVING clause:

A statement with the HAVING clause should also have a GROUP BY clause. If you omit GROUP BY, all the events not excluded by the WHERE clause return as a single group. In that case HAVING acts like a WHERE except that HAVING can have aggregate functions.

The HAVING clause can also be used without GROUP BY clause as the below example shows. The example below posts events where the price is less then the current running average price of all stock tick events in the last 30 seconds.

  SELECT symbol, price, AVG(price)
FROM StockTickEvent RETAIN 30 SEC
HAVING price < AVG(price)

Interaction With MATCHING, WHERE and GROUP BY Clauses

When an EPL statement includes subqueries, a MATCHING clause, WHERE conditions, a GROUP BY clause, and HAVING conditions, the sequence in which each clause executes determines the final result:

  1. Any subqueries present in the statement run first. The subqueries act as a filter for events to enter the window of the outer query
  2. The event stream's filter conditions in the MATCHING clause, if present, dictates which events enter a window. The filter discards any events not meeting filter criteria.
  3. The WHERE clause excludes events that do not meet its search condition.
  4. Aggregate functions in the SELECT list calculate summary values for each group.
  5. The HAVING clause excludes events from the final results that do not meet its search condition.

The following query illustrates the use of filter, WHERE, GROUP BY and HAVING clauses in one statement with a SELECT clause containing an aggregate function.

  SELECT tickDataFeed, STDDEV(price)
FROM (SELECT * FROM StockTickEvent WHERE symbol='ACME')
RETAIN 10 EVENTS
WHERE volume > 1000
GROUP BY tickDataFeed
HAVING STDDEV(price) > 0.8

EPL filters events using the subquery for the event stream StockTickEvent. In the example above, only events with symbol ACME enter the window over the last 10 events, all other events are simply discarded. The WHERE clause removes any events posted into the window (events entering the window and event leaving the window) that do not match the condition of volume greater then 1000. Remaining events are applied to the STDDEV standard deviation aggregate function for each tick data feed as specified in the GROUP BY clause. Each tickDataFeed value generates one event. EPL applies the HAVING clause and only lets events pass for tickDataFeed groups with a standard deviation of price greater then 0.8.

 


ORDER BY

The ORDER BY clause is optional in EPL. It is used for ordering output events by their properties, or by expressions involving those properties. For example, the following statement batches 1 minute of stock tick events sorting them first by price and then by volume.

  SELECT symbol 
FROM StockTickEvent RETAIN BATCH OF 1 MINUTE
ORDER BY price, volume

Here is the syntax for the ORDER BY clause:

  ORDER BY expression [ASC | DESC] [, expression [ASC | DESC] [,…]]

EPL places the following restrictions on the expressions in the ORDER BY clause:

Otherwise, any kind of expression that can appear in the SELECT clause, as well as any alias defined in the SELECT clause, is also valid in the ORDER BY clause.

 


OUTPUT

The OUTPUT clause is optional in EPL and is used to control or stabilize the rate at which events are output. For example, the following statement batches old and new events and outputs them at the end of every 90 second interval.

  SELECT * 
FROM StockTickEvent RETAIN 5 EVENTS
OUTPUT EVERY 90 SECONDS

Here is the syntax for output rate limiting:

  OUTPUT [ALL | ( (FIRST | LAST) [number]] EVERY number [EVENTS | time_unit]

where

  time_unit: MIN | MINUTE | MINUTES | SEC | SECOND | SECONDS | MILLISECONDS | MS

The ALL keyword is the default and specifies that all events in a batch should be output. The batch size can be specified in terms of time or number of events.

The FIRST keyword specifies that only the first event in an output batch is to be output. The optional number qualifier allows more than one event to be output. The FIRST keyword instructs the engine to output the first matching event(s) as soon as they arrive, and then ignore matching events for the time interval or number of events specified. After the time interval elapsed, or the number of matching events has been reached, the same cycle starts again.

The LAST keyword specifies to only output the last event at the end of the given time interval or after the given number of matching events have been accumulated. The optional number qualifier allows more than one event to be output.

The time interval can also be specified in terms of minutes or milliseconds; the following statement is identical to the first one.

  SELECT * 
FROM StockTickEvent RETAIN 5 EVENTS
OUTPUT EVERY 1.5 MINUTES

A second way that output can be stabilized is by batching events until a certain number of events have been collected. The next statement only outputs when either 5 (or more) new or 5 (or more) old events have been batched.

  SELECT * 
FROM StockTickEvent RETAIN 30 SECONDS
OUTPUT EVERY 5 EVENTS

Additionally, event output can be further modified by the optional LAST keyword, which causes output of only the last event(s) to arrive into an output batch. For the example below, the last five events would be output every three minutes.

  SELECT * 
FROM StockTickEvent RETAIN 30 SECONDS
OUTPUT LAST 5 EVERY 3 MINUTES

Using the FIRST keyword you can be notified at the start of the interval. This allows one to be immediately notified each time a rate falls below a threshold.

  SELECT * 
FROM TickRate RETAIN 30 SECONDS
WHERE rate < 100
OUTPUT FIRST EVERY 60 SECONDS

Interaction With GROUP BY and HAVING Clauses

The OUTPUT clause interacts in two ways with the GROUP BY and HAVING clauses. First, in the OUTPUT EVERY n EVENTS case, the number n refers to the number of events arriving into the GROUP BY clause. That is, if the GROUP BY clause outputs only 1 event per group, or if the arriving events do not satisfy the HAVING clause, then the actual number of events output by the statement could be fewer than n.

Second, the LAST and ALL keywords have special meanings when used in a statement with aggregate functions and the GROUP BY clause. The LAST keyword specifies that only groups whose aggregate values have been updated with the most recent batch of events should be output. The ALL keyword (the default) specifies that the most recent data for all groups seen so far should be output, whether or not these groups' aggregate values have just been updated.

 


INSERT INTO

The INSERT INTO clause is optional in EPL. This clause can be specified to make the results of a statement available as an event stream for use in further statements. The clause can also be used to merge multiple event streams to form a single stream of events.

  INSERT INTO CombinedEvent
SELECT A.customerId AS custId, A.timestamp - B.timestamp AS latency
FROM EventA A, EventB B RETAIN 30 MIN
WHERE A.txnId = B.txnId

The INSERT INTO clause in the above statement generates events of type CombinedEvent. Each generated CombinedEvent event has two event properties named custId and latency. The events generated by the above statement can be used in further statements. For example, the statement below uses the generated events.

  SELECT custId, SUM(latency)
FROM CombinedEvent RETAIN 30 MIN
GROUP BY custId

The INSERT INTO clause can consist of just an event type alias, or of an event type alias and one or more event property names. The syntax for the INSERT INTO clause is as follows:

  INSERT [ISTREAM | RSTREAM] INTO event_type_alias [(prop_name [,prop_name, [,…]] ) ]

The ISTREAM (default) and RSTREAM keywords are optional. If neither keyword is specified, the engine supplies the insert stream events generated by the statement to attached update listeners. The insert stream consists of the events entering the respective window(s) or stream(s). If the RSTREAM keyword is specified, the engine supplies the remove stream events generated by the statement. The remove stream consists of the events leaving the respective window(s).

The event_type_alias is an identifier that names the events generated by the engine. The identifier can be used in statements to filter and process events of the given name.

The engine also allows update listeners to be attached to a statement that contain an INSERT INTO clause.

To merge event streams, simply use the same event_type_alias identifier in any EPL statements that you would like to be merged. Make sure to use the same number and names of event properties and that event property types match up.

EPL places the following restrictions on the INSERT INTO clause:

The example statement below shows the alternative form of the INSERT INTO clause that explicitly defines the property names to use.

  INSERT INTO CombinedEvent (custId, latency)
SELECT A.customerId, A.timestamp - B.timestamp
FROM EventA A, EventB B RETAIN 30 MIN
WHERE A.txnId = B.txnId

The RSTREAM keyword is used to indicate to the engine to generate only remove stream events. This can be useful if we want to trigger actions when events leave a window rather then when events enter a window. The statement below generates CombinedEvent events when EventA and EventB leave the window after 30 minutes.

  INSERT RSTREAM INTO CombinedEvent
SELECT A.customerId AS custId, A.timestamp - B.timestamp AS latency
FROM EventA A, EventB B RETAIN 30 MIN
WHERE A.txnId = B.txnId

  Back to Top       Previous  Next