The GROUP/GROUP BY clause specifies how to map source records to result records to group statement output.
There are three ways to use this clause in a query:
You can use a GROUP clause to aggregate results into a single bucket.
RETURN "ReviewCount" AS SELECT SUM(number_of_reviews) AS "NumReviews" GROUP
This statement returns one record for NumReviews. The value is the sum of the values for the attribute number_of_reviews.
Grouping is allowed on source and locally defined attributes.
All grouping attributes are part of the result records. A NULL value in any grouping attribute causes the source record to map to no result records. This is different from SQL, which treats NULL like any other value. For information about user-defined NULL-value handling in EQL, see COALESCE.
You can use GROUP BY to aggregate results into buckets with common values for the grouping keys.
{ TransId, ProductType, Amount, Year, Quarter, Region, SalesRep, Customer }For example:
{ TransId = 1, ProductType = "Widget", Amount = 100.00, Year = 2011, Quarter = "11Q1", Region = "East", SalesRep = "J. Smith", Customer = "Customer1" }If an EQL statement uses Region and Year as GROUP BY attributes, the statement results contain an aggregated record for each valid, non-empty combination of Region and Year. In EQL, this example is expressed as:
DEFINE RegionsByYear AS GROUP BY Region, Yearresulting in the aggregates of the form { Region, Year }, for example:
{ "East", "2010" } { "West", "2011" } { "East", "2011" }
A GROUP BY key can be the output of a SELECT expression, as long as that expression itself does not contain an aggregation function.
SELECT COALESCE(Person, 'Unknown Person') as Person2, ... GROUP BY Person2
SELECT SUM(Sales) as Sales2, ... GROUP BY Sales2
You can group by a specified depth of each managed attribute.
If you group results by a managed attribute, you can specify a hierarchy depth at which to group, using the syntax:
GROUP BY ManagedAttr:<level>
GROUP BY "Region":1
If you group by a multi-assign attribute, each source record will map to multiple corresponding output records. For example, the record [A:1, A:2, B:3, B:4, B:5] will map to:
This can only occur with a corpus source, because result records are always single assign.
In this example, UserTag is multi-assign:
RETURN "Example" AS SELECT AVG("Gross") AS "AvgGross", SUM("Gross") AS "TotalGross", GROUP BY UserTag
To define the set of resulting buckets, a statement must specify a set of GROUP BY attributes. The cross product of all values in these grouping attributes defines the set of candidate buckets.
The results are automatically pruned to include only non-empty buckets.
If an attribute reference appears in a statement with a GROUP clause in the definition of an attribute not in the GROUP clause, the attribute will have an implicit ARB aggregate applied.