Filtering out rows as soon as possible improves query latency
because it reduces the amount of data that must be tracked through the
evaluator.
Consider the following two versions of a
query. The first form of the query first groups records by
g, passes each group through the filter
(b < 10), and then accumulates the records that
remain. The input records are not filtered, and the grouping operation must
operate on all input records.
RETURN Result AS SELECT
SUM(a) WHERE (b < 10) AS sum_a_blt10
GROUP BY g
The second form of the query filters the input (with the
WHERE clause) before the records are passed to the
grouping operation. Thus the grouping operation must group only those records
of interest to the query. By eliminating records that are not of interest
sooner, evaluation will be faster.
RETURN Results AS SELECT
SUM(a) AS sum_a_blt10,
WHERE (b < 10)
GROUP BY g
Another example of filtering records early is illustrated with the
following pair of queries. Recall that a
WHERE clauses filters input records and a
HAVING clause filters output records. The first query
computes the sum for all values of
g and (after performing all of that computation) throws
away all results that do not meet the condition
(g < 10).
RETURN Result AS SELECT
SUM(a) AS sum_a
GROUP BY g
HAVING g < 10
The second query, on the other hand, first filters the input records to
only those in the interesting groups. It then aggregates only those interesting
groups.
RETURN Result AS SELECT
SUM(a) AS sum_a
WHERE g < 10
GROUP BY g