ARQ includes support for GROUP BY and counting.
This is a syntactic extension and is available if the query is parsed
with language Syntax.syntaxARQ.
A GROUP BY clause transforms a result set so that only one row
will appear for each unique set of grouping variables. All other variables from
the query pattern are projected away and are not available in the SELECT
clause.
PREFIX SELECT ?p ?q { . . . } GROUP BY ?p ?q
SELECT * will include variables from the GROUP BY but no others. This ensures
that results are deterministic: including other variables from the pattern
would mean choosing some value that is not constant across the rows of a group,
and so would lead to indeterminate results.
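The effect of grouping on a result set can be modelled outside ARQ. The following is a minimal Python sketch, not ARQ code: the rows, the variable names and the helper `group_by` are all hypothetical and only illustrate that one row survives per unique set of grouping values, with every other variable projected away.

```python
# Hypothetical solution rows from a query pattern binding ?p, ?q and ?o.
rows = [
    {"p": "a", "q": 1, "o": "x"},
    {"p": "a", "q": 1, "o": "y"},
    {"p": "b", "q": 2, "o": "z"},
]


def group_by(rows, keys):
    """Return one row per unique combination of the grouping variables."""
    groups = {}
    for row in rows:
        key = tuple(row[k] for k in keys)
        groups.setdefault(key, []).append(row)
    # Only the grouping variables survive. ?o is dropped because it is
    # not constant within the ("a", 1) group.
    return [dict(zip(keys, key)) for key in groups]


grouped = group_by(rows, ["p", "q"])
print(grouped)
```

In this model the two rows with `?p = "a"` and `?q = 1` collapse into one, and `?o` cannot be reported because the group holds both `"x"` and `"y"`.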
The GROUP BY clause can involve an expression. If the expression is named,
then the value is included in the columns, before projection. An unnamed
expression is used for grouping but the value is not placed in the result set
formed by the GROUP BY clause.
PREFIX SELECT ?productId ?cost { . . . } GROUP BY ?productId (?num * ?price AS ?cost)
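The difference between a named and an unnamed grouping expression can be sketched in Python. This is an illustration of the semantics described above under assumed data, not ARQ's implementation; the helper `group_by_expr` and the sample rows are hypothetical.

```python
# Hypothetical rows binding ?productId, ?num and ?price.
rows = [
    {"productId": "p1", "num": 2, "price": 5},
    {"productId": "p1", "num": 1, "price": 10},
    {"productId": "p2", "num": 3, "price": 4},
]


def group_by_expr(rows, var, expr, name=None):
    """Group by a variable and an expression; expose the value only if named."""
    keys = {}
    for row in rows:
        # The expression value always takes part in the grouping key.
        keys.setdefault((row[var], expr(row)), None)
    result = []
    for value, computed in keys:
        out = {var: value}
        if name is not None:
            # A named expression becomes a column of the grouped result.
            out[name] = computed
        result.append(out)
    return result


def cost(row):
    return row["num"] * row["price"]


named = group_by_expr(rows, "productId", cost, name="cost")
unnamed = group_by_expr(rows, "productId", cost)
print(named)
print(unnamed)
```

Both `p1` rows evaluate to a cost of 10, so they fall into the same group. With the name supplied the value appears as `cost`; without it the value still drives the grouping but is absent from the output rows.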
A query may specify a HAVING clause to apply a filter to the result set after
grouping. The filter may involve variables from the GROUP BY clause or
aggregations.
PREFIX SELECT ?p ?q { . . . } GROUP BY ?p ?q HAVING (count(distinct *) > 1)
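A HAVING filter of this kind can be modelled as a test applied to each group after grouping. The sketch below is plain Python over hypothetical rows, not ARQ code; `group_having_distinct_count` is an illustrative helper that keeps a group only when it contains more than a given number of distinct rows.

```python
# Hypothetical rows binding ?p, ?q and ?o; some rows are exact duplicates.
rows = [
    {"p": "a", "q": 1, "o": "x"},
    {"p": "a", "q": 1, "o": "y"},
    {"p": "a", "q": 1, "o": "y"},
    {"p": "b", "q": 2, "o": "z"},
    {"p": "b", "q": 2, "o": "z"},
]


def group_having_distinct_count(rows, keys, minimum):
    """Group rows, then keep groups with more than `minimum` distinct rows."""
    groups = {}
    for row in rows:
        key = tuple(row[k] for k in keys)
        # A set of frozen rows gives the distinct rows of the group.
        groups.setdefault(key, set()).add(tuple(sorted(row.items())))
    return [
        dict(zip(keys, key))
        for key, distinct_rows in groups.items()
        if len(distinct_rows) > minimum
    ]


kept = group_having_distinct_count(rows, ["p", "q"], 1)
print(kept)
```

The group for `("a", 1)` has two distinct rows and passes the filter; the group for `("b", 2)` has only one distinct row, repeated twice, and is removed.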
Currently supported aggregations:
Aggregator | Description |
---|---|
count(*) | Count rows of each group element, or the whole result set if no GROUP BY. |
count(distinct *) | Count the distinct rows of each group element, or the whole result set if no GROUP BY. |
count(?var) | Count the number of times ?var is bound in a group. |
count(distinct ?var) | Count the number of distinct values ?var is bound to in a group. |
When a variable is used, what is being counted is occurrences of RDF terms, that is, names. It is not a count of individuals, because two names can refer to the same individual.
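The counting behaviours can be compared side by side on one small group. This Python sketch assumes the four forms are `count(*)`, `count(distinct *)`, `count(?var)` and `count(distinct ?var)`, and models an unbound variable as a missing dictionary key; the data is hypothetical and the code only mirrors the descriptions in the table, it does not call ARQ.

```python
# One hypothetical group of rows; ?v is unbound in the last row.
group = [
    {"s": "s1", "v": "x"},
    {"s": "s1", "v": "x"},
    {"s": "s2", "v": "y"},
    {"s": "s3"},
]

# count(*): every row of the group.
count_all = len(group)

# count(distinct *): rows that differ in at least one binding.
count_distinct_all = len({tuple(sorted(row.items())) for row in group})

# count(?v): rows in which ?v is bound.
count_var = sum(1 for row in group if "v" in row)

# count(distinct ?v): distinct terms that ?v is bound to.
count_distinct_var = len({row["v"] for row in group if "v" in row})

print(count_all, count_distinct_all, count_var, count_distinct_var)
```

The distinct count over `?v` compares terms only: if `"x"` and `"y"` were two names for the same individual, they would still be counted as two.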
If there is no explicit GROUP BY clause, then it is as if the whole of the
result set forms a single group element. Equivalently, it is a GROUP BY of no
variables. Only aggregation expressions make sense in the SELECT clause, as
there are no variables from the query pattern to project out.