ORDER BY

SQL’s ORDER BY orders the output data. Similarly, the PartiQL ORDER BY is responsible for turning its input bag into an array. In the following aspects, PartiQL extends the SQL semantics to resolve issues that are not relevant in SQL but emerge when working on Ion data.

  1. SQL’s ORDER BY clause orders its input using an expanded version of the less-than function, which we call the order-by less-than and denote by \$<^o\$. The PartiQL \$<^o\$ semantics (Ordering Function) also specify an order among values of heterogeneous types, including complex values.

  2. The interaction of ORDER BY with a UNION (or any other set operator) of SFW queries requires attention since, unlike SQL, in PartiQL there are no binding tuples (or any tuples at all for that matter) after a SELECT VALUE clause. ORDER BY with UNION, INTERSECT, EXCEPT

  3. Unlike SQL, the input of an PartiQL query may also have order, because it is an array. The user may want to preserve the order of the input into the output. In this case, the AT structure in the FROM clause (recall, single-item FROM) can capture the input order and the ORDER BY can recreate it. However, this order preservation mechanism is tedious for the user. Thus, ORDER BY also offers an order preservation directive.

SQL Compatibility and SELECT Clause Variables in ORDER BY discuss SQL compatibility issues.

Similar to SQL, the PartiQL ORDER BY clause syntax is:

EBNF Grammar for PartiQL Names
order-by-clause ::= 'ORDER' 'BY' (sort-spec-list|'PRESERVE')

sort-spec-list ::= sort-spec (',' sort-spec)*

sort-spec ::= expression ('ASC'|'DESC')? ('NULLS' (FIRST|LAST))?

In PartiQL a SFW query with ORDER BY outputs an array, whereas a SFW query without ORDER BY outputs a bag.

Alike SQL’s ORDER BY clause, the NULLS FIRST and NULLS LAST keywords indicate whether NULL and MISSING values are ordered before or after all other values. Notice that in PartiQL, the NULLS FIRST and NULLS LAST refer to both NULL and MISSING.

Ordering Function

The ORDER BY clause sorts its input using the order-by less-than function \$<^o\$, which is able to compare values of different types (unlike SQL). In particular:

  1. NULL and MISSING are always first or last and compare equally according to \$<^o\$. In other words, \$<^o\$ cannot distinguish between NULL and MISSING.

  2. The boolean values are coming first among the non-absent values (i.e., \$b <^o x\$ is always true if \$b\$ is boolean and \$x\$ is not a NULL or a MISSING or a boolean). false comes before true.

  3. The numbers come next. The comparisons between number values do not depend on precision or specific type. Given two numbers \$x\$ and \$y\$, the PartiQL \$x<^o y\$ behaves identical to the SQL order-by less-than function. Namely, if \$x\$ and \$y\$ are not the special values -inf, inf, or nan, then \$x <^o y\$ is the same with \$x < y\$. The special value nan comes before -inf, which comes before all normal numeric values, which are followed by +inf.

  4. Timestamp values follow and are compared by the absolute point of time irrespective of precision or local UTC offset.

  5. The text types come next ordered by their lexicographical ordering by Unicode scalar irrespective of their specific type.

  6. The LOB types follow and are ordered by their lexicographical ordering by octet.

  7. Arrays come next, and their values compare lexicographically based on the comparison of their elements, recursively. Notice that given an array \$ \[e_1,..., e_m ] \$ and a longer array \$ \[e_1, ..., e_m, e_{m+1}, ..., e_n ] \$ that has the same first \$m\$ values, the former array comes first.

  8. Tuple values follow and compare lexicographically based on the sorted attributes (as defined recursively), first by the attribute name, and secondly by the attribute values themselves.

  9. Bag values come last (except, of course, when NULLS LAST is specified) and their values compare by first reducing them to arrays by sorting their elements and then comparing the resulting arrays.

ORDER BY with UNION, INTERSECT, EXCEPT

Coming up…​

SQL Compatibility

For SQL-compatibility, PartiQL allows the CURRENT variable to be omitted from ordering expressions. Then when the CURRENT variable binds tuples, the ordering expressions can refer directly to the attributes of those tuples.

The complete scoping rules are as follows. When all of the following conditions are satisfied:

  1. an PartiQL path expression ordering expression \$as\$ appears in the ORDER BY of a UNION …​ ORDER BY query, where \$a\$ is an identifier and \$s\$ is the potentially empty suffix of the path.

  2. the expression \$as\$ is evaluated in database environment \$p_0\$ and variables’ environment \$p\$, which defines variables \$v_1,...,v_n\$ and none of them is named \$a\$.

  3. none of the variables \$v_1,...,v_n\$ may bind to a tuple that has an attribute \$a\$.

then the path expression as resolves to CURRENT.as.

The most common and useful way to have the 3rd condition be satisfied is when the UNION …​ ORDER BY is a top-level query and, thus, the variables environment \$p\$ is empty.

SELECT Clause Variables in ORDER BY

Recall that ORDER BY is evaluated before SELECT. For SQL-compatibility, given SELECT e AS a, PartiQL also supports the syntactic sugar of using a in lieu of e in the ORDER BY clause. Therefore, both SFW queries below are equivalent:

SELECT e AS a
FROM ...
ORDER BY a

\$<=>\$

SELECT e AS a
FROM ...
ORDER BY e

Coercion of Literals

Notice that definition of < dismissed the SQL coercions. In SQL, given explicit literals in a query, coercions may happen.

Example 1.  

The query

SELECT * FROM foo WHERE 9 < '10'

is equivalent to

SELECT * FROM foo WHERE 9 < 10

because an automatic coercion of string to number will be introduced.

This aspect of SQL compatibility is introduced by rewriting. Namely, given a query with incompatible types