Variable Resolution
Resolution Rules
A PartiQL query is compiled within a static environment that associates global identifiers with their PartiQL types. During variable resolution, these types determine whether an identifier path can possibly refer to a given value. The word "possible" is key to understanding how PartiQL resolves variables given only a partial schema. The PartiQL Planner applies the following rules to resolve variables.
1. The path @i1.i2.….in always refers to a bound variable named i1; if there is no such possible variable and i1.i2.….im, m ≤ n, is a database name, then i1.i2.….im refers to that database name. If there is a choice, choose the largest m. If both the resolution to a variable and the resolution to a database name fail, return MISSING or fail execution.
2. If i1.i2.….in is a FROM path and i1.i2.….im, m ≤ n, is a possible database name, then i1.i2.….im refers to that database name and im+1.….in is a series of tuple path navigations starting from the database name i1.i2.….im. If there is a choice, choose the largest m. If there is no such possible database name, then i1.i2.….im refers to a variable (matching the largest m). If both the resolution to a database name and the resolution to a variable fail, then fail compilation, as this identifier path cannot be resolved.
3. If i1.i2.….in is a non-FROM clause expression and i1 is an environment variable, then i1 refers to that variable; if there is no such variable and i1.i2.….im, m ≤ n, is a database name, then i1.i2.….im refers to that database name. If there is a choice, choose the largest m. If both the resolution to a variable and the resolution to a database name fail, return MISSING or fail execution.
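The priority order in the rules above can be sketched in a few lines of Python. This is only an illustrative model, not the planner's implementation: the names resolve and longest_database_prefix, the tuple-based result, and the at_prefixed flag are all assumptions made for this example. It also ignores types (the "possible" checks) and simplifies the variable fallback of rule 2 to a variable named i1.

```python
from typing import Optional, Sequence, Set, Tuple

Path = Tuple[str, ...]


def longest_database_prefix(path: Path, database_names: Set[Path]) -> Optional[Path]:
    """Return the longest prefix i1...im (m <= n) that is a database name, if any."""
    for m in range(len(path), 0, -1):  # try the largest m first
        if path[:m] in database_names:
            return path[:m]
    return None


def resolve(path: Sequence[str], variables: Set[str], database_names: Set[Path],
            in_from_clause: bool, at_prefixed: bool = False):
    """Return (kind, root, remaining_steps) for an identifier path (hypothetical shape)."""
    steps = tuple(path)
    db = longest_database_prefix(steps, database_names)
    has_var = steps[0] in variables

    if in_from_clause and not at_prefixed:
        # Rule 2: in a FROM path, database names take priority over variables.
        if db is not None:
            return ("database", db, steps[len(db):])
        if has_var:  # simplification: only a variable named i1 is considered
            return ("variable", steps[:1], steps[1:])
        raise LookupError(f"cannot resolve FROM path {'.'.join(steps)}")

    # Rules 1 and 3: variables take priority over database names.
    if has_var:
        return ("variable", steps[:1], steps[1:])
    if db is not None:
        return ("database", db, steps[len(db):])
    return ("missing", (), steps)  # stands in for "return MISSING or fail execution"


# Hypothetical database names and a bound variable v.
names = {("coll",), ("v", "foo"), ("w",)}
print(resolve(["v", "foo"], {"v"}, names, in_from_clause=True))
# ('database', ('v', 'foo'), ())
print(resolve(["v", "foo"], {"v"}, names, in_from_clause=True, at_prefixed=True))
# ('variable', ('v',), ('foo',))
```

The same path resolves differently depending on where it appears and whether it carries the @ prefix, which is the behaviour the three rules are meant to capture.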
Static Environment
For example, let's see how x and T are resolved given different static environments.
SELECT x FROM T
Assume all static environments are closed.
env0 :: {
  T : << closed :: { x: any } >> -- bag of closed structs, each with known field x
}
env1 :: {
  T : << closed :: { y: any } >> -- bag of closed structs, each with known field y
}
env2 :: {
  T : << open :: { y: any } >> -- bag of open structs, each with known field y
}
env3 :: {
  T : << closed :: { x: any } >>, -- bag of closed structs, each with known field x
  x : int -- int
}
env4 :: {
  T : << closed :: { y: any } >>, -- bag of closed structs, each with known field y
  x : int -- int
}
env5 :: {
  T : << open :: { y: any } >>, -- bag of open structs, each with known field y
  x : int -- int
}
In every environment there is a known global named T. In the example query, the identifier path T is on the right-hand side of a FROM clause (i.e. it is a FROM path), so we apply resolution rule 2. In all cases, this FROM path T resolves to the known global T of the static environment.
Now let's look at how we resolve x in the various environments.
Env0
In the query, the identifier path x is unambiguously bound to T. This enables us to attempt to resolve the query as
SELECT t.x FROM T as t
We apply rule 3 and find that t.x is possible because the static environment tells us the structs of T have a known field x.
Env1
Unlike env0, we cannot resolve the path t.x via rule 3 because the static environment tells us that it is impossible for a struct of T to have a field x. Next, rule 3 tells us to check for a global variable x, but again we find it is not possible. The variable cannot be resolved, and compilation fails.
Env2
Env2 differs from env1 because the structs within bag T are defined as having an open schema. While x is not a known field of the structs in T, we do know that it is possible for such a field to exist. In this case, we resolve the variable x to the path t.x.
Env3
The case for env3 is the same as for env0. The static environment tells us the structs of T have a known field x, so x resolves to t.x before the global x is considered.
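The walk through env0 to env3 can be mimicked with a small Python model. This is a sketch under stated assumptions rather than planner code: StructType, may_have, and resolve_select_x are hypothetical names, each bag is represented only by the struct type of its elements, and a global such as x : int is stored as a plain string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StructType:
    known_fields: frozenset  # field names listed in the schema
    is_open: bool            # open structs may carry additional, unlisted fields

    def may_have(self, name: str) -> bool:
        """A field is possible if it is known, or if the struct is open."""
        return name in self.known_fields or self.is_open


def resolve_select_x(env: dict, name: str = "x", source: str = "T", alias: str = "t") -> str:
    """Resolve `name` in SELECT name FROM source AS alias (hypothetical helper)."""
    element = env[source]  # rule 2: the FROM path resolves to the global
    if element.may_have(name):
        return f"{alias}.{name}"  # rule 3: the variable takes priority
    if name in env:
        return name  # fall back to a global of that name
    raise LookupError(f"cannot resolve {name!r}")


env0 = {"T": StructType(frozenset({"x"}), is_open=False)}
env1 = {"T": StructType(frozenset({"y"}), is_open=False)}
env2 = {"T": StructType(frozenset({"y"}), is_open=True)}
env3 = {"T": StructType(frozenset({"x"}), is_open=False), "x": "int"}

for label, env in [("env0", env0), ("env1", env1), ("env2", env2), ("env3", env3)]:
    try:
        print(label, "->", resolve_select_x(env))
    except LookupError as err:
        print(label, "-> compilation fails:", err)
```

Only env1 fails, because a closed struct without x rules the field out and no global x exists; the open struct in env2 keeps t.x possible.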
Resolution with Multiple FROM sources
For example, let's see how x is resolved when the FROM clause has multiple sources, given different static environments.
SELECT x FROM T, S
Assume all static environments are closed.
env0 :: {
  T : << closed :: { x: any } >>, -- bag of closed structs, each with known field x
  S : << closed :: { y: any } >> -- bag of closed structs, each with known field y
}
env1 :: {
  T : << closed :: { z: any } >>, -- bag of closed structs, each with known field z
  S : << closed :: { z: any } >>, -- bag of closed structs, each with known field z
  x : int -- global x of type int
}
Variable Scopes Example
The scoping rules in this section address naming conflicts between names defined in the database environment and variables defined in the variables environment. The potential for such naming conflicts is driven by the nested data of PartiQL, as illustrated next.
Notice there are a few more naming conventions, pertaining to the use of attribute names defined in the SELECT clause within the GROUP BY and ORDER BY clauses. These conventions are explained with the semantics of the respective clauses (see GROUP BY Clause and ORDER BY Clause).
The following example illustrates how SQL compatibility issues and the needs of navigating into nested data need to be carefully merged together. Consider the following database that has a table c, i.e. a collection of tuples, and also named data x.n and y.
t.c: <<
{'a':1, 'n':[{'b':11, 'c':12}]},
{'a':2, 'n':[{'b':21, 'c':22}]}
>>
x.n : << {'b':3} >>
y: {'a':1, 'b':2}
Then consider the query
SELECT t.a
FROM t.c AS x
WHERE x.a IN (SELECT y.b FROM x.n AS y)
This query poses many scoping issues:
- Does x.n refer to the named value x.n or to the n attribute of the variable x? For SQL compatibility purposes it refers to the named value x.n. See below for how to refer to the variable x.
- Does y.b refer to the b attribute of the variable y or to the b attribute of the named value y? For SQL compatibility purposes it refers to the b attribute of the variable y.
Notice how SQL compatibility required the database environment to take priority over the variables environment in the FROM clause and then, vice versa, the variables environment to take priority over the database environment in the SELECT clause.
Assume database names coll, v.foo, w. Then in the query
SELECT v.foo
FROM coll AS v, @v.foo AS w,
(SELECT w.a, u.b FROM @w.bar AS u)
AS x
coll refers to the database name. The v in @v.foo refers to the variable v. If the @ were not there, v.foo would refer to the database name v.foo. The w in w.a refers to the variable defined in line 2.
Note that the expressions coll and @v.foo are FROM clause expressions because they appear in the FROM clause of the sfw_query of lines 1-4, in which they are immediately nested. Similarly, the expression @w.bar is a FROM clause expression because it appears in the FROM clause of the sfw_query of line 3, in which it is immediately nested. In contrast, the expressions w.a and u.b are not FROM clause expressions. Though they are nested into the FROM clause of the query of lines 1-4, they are not immediately nested into the query of lines 1-4.
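The notion of "immediately nested" can be made concrete with a small Python sketch. The Sfw class and from_clause_expressions function are hypothetical names invented for this illustration; the model covers only SELECT and FROM, and treats every non-subquery FROM item as a path expression.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Sfw:
    """A toy sfw_query: SELECT paths plus FROM items (paths or nested queries)."""
    select: List[str]
    from_items: List[Union[str, "Sfw"]]


def from_clause_expressions(query: Sfw) -> List[str]:
    """Collect paths that sit directly in the FROM clause of some sfw_query."""
    found: List[str] = []
    for item in query.from_items:
        if isinstance(item, Sfw):
            # A subquery is its own sfw_query: only its FROM items qualify,
            # its SELECT paths are not immediately nested in any FROM clause.
            found.extend(from_clause_expressions(item))
        else:
            found.append(item)
    return found


# The query from the example above, with aliases omitted.
inner = Sfw(select=["w.a", "u.b"], from_items=["@w.bar"])
outer = Sfw(select=["v.foo"], from_items=["coll", "@v.foo", inner])

print(from_clause_expressions(outer))
# ['coll', '@v.foo', '@w.bar']
```

The paths w.a and u.b never appear in the result, even though the subquery containing them sits inside the outer FROM clause.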