Variable Resolution
Resolution Rules
A PartiQL query is compiled within a static environment that associates global identifiers with their respective PartiQL types. These types are used during variable resolution to determine whether an identifier path possibly refers to a certain value. The word "possibly" is key to understanding how PartiQL is able to resolve variables given only a partial schema. The PartiQL Planner follows the rules below to resolve variables.
1. The path @i1.i2. ... .in always refers to a bound variable named i1; if there is no such possible variable and i1.i2. ... .im, m ≤ n, is a database name, then i1.i2. ... .im refers to that database name. If there is a choice, choose the largest m. If both the resolution to a variable and the resolution to a database name fail, return MISSING or fail execution.
2. If i1.i2. ... .in is a FROM path and i1.i2. ... .im, m ≤ n, is a possible database name, then i1.i2. ... .im refers to that database name and im+1. ... .in is a series of tuple path navigations starting from the database name i1.i2. ... .im. If there is a choice, choose the largest m. If there is no such possible database name, then i1.i2. ... .im refers to a variable (matching the largest m). If both the resolution to a database name and the resolution to a variable fail, then fail compilation, as this identifier path cannot be resolved.
3. If i1.i2. ... .in is a non-FROM clause expression and i1 is an environment variable, then i1 refers to that variable; if there is no such variable and i1.i2. ... .im, m ≤ n, is a database name, then i1.i2. ... .im refers to that database name. If there is a choice, choose the largest m. If both the resolution to a variable and the resolution to a database name fail, return MISSING or fail execution.
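The lookup order in these rules can be sketched in a few lines of Python. This is only an illustration, not the planner's implementation: the names resolve, longest_db_prefix and DB_NAMES are hypothetical, paths are modelled as tuples of identifiers, variables are assumed to have single-part names, and a failed resolution is reported as None instead of distinguishing MISSING from a compilation error.

```python
from typing import Optional, Set, Tuple

Path = Tuple[str, ...]


def longest_db_prefix(path: Path, db_names: Set[Path]) -> Optional[Path]:
    """Return the longest prefix i1...im (m <= n) that is a database name."""
    for m in range(len(path), 0, -1):  # try the largest m first
        if path[:m] in db_names:
            return path[:m]
    return None


def resolve(path: Path, variables: Set[str], db_names: Set[Path],
            in_from: bool, at_prefixed: bool = False):
    """Return (kind, root, remaining_steps), or None if resolution fails.

    Assumption: variables are single identifiers, so only i1 is checked
    against the variables environment.
    """
    db = longest_db_prefix(path, db_names)
    var = (path[0],) if path[0] in variables else None
    if in_from and not at_prefixed:
        # FROM path without @: database names take priority.
        candidates = [("database", db), ("variable", var)]
    else:
        # @-prefixed path or non-FROM expression: variables take priority.
        candidates = [("variable", var), ("database", db)]
    for kind, root in candidates:
        if root is not None:
            return kind, root, path[len(root):]
    return None


# Hypothetical database names, written as identifier tuples.
DB_NAMES = {("coll",), ("v", "foo"), ("w",)}

print(resolve(("v", "foo"), {"v"}, DB_NAMES, in_from=True))
print(resolve(("v", "foo"), {"v"}, DB_NAMES, in_from=True, at_prefixed=True))
```

With a variable v in scope, the first call picks the database name v.foo because the path is a plain FROM path, while the second call picks the variable v and leaves foo as a navigation step because the path is @-prefixed.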
Static Environment
For example, let’s see how x and T are resolved given different static environments.
SELECT x FROM T
Assume all static environments are closed.
env0 :: {
T : << closed :: { x: any } >> -- bag of closed structs, each with known field x
}
env1 :: {
T : << closed :: { y: any } >> -- bag of closed structs, each with known field y
}
env2 :: {
T : << open :: { y: any } >> -- bag of open structs, each with known field y
}
env3 :: {
T : << closed :: { x: any } >>, -- bag of closed structs, each with known field x
x : int -- int
}
env4 :: {
T : << closed :: { y: any } >>, -- bag of closed structs, each with known field y
x : int -- int
}
env5 :: {
T : << open :: { y: any } >>, -- bag of open structs, each with known field y
x : int -- int
}
For all environments we have a known global with name T. In the example query, the identifier path T is on the right-hand side of a FROM clause (i.e., it’s a FROM path), so we apply resolution rule 2. In all cases, this FROM path T is resolved to the known global T of the static environment.
Now let’s look at how x is resolved in the various environments.
Env0
In the query, the identifier path x is unambiguously bound to T. This enables us to attempt to resolve the query as
SELECT t.x FROM T as t
We apply rule 3 and find that t.x is possible because the static environment tells us the structs of T have a known field x.
Env1
Unlike env0, we cannot resolve the path t.x via Rule 3 because the static environment tells us that it is impossible for a struct of T to have a field x. Next, Rule 3 tells us to check for a global variable x, but again we find it’s not possible. The variable cannot be resolved, and compilation fails.
Env2
Env2 differs from Env1 because the structs within bag T are defined as having an open schema. While x is not a known field of the structs in T, we do know that it’s possible for such a field to exist. In this case, we resolve the variable x to the path t.x.
Env3
The case for env3 is the same as for env0. The static environment tells us the structs of T have a known field x, so x resolves to t.x before the global x is ever considered.
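The reasoning for env0 through env3 can be sketched as a small "is this field possible?" check. This is a minimal illustration under stated assumptions, not the planner's code: StructType, field_is_possible, resolve_name and ENVS are hypothetical names, struct types are reduced to a set of known fields plus an open/closed flag, and a failed resolution is reported as None where the text says compilation fails.

```python
from dataclasses import dataclass
from typing import AbstractSet


@dataclass(frozen=True)
class StructType:
    fields: AbstractSet[str]  # known field names
    is_open: bool             # open structs may carry unknown fields


def field_is_possible(struct: StructType, name: str) -> bool:
    """A field is possible if it is known, or if the struct is open."""
    return name in struct.fields or struct.is_open


def resolve_name(name, alias, row_type, global_names):
    """Try alias.name first, then a global called name; None means failure."""
    if field_is_possible(row_type, name):
        return ("field", f"{alias}.{name}")
    if name in global_names:
        return ("global", name)
    return None


# Row type of T and the global names for each environment discussed above.
ENVS = {
    "env0": (StructType({"x"}, is_open=False), {"T"}),
    "env1": (StructType({"y"}, is_open=False), {"T"}),
    "env2": (StructType({"y"}, is_open=True), {"T"}),
    "env3": (StructType({"x"}, is_open=False), {"T", "x"}),
}

for env_name, (row_type, global_names) in ENVS.items():
    print(env_name, resolve_name("x", "t", row_type, global_names))
```

The loop reports t.x for env0, env2 and env3, and a failure for env1, matching the walk-through above.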
Resolution with Multiple FROM sources
Now let’s see how x is resolved when the FROM clause has more than one source, again under different static environments.
SELECT x FROM T, S
Assume all static environments are closed.
env0 :: {
T : << closed :: { x: any } >>, -- bag of closed structs, each with known field x
S : << closed :: { y: any } >> -- bag of closed structs, each with known field y
}
env1 :: {
T : << closed :: { z: any } >>, -- bag of closed structs, each with known field z
S : << closed :: { z: any } >>, -- bag of closed structs, each with known field z
x : int -- Global x of type int
}
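The text does not state the outcome for these two environments, so the following sketch only applies the single-source reasoning to each FROM source in turn. Everything in it is an assumption made for illustration: the function resolve_unqualified, the aliases t and s, the (known fields, is_open) encoding of a row type, and in particular the choice to raise an error when more than one source could supply the field, a case the rules above do not cover.

```python
def resolve_unqualified(name, sources, global_names):
    """Resolve an unqualified name against several FROM sources.

    sources maps an alias to (known_fields, is_open) for that source's rows.
    Returns ("field", "alias.name"), ("global", name), or None on failure.
    """
    candidates = [
        alias
        for alias, (known_fields, is_open) in sources.items()
        if name in known_fields or is_open
    ]
    if len(candidates) == 1:
        return ("field", f"{candidates[0]}.{name}")
    if len(candidates) > 1:
        # Assumption: the source does not say what happens here.
        raise ValueError(f"{name} is ambiguous among {candidates}")
    if name in global_names:
        return ("global", name)
    return None


# SELECT x FROM T, S, read as SELECT x FROM T AS t, S AS s
env0_sources = {"t": ({"x"}, False), "s": ({"y"}, False)}
env1_sources = {"t": ({"z"}, False), "s": ({"z"}, False)}

print(resolve_unqualified("x", env0_sources, {"T", "S"}))
print(resolve_unqualified("x", env1_sources, {"T", "S", "x"}))
```

Under these assumptions, x resolves to t.x in env0, because only T can have a field x, and to the global x in env1, because neither closed struct can have one.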
Variable Scopes Example
The scoping rules in this section resolve naming conflicts between names defined in the database environment and variables defined in the variables environment. The potential for such conflicts arises from the nested data of PartiQL, as illustrated next.
Notice there are a few more naming conventions, pertaining to the use
of attribute names defined in the SELECT clause in the GROUP BY
and ORDER BY clauses. These conventions are explained with
the semantics of the respective clauses (see GROUP BY Clause and ORDER BY Clause).
The following example illustrates how SQL compatibility and the
need to navigate into nested data must be carefully reconciled.
Consider the following database that has a table t.c, i.e. a
collection of tuples, and also the named values x.n and y.
t.c: <<
{'a':1, 'n':[{'b':11, 'c':12}]},
{'a':2, 'n':[{'b':21, 'c':22}]}
>>
x.n : << {'b':3} >>
y: {'a':1, 'b':2}
Then consider the query
SELECT t.a
FROM t.c AS x
WHERE x.a IN (SELECT y.b FROM x.n AS y)
This query poses many scoping issues:
- Does x.n refer to the named value x.n or to the n attribute of the variable x? For SQL compatibility purposes it refers to the named value x.n. Read below how to refer to the variable x.
- Does y.b refer to the b attribute of the variable y or to the b attribute of the named value y? For SQL compatibility purposes it refers to the b attribute of the variable y.
Notice how SQL compatibility required the database environment to take
priority over the variables environment in the FROM clause and then, vice
versa, the variables environment to take priority over the database
environment in the SELECT clause.
Assume database names coll, v.foo, w. Then in the query
SELECT v.foo
FROM coll AS v, @v.foo AS w,
(SELECT w.a, u.b FROM @w.bar AS u)
AS x
coll refers to the database name. The v in @v.foo refers to the
variable v. If the @ were not there, v.foo would refer to the database
name v.foo. The w in w.a refers to the variable defined in line 2.
Note, the expressions coll and @v.foo are FROM clause
expressions because they appear in the FROM clause of the
sfw_query of lines 1-4, in which they are immediately nested.
Similarly, the expression @w.bar is a FROM clause expression
because it appears in the FROM clause of the sfw_query of line 3,
in which it is immediately nested. In contrast, the expressions w.a
and u.b are not FROM clause expressions. Though they are nested
into the FROM clause of the query of lines 1-4, they are not
immediately nested into the query of lines 1-4.