Scope
As far as the variables environment is concerned, the scoping rules are identical to those of SQL.
Scoping rules
The scoping rules discussed in the present section discuss the resolution of naming conflicts between names defined in the database environment and the variables of the environment variables. The potential for such naming conflicts is driven by the nested data of PartiQL, as illustrated next.
Notice there are a few more naming conventions, pertaining to the use
of attribute names defined in the SELECT
clause into the GROUP BY
and ORDER BY
clause. These conventions are explained with
the semantics of the respective clauses (see GROUP BY Clause and ORDER BY Clause).
The following example illustrates how SQL compatibility issues and the
needs of navigating into nested data need to be carefully merged
together. Consider the following database that has a table c
, i.e. a
collection of tuples, and also named data x.n
and y
.
t.c: <<
{'a':1, 'n':[{'b':11, 'c':12}]},
{'a':2, 'n':[{'b':21, 'c':22}]}
>>
x.n : << {'b':3} >>
y: {'a':1, 'b':2}
Then consider the query
SELECT t.a
FROM t.c AS x
WHERE x.a IN (SELECT y.b FROM x.n AS y)
This query poses many scoping issues:
-
Does
x.n
refer to the named valuex.n
or to then
attribute of the variablex
? For SQL compatibility purposes it refers to the named valuex.n
. Read below how to refer to the variablex
. -
Does
y.b
refer to theb
attribute of they
attribute or to theb
attribute of the named valuey
? For SQL compatibility purposes it refers to theb
attribute of the variabley
.
Notice how SQL compatibility required the database environment to take
priority over the variables environment in the FROM
clause and then, vice
versa, the variables environment to take priority over the database
environment in the SELECT
clause.
Naming Conflicts
Since the rules are easier to express when all database
names are a single identifier, such as thedb
or "the db"
(as
opposed to paths, such as somedb.sometable
), we first specify the
scoping rules under the assumption that all database names are a
single identifier. We remove the assumption and generalize later.
In the absence of schema the following rules apply
-
@identifier
refers to the environment variable namedidentifier
; if there is no such environment variable, theidentifier
refers to the database nameidentifier
; if there is no such database name either, the query fails compilation. -
in a
FROM
clause path that starts withidentifier
, theidentifier
refers to the database nameidentifier
; if there is no such database name, the refers to a variable; otherwise query fails compilation. [1] -
in a non-
FROM
clause path that starts withidentifier
, theidentifier
refers to the environment variable namedidentifier
; if there is no such environment variable, theidentifier
refers to the database nameidentifier
; if there is no such database name either, the query fails compilation.
Next, we generalize to also allow for the possibility of database
names of the form identifier.identifier. …
. The following rules
apply regarding the semantics of \$i_1.i_2. ... .i_n\$, where
\$i_1, i_2, ..., i_n\$ are identifiers.
-
\$\@i_1.i_2. ... .i_n\$ always refers to the environment variable named \$i_1\$; if there is no such variable and \$i_1.i_2. ... .i_m, m <= n\$ is a database name then \$i_1.i_2. ... .i_m\$ refers to such named database name. Again, if there is a choice, choose the largest \$m\$. If both the resolution to variable and the resolution to database name, fail the query during compilation.
-
if \$i_1.i_2. ... .i_n\$ is a
FROM
path and \$i_1.i_2. ... .i_m, m <= n\$ is a database name then \$i_1.i_2. ... .i_m\$ refers to such named database name and \$i_{m+1}. ... .i_n\$ is a series of tuple path navigations starting from the database name \$i_1.i_2. ... .i_m\$. If there is a choice, choose the largest \$m\$, i.e., the longest database name. -
if \$i_1.i_2. ... .i_n\$ is a non-
FROM
clause expression and \$i_1\$ is an environment variable then \$i_1\$ refers to such variable; if there is no such variable and \$i_1.i_2. ... .i_m, m <= n\$ is a database name then \$i_1.i_2. ... .i_m\$ refers to such named database name. Again, if there is a choice, choose the largest \$m\$. If both the resolution to variable and the resolution to database name, fail the query during compilation.
Assume database names coll
, v.foo
, w
. Then in the query
SELECT v.foo
FROM coll AS v, @v.foo AS w,
(SELECT w.a, u.b FROM @w.bar AS u)
AS x
coll
refers to the database name. The v
in @v.foo
refers to the
variable v
. If the @
were not there, v.foo
would refer to the database
name v.foo
. The w
in w.a
refers to the variable defined in line 2.
Note, the expressions coll
and @v.foo
are FROM
clause
expressions because they appear in the FROM
clause of the
sfw_query of lines 1-4, in which they are immediately nested.
Similarly, the expression @w.bar
is a FROM
clause expression
because it appears in the FROM
clause of the sfw_query of line 3,
in which it is immediately nested. In contrast, the expressions w.a
and u.b
are not FROM
clause expressions. Though they are nested
into the FROM
clause of the query of lines 1-4, they are not
immediately nested into the query of lines 1-4.
FROM
clause path if it appears in the FROM
clause of the SFW query in which it is immediately nested.