Scope

As far as the variables environment is concerned, the scoping rules are identical to those of SQL.

Scoping rules

The scoping rules discussed in the present section discuss the resolution of naming conflicts between names defined in the database environment and the variables of the environment variables. The potential for such naming conflicts is driven by the nested data of PartiQL, as illustrated next.

Notice there are a few more naming conventions, pertaining to the use of attribute names defined in the SELECT clause into the GROUP BY and ORDER BY clause. These conventions are explained with the semantics of the respective clauses (see GROUP BY Clause and ORDER BY Clause).

Example 1.  

The following example illustrates how SQL compatibility issues and the needs of navigating into nested data need to be carefully merged together. Consider the following database that has a table c, i.e. a collection of tuples, and also named data x.n and y.

t.c: <<
    {'a':1, 'n':[{'b':11, 'c':12}]},
    {'a':2, 'n':[{'b':21, 'c':22}]}
>>
x.n : << {'b':3} >>
y: {'a':1, 'b':2}

Then consider the query

SELECT t.a
FROM t.c AS x
WHERE x.a IN (SELECT y.b FROM x.n AS y)

This query poses many scoping issues:

  1. Does x.n refer to the named value x.n or to the n attribute of the variable x? For SQL compatibility purposes it refers to the named value x.n. Read below how to refer to the variable x.

  2. Does y.b refer to the b attribute of the y attribute or to the b attribute of the named value y? For SQL compatibility purposes it refers to the b attribute of the variable y.

Notice how SQL compatibility required the database environment to take priority over the variables environment in the FROM clause and then, vice versa, the variables environment to take priority over the database environment in the SELECT clause.

Naming Conflicts

Since the rules are easier to express when all database names are a single identifier, such as thedb or "the db" (as opposed to paths, such as somedb.sometable), we first specify the scoping rules under the assumption that all database names are a single identifier. We remove the assumption and generalize later.

In the absence of schema the following rules apply

  1. @identifier refers to the environment variable named identifier; if there is no such environment variable, the identifier refers to the database name identifier; if there is no such database name either, the query fails compilation.

  2. in a FROM clause path that starts with identifier, the identifier refers to the database name identifier; if there is no such database name, the refers to a variable; otherwise query fails compilation. [1]

  3. in a non-FROM clause path that starts with identifier, the identifier refers to the environment variable named identifier; if there is no such environment variable, the identifier refers to the database name identifier; if there is no such database name either, the query fails compilation.

Next, we generalize to also allow for the possibility of database names of the form identifier.identifier. …​. The following rules apply regarding the semantics of \$i_1.i_2. ... .i_n\$, where \$i_1, i_2, ..., i_n\$ are identifiers.

  • \$\@i_1.i_2. ... .i_n\$ always refers to the environment variable named \$i_1\$; if there is no such variable and \$i_1.i_2. ... .i_m, m <= n\$ is a database name then \$i_1.i_2. ... .i_m\$ refers to such named database name. Again, if there is a choice, choose the largest \$m\$. If both the resolution to variable and the resolution to database name, fail the query during compilation.

  • if \$i_1.i_2. ... .i_n\$ is a FROM path and \$i_1.i_2. ... .i_m, m <= n\$ is a database name then \$i_1.i_2. ... .i_m\$ refers to such named database name and \$i_{m+1}. ... .i_n\$ is a series of tuple path navigations starting from the database name \$i_1.i_2. ... .i_m\$. If there is a choice, choose the largest \$m\$, i.e., the longest database name.

  • if \$i_1.i_2. ... .i_n\$ is a non-FROM clause expression and \$i_1\$ is an environment variable then \$i_1\$ refers to such variable; if there is no such variable and \$i_1.i_2. ... .i_m, m <= n\$ is a database name then \$i_1.i_2. ... .i_m\$ refers to such named database name. Again, if there is a choice, choose the largest \$m\$. If both the resolution to variable and the resolution to database name, fail the query during compilation.

Example 2.  

Assume database names coll, v.foo, w. Then in the query

SELECT v.foo
FROM coll AS v, @v.foo AS w,
     (SELECT w.a, u.b FROM @w.bar AS u)
         AS x

coll refers to the database name. The v in @v.foo refers to the variable v. If the @ were not there, v.foo would refer to the database name v.foo. The w in w.a refers to the variable defined in line 2.

Note, the expressions coll and @v.foo are FROM clause expressions because they appear in the FROM clause of the sfw_query of lines 1-4, in which they are immediately nested. Similarly, the expression @w.bar is a FROM clause expression because it appears in the FROM clause of the sfw_query of line 3, in which it is immediately nested. In contrast, the expressions w.a and u.b are not FROM clause expressions. Though they are nested into the FROM clause of the query of lines 1-4, they are not immediately nested into the query of lines 1-4.


1. A path is a FROM clause path if it appears in the FROM clause of the SFW query in which it is immediately nested.