PartiQL Data Model

EBNF Grammar for PartiQL Values
value            = absent_value
                 | scalar_value
                 | tuple_value
                 | collection_value ;
absent_value     = "NULL"
                 | "MISSING" ;
scalar_value     = "`", ? ion literal ?, "`"
                 | ? sql literal ? ;
tuple_value      = "{" [ string_value, ":", value { ",", string_value, ":", value }] "}" ; (1)
collection_value = array_value
                 | bag_value ;
array_value      = "[", [ value , { "," , value } ], "]" ;
bag_value        = "<<", [ value , { "," , value } ], ">>" ;
1 value cannot be MISSING

PartiQL Values

Example Value
{
  'fileVersion':'1.0',
  'configurationItems':[
    {
        'configurationItemCaptureTime': `2016-08-03T08:56:52.415Z`,
        'resourceId':'h-0337bfe6793cf9e0c',
        'configuration':{
          'hostId':'h-0337bfe6793cf9e0c',
          'hostProperties':{
              'sockets':2,
              'cores':20,
              'totalVCpus':32,
              'instanceType':'m4.medium'
          },
        'tags':{
          'CostCenter':'Prod'
        },
    },
    {
        'configurationItemCaptureTime':`2016-08-03T09:41:56.906Z`,
        'resourceId':'h-0337bfe6793cf9e0c',
        'configurationStateId':3,
        'configuration':{
          'hostId':'h-0337bfe6793cf9e0c',
          'autoPlacement':'off',
          'hostProperties':{
              'sockets':2,
              'cores':20,
              'totalVCpus':32,
              'instanceType':'m3.medium'
          },
        'tags':{
        },
    }
  ]
}

A PartiQL database generally contains one or more PartiQL named values. A name, is an identifier, such as a table name, that is associated with a PartiQL value. Environments defines what these names are, and how SQL qualified names work, in detail.

The type of a value is absent, scalar, tuple, or collection. Further subtyping applies to scalars, tuples, and collections. We will often use the name complex value to refer to any non-scalar and non-absent value. That is, complex values include tuples and collections. A tuple is a set of attribute name/value pairs, where each name is a string (as in SQL). A tuple in the data model is unordered.

PartiQL’s data model extends SQL to Ion’s type system to cover schema-less and nested data. Such values can be directly quoted with backticks.

Unlike SQL, PartiQL allows the possibility of duplicate attribute names, in the interest of compatibility with non-strict JSON/Ion datasets. However PartiQL does not encourage duplicate attribute names; navigation into tuples with the conventional dot notation (paths) is tuned to the assumption that the attribute names are unique.

A collection_value is either an ordered or unordered (EBNF lines 10-11) collection. Both arrays and bags may contain duplicate elements. An array is ordered (similar to a JSON array or Ion list) and each element is accessible by its ordinal position (see specifics of access by position in paths). Arrays are delimited with [ and ]. For example, the value of the attribute configurationItems in the example value is an array. Arrays have size, which is not explicitly denoted but is implied by the number of elements in the array. For example, the size of the configurationItems in example value is 2. The first element of an array corresponds to index 0; the last element corresponds to index size minus one.

In contrast, a bag is unordered (similar to a SQL table) and its elements cannot be accessed by ordinal position. Bags are denoted with and << and >>.

Finally, note that PartiQL has two kinds of absent values:`NULL` and MISSING. The motivation is as follows: Unlike SQL, where a query that refers to a non-existent attribute name is expected to fail during compilation, in semi-structured data one expects a query to operate even if some of the tuples do not define some of the attributes that the query’s paths mention. Hence PartiQL contains the special value MISSING (EBNF line 6), which is the path result in cases where navigation fail to bind to any information. The distinction between MISSING and NULL enables retaining the original distinction between a missing attribute and a null-valued attribute. The utility of MISSING (as opposed to just having NULL) will become further apparent when navigation into semi-structured data and construction of semi-structured results is discussed.

The value MISSING may not appear as an attribute value. Notice that in the interest of readability, the syntax of values does exclude these cases; rather the "value cannot be MISSING" call-out indicates that MISSING cannot appear as an attribute value.

Comparisons to the Relational Model

In summary, the PartiQL data model extended the SQL data model in the following ways:

  1. The elements of an array/bag can be any kind of value—not just tuples. Furthermore they can be heterogeneous. That is, there are no restrictions between the elements of an array/bag. For example, the two tuples in configurationItems array of are heterogeneous because: (i) each tuple has a different set of attributes (for example, the second tuple has configurationStateId while the first does not), (ii) an attribute of a first tuple may map to some type while the same attribute in a second tuple may map to another type.

  2. More broadly, unlike SQL where the values are tables that have homogeneous tuples that have scalars, PartiQL complex values are arbitrary compositions of arrays, bags and tuples — e.g. the top level value of this example value is a tuple and the configurationItems array has two heterogeneous tuples. Note that in this example, the top-level name refers to a value that is not a bag (e.g. a table).

  3. There is a distinction between null-valued attributes and missing attributes.

  4. PartiQL makes an explicit distinction between arrays and bags, where the former have order and their elements can be addressed by ordinal position. [1]


1. Despite the “SQL table is a bag” and “the results of an SQL query is a table” statements of SQL textbooks, SQL also recognizes that the result of a query that has an ORDER BY is a list, i.e., an ordered collection of tuples.