Implementing a Table
If there are inaccuracies discovered with this documentation, please submit a GitHub issue.
Introduction
A Table, in partiql-lang-kotlin, represents a value within the database environment.
In some cases, a Table may represent a SQL relational table; in other cases, it may represent an arbitrary PartiQL value (such as an integer, string, or bag).
The PartiQL Library allows for the integration of custom implementations of tables for the planning and execution of PartiQL queries.
Who is this for?
This usage guide is aimed at developers who are implementing a catalog and would like to provide type information and/or physical data for the variables being referenced in either a PartiQL query or plan.
Examples of custom implementations of a Table may represent:
Prerequisites
To get the APIs discussed in this usage guide, please take a dependency on the SPI package.
dependencies {
    implementation("org.partiql:partiql-spi:1.+")
}
This usage guide references many aspects of the Implementing a Catalog usage guide. Please read the usage guide before continuing.
Implementation Overview
In this guide, we will follow the goal of Implementing a Catalog by creating tables that are representative of a set of Ion files.
Let’s start with creating the class. Below, we will assume that the name and path of the table have been resolved via the catalog implementation.
public final class IonTable implements Table {
    private final Name name;
    private final Path path;

    public IonTable(@NotNull Name name, @NotNull Path path) {
        this.name = name;
        this.path = path;
    }

    @NotNull
    @Override
    public Name getName() {
        return this.name;
    }
}
Retrieving Type Information
If you plan on executing queries against your data, it is highly recommended to provide as much type information to the planner as possible. Doing so allows the planner to optimize the execution plan before compiling.
Static Typing
For this example, we will assume that each data/table file is accompanied by a metadata file that contains type information.
The metadata file has the same name as the data file, prefixed with a period (.).
Therefore, a table with name a.b.c will have its data stored in ${HOME}/ionfs/a/b/c.ion and its type information stored in ${HOME}/ionfs/a/b/.c.ion.
${HOME}
└── ionfs
    └── a
        └── b
            ├── c.ion     # table data stored here
            └── .c.ion    # type information stored here
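The naming convention above can be sketched in plain Java. This is only an illustration: the IonTablePaths class and its method names are hypothetical and not part of the PartiQL SPI, and the table name is assumed to arrive as a list of string parts rather than as a Name.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

/** Hypothetical helper: maps a qualified table name to its data and metadata paths. */
final class IonTablePaths {

    private IonTablePaths() {}

    /** Resolves e.g. ["a", "b", "c"] under root to root/a/b/c.ion. */
    static Path dataPath(Path root, List<String> nameParts) {
        if (nameParts.isEmpty()) {
            throw new IllegalArgumentException("A table name needs at least one part.");
        }
        Path current = root;
        // Every part except the last one is treated as a directory.
        for (int i = 0; i < nameParts.size() - 1; i++) {
            current = current.resolve(nameParts.get(i));
        }
        String tableName = nameParts.get(nameParts.size() - 1);
        return current.resolve(tableName + ".ion");
    }

    /** The metadata file sits next to the data file, with a leading period. */
    static Path metadataPath(Path dataPath) {
        String dataFileName = dataPath.getFileName().toString();
        return dataPath.resolveSibling("." + dataFileName);
    }
}
```

For example, IonTablePaths.dataPath(Paths.get("ionfs"), List.of("a", "b", "c")) resolves to ionfs/a/b/c.ion, and passing that result to metadataPath gives ionfs/a/b/.c.ion.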
To retrieve this type information, we’ll modify our constructor to grab the metadata file.
public final class IonTable implements Table {
    private final Name name;
    private final Path dataPath;
    private final Path metadataPath;

    public IonTable(@NotNull Name name, @NotNull Path dataPath) {
        this.name = name;
        this.dataPath = dataPath;
        // Find Metadata File
        String dataFileName = dataPath.getFileName().toString();
        String metadataFileName = "." + dataFileName;
        this.metadataPath = dataPath.getParent().resolve(metadataFileName);
    }
}
When Table#getSchema() is invoked, we will open the metadata file, parse it, and return the associated type information of the table.
private final List<Field> _fieldTypes = new ArrayList<>() {{
    add(Field.of("col1", PType.integer()));
    add(Field.of("col2", PType.string()));
    add(Field.of("col3", PType.bag(PType.integer())));
}};
private final PType _type = PType.bag(PType.row(_fieldTypes));

@NotNull
@Override
public PType getSchema() {
    File metadataFile = metadataPath.toFile();
    if (!metadataFile.exists() || !metadataFile.isFile()) {
        throw new RuntimeException("Internal error: there isn't a metadata file for " + name + ".");
    }
    return getTypeInfo(metadataFile);
}

/**
 * This is a placeholder for retrieving the type information from the metadata file.
 * @param metadataFile the file to open and parse into a {@link PType}.
 * @return the type of the data
 */
@NotNull
private PType getTypeInfo(@NotNull File metadataFile) {
    return _type;
}
Note that, for the above snippet, we didn’t actually open up the metadata file. For the purposes of simplicity, we are always returning a bag of rows with three fields (col1, col2, and col3). For your own use-case, however, you can (and should) implement your own metadata representation, parse it, and retrieve as much accurate type information as possible.
Also, note that for the above snippet, we are opting to open the metadata file on every invocation of Table#getSchema().
We could instead have parsed the type information when constructing the class and cached it.
These are design decisions that implementers of a Table should take into consideration.
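As a rough illustration of the caching option, the sketch below computes a value on first use and reuses it afterwards. The Memoized class is hypothetical and not part of the PartiQL SPI; in an IonTable it could wrap the call that parses the metadata file, so that the file is read at most once.

```java
import java.util.function.Supplier;

/** Hypothetical helper: computes a value on first use and reuses it afterwards. */
final class Memoized<T> implements Supplier<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean loaded = false;

    Memoized(Supplier<T> loader) {
        this.loader = loader;
    }

    @Override
    public synchronized T get() {
        if (!loaded) {
            // The loader runs only on the first call; later calls return the cached value.
            value = loader.get();
            loaded = true;
        }
        return value;
    }
}
```

The trade-off is staleness: a cached schema will not reflect later changes to the metadata file, whereas re-reading the file on every call always returns the current contents at the cost of repeated I/O.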
Dynamic Typing
Now, outside of this example, if you do not have any type information, you may return the dynamic type. Remember, in cases where you have type information, it is strongly advised that you provide it. The implementation below is NOT recommended unless you do not have any type information to offer.
@NotNull
@Override
public PType getSchema() {
    return PType.dynamic();
}
Marking your table as the dynamic type may incur heavy computational costs.
By specifying that the type of the data is dynamic, operations that interact with the data (such as paths, scalar functions, aggregate functions, and arithmetic operators) may be dynamically dispatched, incurring a heavy computational cost relative to their statically typed counterparts.
If you have some type information, you are still strongly encouraged to provide it. In the below code snippet, we are indicating that some of the data’s type information is unknown.
private final List<Field> _fieldTypes = new ArrayList<>() {{
    add(Field.of("col1", PType.integer()));
    add(Field.of("col2", PType.dynamic())); // col2 does not have type info!
    add(Field.of("col3", PType.bag(PType.dynamic()))); // col3 is always a bag, but we don't know what types of elements it has!
}};
private final PType _type = PType.bag(PType.row(_fieldTypes));

@NotNull
@Override
public PType getSchema() {
    return _type;
}
Retrieving Data
When it finally comes to executing PartiQL queries, the evaluator will call Table#getDatum().
If you are not familiar with Datum, please read the Using Datum usage guide.
@NotNull
@Override
public Datum getDatum() {
    File dataFile = dataPath.toFile();
    if (!dataFile.exists() || !dataFile.isFile()) {
        throw new RuntimeException("Internal error: there isn't a data file for " + name + ".");
    }
    return Datum.bag(getRows());
}

/**
 * This is a placeholder for deserializing data from the local Ion file. In this example, we just return
 * a collection of rows.
 */
@NotNull
private Iterable<Datum> getRows() {
    return new ArrayList<>() {{
        add(
            Datum.row(
                _fieldTypes,
                Field.of("col1", Datum.integer(3)),
                Field.of("col2", Datum.string("This is now a string!")),
                Field.of("col3", Datum.bag(Datum.integer(1)))
            )
        );
    }};
}
Above, we didn’t actually open the Ion file to retrieve the data.
However, this example symbolically shows how to create a Datum that is lazily provided (via an Iterable).
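To make the laziness concrete, here is a minimal sketch of an Iterable that converts each source record only when the consumer asks for it. The LazyRows class is hypothetical and not part of the PartiQL SPI. In an IonTable, the source type could be the raw values read from the Ion file and the result type could be Datum, but both of those choices are assumptions of this sketch.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

/** Hypothetical helper: decodes source records one at a time, only on demand. */
final class LazyRows<S, R> implements Iterable<R> {
    private final Iterable<S> source;
    private final Function<S, R> decode;

    LazyRows(Iterable<S> source, Function<S, R> decode) {
        this.source = source;
        this.decode = decode;
    }

    @Override
    public Iterator<R> iterator() {
        // Each call starts a fresh pass over the source.
        Iterator<S> inner = source.iterator();
        return new Iterator<R>() {
            @Override
            public boolean hasNext() {
                return inner.hasNext();
            }

            @Override
            public R next() {
                // Decoding happens here, not when LazyRows is constructed.
                return decode.apply(inner.next());
            }
        };
    }
}
```

Because nothing is decoded until next() is called, a consumer that stops early never pays for the remaining rows. An ArrayList populated up front, by contrast, holds every row in memory before the first one is consumed.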