Implementing a Catalog
If you find inaccuracies in this documentation, please submit a GitHub issue.
Introduction
A catalog represents a single database instance that holds data (tables and objects), user-defined functions, views, and more. The PartiQL Library lets you integrate custom catalogs for the planning and execution of PartiQL queries.
Who is this for?
This usage guide is aimed at developers who want to provide type information, data (of any source or format), or functions (scalar or aggregate) to the PartiQL planner or compiler.
Examples of implementing a custom catalog include:
- Allowing users to query local files (as tables) from your filesystem
- Allowing users to query Amazon S3 files (as tables)
- Providing users with a robust set of built-in functions for their use
Prerequisites
To get the APIs discussed in this usage guide, please take a dependency on the SPI package.
dependencies {
    implementation("org.partiql:partiql-spi:1.+")
}
Implementation Overview
In the following example, we will be creating a Catalog that acts as a wrapper over a set of Ion files in your local filesystem. We will also be providing user-defined scalar and aggregation functions.
final class IonCatalog implements Catalog {
    @NotNull
    @Override
    public String getName() {
        return "ionfs";
    }
}
Providing Data
To provide data (representative of a table or global binding) to the PartiQL ecosystem, we make use of Catalog#resolveTable (for resolution) and Catalog#getTable (for data and type retrieval).
@Nullable
@Override
public Name resolveTable(@NotNull Session session, @NotNull Identifier identifier) {
    // TODO!
    return null;
}

@Nullable
@Override
public Table getTable(@NotNull Session session, @NotNull Name name) {
    // TODO!
    return null;
}
Resolving a Table
For our example database, we have a set of directories and files underneath ${HOME}/ionfs. When resolving an identifier, say "a"."b".c, the path in the filesystem will correspond to ${HOME}/ionfs/a/b/c or ${HOME}/ionfs/a/b/C. Note that c is a non-delimited identifier and therefore represents a case-insensitive lookup. To accommodate this, we must traverse the filesystem, taking into account whether each path step is case-sensitive.
The filesystem may look like this:
${HOME}
└── ionfs
    └── a
        └── b
            └── c
            └── C
            └── foo
An implementation of traversing the filesystem can be seen below.
@Nullable
@Override
public Name resolveTable(@NotNull Session session, @NotNull Identifier identifier) {
    File current = new File(java.lang.System.getenv("HOME"), "ionfs");
    if (!current.exists()) {
        return null;
    }
    List<String> resolved = resolveFilesystemPath(current, new ArrayList<>(), identifier.getParts());
    return resolved == null ? null : Name.of(resolved);
}

@Nullable
private List<String> resolveFilesystemPath(@NotNull File current, @NotNull List<String> resolved, @NotNull List<Identifier.Simple> remaining) {
    if (remaining.isEmpty()) {
        // Every step has been matched, so the last element must be a file.
        return current.isFile() ? resolved : null;
    }
    // Since there are remaining steps, the current file must be a directory
    if (!current.isDirectory()) {
        return null;
    }
    // Get the children files
    File[] children = current.listFiles();
    if (children == null) {
        return null;
    }
    // Search for the next file
    Identifier.Simple next = remaining.get(0);
    boolean isCaseSensitive = !next.isRegular();
    boolean isLastStep = remaining.size() == 1;
    for (File child : children) {
        // The last step names a table, so compare it against the file name without its extension.
        String candidate = isLastStep ? stripExtension(child.getName()) : child.getName();
        boolean matches = isCaseSensitive
            ? candidate.equals(next.getText())
            : candidate.equalsIgnoreCase(next.getText());
        if (!matches) {
            continue;
        }
        resolved.add(candidate);
        List<String> result = resolveFilesystemPath(child, resolved, remaining.subList(1, remaining.size()));
        if (result != null) {
            return result;
        }
        // Dead end: undo this step and try the next sibling (relevant for case-insensitive matches).
        resolved.remove(resolved.size() - 1);
    }
    return null;
}

// Returns the file name without its extension; names without a '.' are returned unchanged.
@NotNull
private static String stripExtension(@NotNull String fileName) {
    int dot = fileName.lastIndexOf('.');
    return dot > 0 ? fileName.substring(0, dot) : fileName;
}
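The traversal above depends on PartiQL SPI types, which makes it awkward to try in isolation. The following is a minimal, self-contained sketch of the same idea using only the JDK. The Step class is a hypothetical stand-in for Identifier.Simple (it is not part of the SPI), and the sketch assumes that the final step is compared against the file name with its extension removed.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class ResolveSketch {

    /** Hypothetical stand-in for Identifier.Simple: one path step and whether it was delimited (quoted). */
    static final class Step {
        final String text;
        final boolean delimited;

        Step(String text, boolean delimited) {
            this.text = text;
            this.delimited = delimited;
        }
    }

    /** Walks the tree one step at a time. Returns the matched on-disk names, or null if nothing matches. */
    static List<String> resolve(File current, List<Step> remaining) {
        if (remaining.isEmpty()) {
            // All steps consumed: only a regular file counts as a table.
            return current.isFile() ? new ArrayList<>() : null;
        }
        File[] children = current.listFiles();
        if (children == null) {
            return null; // current is not a directory (or cannot be read)
        }
        Step next = remaining.get(0);
        boolean last = remaining.size() == 1;
        for (File child : children) {
            String candidate = last ? stripExtension(child.getName()) : child.getName();
            // Delimited steps match exactly; non-delimited steps ignore case.
            boolean matches = next.delimited
                ? candidate.equals(next.text)
                : candidate.equalsIgnoreCase(next.text);
            if (!matches) {
                continue;
            }
            List<String> rest = resolve(child, remaining.subList(1, remaining.size()));
            if (rest != null) {
                rest.add(0, candidate);
                return rest;
            }
            // Otherwise this child was a dead end; keep looking at its siblings.
        }
        return null;
    }

    static String stripExtension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot > 0 ? fileName.substring(0, dot) : fileName;
    }

    /** Creates a temporary tree containing a/b/Foo.ion and returns its root. */
    static Path createDemoTree() {
        try {
            Path root = Files.createTempDirectory("ionfs");
            Path dir = Files.createDirectories(root.resolve("a").resolve("b"));
            Files.createFile(dir.resolve("Foo.ion"));
            return root;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File root = createDemoTree().toFile();
        // "a"."b".foo : the last step is non-delimited, so it matches Foo.ion
        System.out.println(resolve(root, List.of(new Step("a", true), new Step("b", true), new Step("foo", false))));
        // "a"."b"."foo" : the last step is delimited, so the case must match exactly
        System.out.println(resolve(root, List.of(new Step("a", true), new Step("b", true), new Step("foo", true))));
    }
}
```

Running main prints [a, b, Foo] for the non-delimited lookup and null for the delimited one. Returning the on-disk spelling rather than the text the user typed means the resolved name can later be mapped back to a file without repeating the case-insensitive search.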
Retrieving a Table’s Type Information and Data
After the table has been resolved, the PartiQL planner and compiler may call Catalog#getTable with the resolved Name to retrieve the underlying data and/or type information. In our example, we need to convert the name to a file path and pass it to our implementation of a Table. If you haven’t done so already, please take a look at the Implementing a Table usage guide!
@Nullable
@Override
public Table getTable(@NotNull Session session, @NotNull Name name) {
    // Rebuild the path under the same root that resolveTable searches: ${HOME}/ionfs
    List<String> path = new ArrayList<>();
    path.add("ionfs");
    path.addAll(Arrays.asList(name.getNamespace().getLevels()));
    // Resolved names carry no file extension, so add it back
    path.add(name.getName() + ".ion");
    Path p = Path.of(java.lang.System.getenv("HOME"), path.toArray(new String[0]));
    if (!p.toFile().exists()) {
        return null;
    }
    return new IonTable(name, p); // This is implemented in another usage guide. See above.
}
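The name-to-path conversion can also be looked at on its own. The sketch below is a JDK-only illustration in which the namespace levels and the table name are passed as plain strings instead of a Name; the class and method names (TablePathSketch, toTablePath) are hypothetical, and the root directory is a parameter so that the example does not depend on any environment variable.

```java
import java.nio.file.Path;
import java.util.List;

final class TablePathSketch {

    /**
     * Builds root/level1/.../levelN/tableName.ion.
     * The ".ion" suffix mirrors the example catalog, where table names are stored without an extension.
     */
    static Path toTablePath(Path root, List<String> namespace, String tableName) {
        Path p = root;
        for (String level : namespace) {
            p = p.resolve(level);
        }
        return p.resolve(tableName + ".ion");
    }

    public static void main(String[] args) {
        Path root = Path.of("home", "ionfs");
        // Namespace [a, b] and table name c map to the file c.ion inside home/ionfs/a/b
        System.out.println(toTablePath(root, List.of("a", "b"), "c"));
    }
}
```

Keeping the root as an explicit argument makes it easier to check that resolution and retrieval agree on where the catalog's files live.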
Providing Scalar Functions
Before providing functions, readers should follow the instructions in the Implementing Scalar Functions usage guide.
To provide scalar functions to your catalog, please override the getFunctions(Session session, String name) method on Catalog.
// Maps each function name to its overloads. The is_directory, is_symlink, and is_file
// overloads are assumed to be defined as described in the Implementing Scalar Functions usage guide.
@NotNull
private final Map<String, List<FnOverload>> functions = Map.of(
    "is_directory", List.of(is_directory),
    "is_symlink", List.of(is_symlink),
    "is_file", List.of(is_file)
);

@NotNull
@Override
public Collection<FnOverload> getFunctions(@NotNull Session session, @NotNull String name) {
    // Return an empty collection (never null) when the catalog has no function with this name
    return functions.getOrDefault(name, List.of());
}
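The lookup pattern itself does not depend on the SPI, so it can be sketched with the JDK alone. In the following example, Overload is a hypothetical stand-in for FnOverload, and the signature strings are purely illustrative. It shows one way to honor the @NotNull return contract: unknown names yield an empty collection instead of null.

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;

final class FunctionRegistrySketch {

    /** Hypothetical stand-in for FnOverload; it only carries a descriptive label. */
    static final class Overload {
        final String signature;

        Overload(String signature) {
            this.signature = signature;
        }
    }

    // One entry per function name; each name may have several overloads.
    private static final Map<String, List<Overload>> FUNCTIONS = Map.of(
        "is_directory", List.of(new Overload("is_directory(string) -> bool")),
        "is_file", List.of(new Overload("is_file(string) -> bool"))
    );

    /** Returns the overloads registered under the given name, or an empty collection if there are none. */
    static Collection<Overload> getFunctions(String name) {
        return FUNCTIONS.getOrDefault(name, List.of());
    }

    public static void main(String[] args) {
        System.out.println(getFunctions("is_file").size());      // 1
        System.out.println(getFunctions("no_such_fn").isEmpty()); // true
    }
}
```

Note that this sketch matches function names exactly as given; whether a real catalog should normalize case before the lookup is a design decision this example does not address.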