Implementing a Catalog

If you find any inaccuracies in this documentation, please submit a GitHub issue.

Introduction

A catalog represents a single database instance that holds data (tables and objects), user-defined functions, views, and more. The PartiQL Library lets you integrate custom catalogs for the planning and execution of PartiQL queries.

Who is this for?

This usage guide is aimed at developers who want to provide type information, data (of any source or format), or functions (scalar or aggregate) to the PartiQL planner or compiler.

Examples of implementing a custom catalog include:

  1. Allowing users to query local files (as tables) from your filesystem

  2. Allowing users to query Amazon S3 files (as tables)

  3. Providing users with a robust set of built-in functions for their use

Prerequisites

To get the APIs discussed in this usage guide, please take a dependency on the SPI package.

build.gradle.kts
dependencies {
    implementation("org.partiql:partiql-spi:1.+")
}

Implementation Overview

In the following example, we will be creating a Catalog that acts as a wrapper over a set of Ion files in your local filesystem. We will also be providing user-defined scalar and aggregation functions.

IonCatalog.java
final class IonCatalog implements Catalog {

    @NotNull
    @Override
    public String getName() {
        return "ionfs";
    }
}

Providing Data

To provide data (representative of a table or global binding) to the PartiQL ecosystem, we make use of Catalog#resolveTable (for resolution) and Catalog#getTable (for data and type retrieval).

IonCatalog.java
    @Nullable
    @Override
    public Name resolveTable(@NotNull Session session, @NotNull Identifier identifier) {
        // TODO!
        return null;
    }

    @Nullable
    @Override
    public Table getTable(@NotNull Session session, @NotNull Name name) {
        // TODO!
        return null;
    }

Resolving a Table

For our example database, we have a set of directories and files underneath ${HOME}/ionfs. When resolving an identifier, say "a"."b".c, the path in the filesystem will correspond to either ${HOME}/ionfs/a/b/c or ${HOME}/ionfs/a/b/C (file extensions are omitted here; Catalog#getTable appends .ion when it locates the file). Note that "a" and "b" are delimited identifiers and must match exactly, whereas c is a non-delimited identifier and therefore represents a case-insensitive lookup. To accommodate this, we must traverse the filesystem, taking into consideration whether each path step is case-sensitive or not.

The filesystem may look like this:

${HOME}
└── ionfs
    └── a
        └── b
            ├── c
            ├── C
            └── foo
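
To make the matching rule concrete, here is a minimal, library-free sketch. The Step class is a hypothetical stand-in for one identifier step (it is not part of the PartiQL SPI), and the child names are held in memory rather than read from disk. It lists every child that a given step could refer to.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class StepMatchingSketch {

    /** Hypothetical stand-in for one identifier step: its text and whether it was delimited (quoted). */
    static final class Step {
        final String text;
        final boolean delimited;

        Step(String text, boolean delimited) {
            this.text = text;
            this.delimited = delimited;
        }
    }

    /** Returns every child name the step could refer to, in listing order. */
    static List<String> candidates(List<String> children, Step step) {
        List<String> out = new ArrayList<>();
        for (String child : children) {
            // Delimited steps compare exactly; non-delimited steps ignore case.
            boolean hit = step.delimited
                    ? child.equals(step.text)
                    : child.equalsIgnoreCase(step.text);
            if (hit) {
                out.add(child);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList("c", "C", "foo");
        System.out.println(candidates(children, new Step("c", false)));  // [c, C]
        System.out.println(candidates(children, new Step("C", true)));   // [C]
        System.out.println(candidates(children, new Step("FOO", true))); // []
    }
}
```

Because a non-delimited step can have more than one candidate, a traversal that commits to the first match may need to back out and try the next one if the rest of the path does not resolve.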

One way to traverse the filesystem is shown below.

IonCatalog.java
    // JDK imports used below: java.io.File, java.util.ArrayList, java.util.List
    @Nullable
    @Override
    public Name resolveTable(@NotNull Session session, @NotNull Identifier identifier) {
        String home = java.lang.System.getenv("HOME");
        if (home == null) {
            return null;
        }
        File root = new File(home, "ionfs");
        if (!root.isDirectory()) {
            return null;
        }
        List<String> resolved = resolveFilesystemPath(root, new ArrayList<>(), identifier.getParts());
        return resolved == null ? null : Name.of(resolved);
    }

    @Nullable
    private List<String> resolveFilesystemPath(@NotNull File current, @NotNull List<String> resolved, @NotNull List<Identifier.Simple> remaining) {
        if (remaining.isEmpty()) {
            // Every step has been consumed: the match must be a file, and at least one step must have been taken.
            if (resolved.isEmpty() || !current.isFile()) {
                return null;
            }
            return resolved;
        }

        // Since there are remaining steps, the current file must be a directory.
        // listFiles() returns null when it is not a directory or cannot be read.
        File[] children = current.listFiles();
        if (children == null) {
            return null;
        }

        // Search for the next step among the children.
        Identifier.Simple next = remaining.get(0);
        boolean isCaseSensitive = !next.isRegular();
        List<Identifier.Simple> rest = remaining.subList(1, remaining.size());
        for (File child : children) {
            String childName = child.getName();
            if (child.isFile()) {
                // Only .ion files are tables; they are matched (and named) without the extension.
                if (!childName.endsWith(".ion")) {
                    continue;
                }
                childName = childName.substring(0, childName.length() - ".ion".length());
            }
            boolean matches = isCaseSensitive
                    ? childName.equals(next.getText())
                    : childName.equalsIgnoreCase(next.getText());
            if (!matches) {
                continue;
            }
            resolved.add(childName);
            List<String> result = resolveFilesystemPath(child, resolved, rest);
            if (result != null) {
                return result;
            }
            // Dead end: undo this step and try the next candidate (e.g. C after c).
            resolved.remove(resolved.size() - 1);
        }
        return null;
    }

Retrieving a Table’s Type Information and Data

After the table has been resolved, the PartiQL planner and compiler may call Catalog#getTable with the resolved Name to retrieve the underlying data and/or type information. In our example, we need to convert the name to a file path, and pass this to our implementation of a Table. If you haven’t done so already, please take a look at the Implementing a Table usage guide!

IonCatalog.java
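    // JDK imports used below: java.nio.file.Files, java.nio.file.Path, java.util.ArrayList, java.util.Arrays, java.util.List
    @Nullable
    @Override
    public Table getTable(@NotNull Session session, @NotNull Name name) {
        String home = java.lang.System.getenv("HOME");
        if (home == null) {
            return null;
        }

        // Create the path relative to ${HOME}: ionfs/<namespace levels>/<table name>.ion
        List<String> path = new ArrayList<>();
        path.add("ionfs");
        path.addAll(Arrays.asList(name.getNamespace().getLevels()));
        path.add(name.getName() + ".ion");

        Path p = Path.of(home, path.toArray(new String[0]));
        if (!Files.isRegularFile(p)) {
            return null;
        }
        return new IonTable(name, p); // IonTable is implemented in the Implementing a Table usage guide. See above.
    }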
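
The name-to-path conversion can also be tried in isolation. The following is a minimal sketch that assumes nothing from the PartiQL SPI: tablePath and its parameters are hypothetical stand-ins for the catalog root directory, the namespace levels, and the table name.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class TablePathSketch {

    /** Builds root/level1/.../levelN/tableName.ion. All parameters are hypothetical stand-ins. */
    static Path tablePath(Path root, String[] namespaceLevels, String tableName) {
        Path p = root;
        for (String level : namespaceLevels) {
            p = p.resolve(level);
        }
        // Table files are assumed to carry the .ion extension.
        return p.resolve(tableName + ".ion");
    }

    public static void main(String[] args) {
        Path root = Paths.get("home", "user", "ionfs");
        Path p = tablePath(root, new String[] {"a", "b"}, "C");
        System.out.println(p.equals(Paths.get("home", "user", "ionfs", "a", "b", "C.ion"))); // true
    }
}
```

Note that the root passed in already includes the ionfs directory, so the resulting path lands in the same tree that table resolution walks.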

Providing Scalar Functions

Before providing functions, readers should follow the instructions in the Implementing Scalar Functions usage guide.

To provide scalar functions from your catalog, override Catalog#getFunctions, which receives the session and the function name and returns every overload registered under that name.

IonCatalog.java
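    // JDK imports used below: java.util.ArrayList, java.util.Collection, java.util.Collections,
    // java.util.HashMap, java.util.List, java.util.Map
    // is_directory, is_symlink, and is_file are FnOverload instances defined elsewhere in this class.
    @NotNull
    private final Map<String, List<FnOverload>> functions = new HashMap<String, List<FnOverload>>() {{
        put("is_directory", new ArrayList<FnOverload>() {{
            add(is_directory);
        }});
        put("is_symlink", new ArrayList<FnOverload>() {{
            add(is_symlink);
        }});
        put("is_file", new ArrayList<FnOverload>() {{
            add(is_file);
        }});
    }};

    @NotNull
    @Override
    public Collection<FnOverload> getFunctions(@NotNull Session session, @NotNull String name) {
        // Return an empty collection (never null) when no function has this name.
        return functions.getOrDefault(name, Collections.emptyList());
    }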
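
The lookup pattern above, a map from function name to a list of overloads, can be exercised without the library. This is a minimal sketch under stated assumptions: Overload is a hypothetical stand-in (a name plus a parameter count) and is not the SPI's FnOverload, and the two-parameter is_file variant exists only to show several overloads sharing one name.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FunctionRegistrySketch {

    /** Hypothetical stand-in for a function overload: a name and a parameter count. */
    static final class Overload {
        final String name;
        final int arity;

        Overload(String name, int arity) {
            this.name = name;
            this.arity = arity;
        }
    }

    private final Map<String, List<Overload>> byName = new HashMap<>();

    /** Registers an overload under its name; several overloads may share one name. */
    void register(Overload overload) {
        byName.computeIfAbsent(overload.name, k -> new ArrayList<>()).add(overload);
    }

    /** Returns all overloads for the name, or an empty collection if there are none. */
    Collection<Overload> lookup(String name) {
        return byName.getOrDefault(name, Collections.emptyList());
    }

    public static void main(String[] args) {
        FunctionRegistrySketch registry = new FunctionRegistrySketch();
        registry.register(new Overload("is_file", 1));
        registry.register(new Overload("is_file", 2)); // illustrative second overload
        registry.register(new Overload("is_directory", 1));

        System.out.println(registry.lookup("is_file").size());    // 2
        System.out.println(registry.lookup("is_symlink").size()); // 0
    }
}
```

Returning an empty collection for unknown names keeps the caller from having to handle null, which matches the @NotNull return type on the overridden method.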