Basic Query Generation
This section covers fundamental query generation patterns using Beamline’s rand-select-all-fw strategy, which generates simple SELECT * queries with WHERE clauses. This is the best starting point for understanding how query generation works.
Getting Started with Basic Queries
Simple Transaction Data
Let’s use the simple_transactions.ion script from the test suite as our data source:
rand_processes::{
test_data: rand_process::{
$r: Uniform::{ choices: [5,10] },
$arrival: HomogeneousPoisson:: { interarrival: milliseconds::$r },
$data: {
transaction_id: UUID::{ nullable: false },
marketplace_id: UniformU8::{ nullable: false },
country_code: Regex::{ pattern: "[A-Z]{2}" },
created_at: Instant,
completed: Bool,
description: LoremIpsum::{ min_words:10, max_words:200 },
price: UniformDecimal::{ low: 2.99, high: 99999.99, optional: true }
}
}
}
This script creates transaction data with various field types that the query generator can reference.
Basic Query Generation Command
beamline query basic \
--seed 1234 \
--start-auto \
--script-path simple_transactions.ion \
--sample-count 3 \
rand-select-all-fw \
--tbl-flt-rand-min 1 \
--tbl-flt-rand-max 1 \
--tbl-flt-path-depth-max 1 \
--tbl-flt-pathstep-internal-all \
--tbl-flt-pathstep-final-project \
--tbl-flt-type-final-scalar \
--pred-lt
Generated Output:
SELECT * FROM test_data AS test_data WHERE (test_data.marketplace_id < -5)
SELECT * FROM test_data AS test_data WHERE (test_data.price < 18.418581624952935)
SELECT * FROM test_data AS test_data WHERE (test_data.price < 15.495327785402296)
Understanding the Parameters
--tbl-flt-rand-min 1 --tbl-flt-rand-max 1: Generate exactly 1 predicate per query--tbl-flt-path-depth-max 1: Use only top-level fields (no nested paths)--tbl-flt-pathstep-final-project: Final path step is field projection (.field)--tbl-flt-type-final-scalar: Only reference scalar values (numbers, strings, booleans)--pred-lt: Use only less-than (<) predicates
Different Predicate Types
Comparison Predicates
# Less than predicates
beamline query basic \
--seed 100 \
--start-auto \
--script-path simple_transactions.ion \
--sample-count 3 \
rand-select-all-fw \
--tbl-flt-rand-min 1 \
--tbl-flt-rand-max 1 \
--pred-lt
Output:
SELECT * FROM test_data AS test_data WHERE (test_data.price < 123.45)
SELECT * FROM test_data AS test_data WHERE (test_data.marketplace_id < 42)
Equality Predicates
# Equality predicates
beamline query basic \
--seed 200 \
--start-auto \
--script-path simple_transactions.ion \
--sample-count 3 \
rand-select-all-fw \
--tbl-flt-rand-min 1 \
--tbl-flt-rand-max 1 \
--pred-eq
Output:
SELECT * FROM test_data AS test_data WHERE (test_data.completed = true)
SELECT * FROM test_data AS test_data WHERE (test_data.country_code = 'US')
All Predicate Types
# Use all available predicates for comprehensive testing
beamline query basic \
--seed 300 \
--start-auto \
--script-path simple_transactions.ion \
--sample-count 5 \
rand-select-all-fw \
--tbl-flt-rand-min 1 \
--tbl-flt-rand-max 3 \
--pred-all
Output:
SELECT * FROM test_data AS test_data WHERE (test_data.country_code IN [
'Graecos quidem legendos.',
'Possit et sine.'
] OR (NOT ((test_data.description IS MISSING)) OR
(test_data.description IS MISSING)))
SELECT * FROM test_data AS test_data WHERE (((test_data.transaction_id IS NULL)
AND (test_data.created_at IS NULL)) OR (((test_data.completed IN [
false,
false
] OR NOT ((test_data.completed IS NULL))) AND
((NOT ((test_data.price IS NULL)) OR
(test_data.transaction_id LIKE 'Vidisse.' AND
(test_data.country_code IS NULL))) AND
NOT ((test_data.description IS MISSING)))) OR
(test_data.description <> 'Nec vero.')))
Multiple Predicates
Combining Predicates
Increase predicate count to create more complex WHERE clauses:
# Generate queries with 2-5 predicates
beamline query basic \
--seed 400 \
--start-auto \
--script-path simple_transactions.ion \
--sample-count 3 \
rand-select-all-fw \
--tbl-flt-rand-min 2 \
--tbl-flt-rand-max 5 \
--pred-all
Example Output:
SELECT * FROM test_data AS test_data
WHERE (((((test_data.country_code <> 'Qua maxime ceterorum.') AND
(NOT (test_data.completed IN [ false, true ]) OR
(test_data.description = 'Non faciant.'))) AND
(NOT ((test_data.price IS MISSING)) AND (test_data.price IS MISSING))) OR
test_data.price IN [
-47.936734585045905,
-0.8509689800217544,
24.263479438050297
]) OR ((test_data.created_at = UTCNOW()) OR
(NOT ((test_data.country_code IS MISSING)) AND
(test_data.description IS MISSING))))
Field Types and Query Generation
Numeric Fields
The query generator creates appropriate predicates for numeric types:
rand_processes::{
numeric_test: rand_process::{
$arrival: HomogeneousPoisson:: { interarrival: seconds::1 },
$data: {
count: UniformI32::{ low: 1, high: 1000 },
price: UniformDecimal::{ low: 9.99, high: 999.99 },
score: NormalF64::{ mean: 75.0, std_dev: 15.0 }
}
}
}
Generated Queries:
SELECT * FROM numeric_test WHERE (numeric_test.count > 500)
SELECT * FROM numeric_test WHERE (numeric_test.price BETWEEN 50.0 AND 200.0)
SELECT * FROM numeric_test WHERE (numeric_test.score <= 85.5)
String Fields
String generators produce string-appropriate predicates:
rand_processes::{
string_test: rand_process::{
$arrival: HomogeneousPoisson:: { interarrival: seconds::1 },
$data: {
name: LoremIpsumTitle,
email: Format::{ pattern: "user{UUID}@example.com" },
country: Regex::{ pattern: "[A-Z]{2}" },
description: LoremIpsum::{ min_words: 5, max_words: 50 }
}
}
}
Generated Queries:
SELECT * FROM string_test WHERE (string_test.country = 'US')
SELECT * FROM string_test WHERE (string_test.name LIKE '%Test%')
SELECT * FROM string_test WHERE (string_test.email IN ['user1@example.com', 'user2@example.com'])
Boolean Fields
Boolean fields generate boolean predicates:
rand_processes::{
boolean_test: rand_process::{
$arrival: HomogeneousPoisson:: { interarrival: seconds::1 },
$data: {
active: Bool,
verified: Bool::{ p: 0.8 },
premium: Bool::{ p: 0.1 }
}
}
}
Generated Queries:
SELECT * FROM boolean_test WHERE (boolean_test.active = true)
SELECT * FROM boolean_test WHERE (boolean_test.verified AND boolean_test.premium)
SELECT * FROM boolean_test WHERE (NOT boolean_test.active)
Null and Missing Value Queries
Testing Null Handling
When your data includes nullable fields, queries will test null handling:
rand_processes::{
nullable_test: rand_process::{
$arrival: HomogeneousPoisson:: { interarrival: seconds::1 },
$data: {
required_field: UUID::{ nullable: false },
nullable_field: UniformI32::{ nullable: 0.3, low: 1, high: 100 },
optional_field: UniformDecimal::{ optional: 0.2, low: 0.0, high: 1000.0 }
}
}
}
Generated Queries:
SELECT * FROM nullable_test WHERE (nullable_test.nullable_field IS NOT NULL)
SELECT * FROM nullable_test WHERE (nullable_test.optional_field IS MISSING)
SELECT * FROM nullable_test WHERE (nullable_test.required_field IS NOT NULL AND nullable_test.nullable_field > 50)
Focusing on Null/Missing Tests
# Generate queries focused on null/missing testing
beamline query basic \
--seed 500 \
--start-auto \
--script-path nullable_data.ion \
--sample-count 5 \
rand-select-all-fw \
--tbl-flt-rand-min 1 \
--tbl-flt-rand-max 2 \
--pred-absent # Only IS NULL, IS NOT NULL, IS MISSING, IS NOT MISSING
Progressive Query Complexity
Start Simple
# Begin with single predicates
beamline query basic \
--seed 1 \
--start-auto \
--script-path data.ion \
--sample-count 5 \
rand-select-all-fw \
--tbl-flt-rand-min 1 \
--tbl-flt-rand-max 1 \
--pred-eq
Add More Predicates
# Increase to 2-3 predicates
beamline query basic \
--seed 1 \
--start-auto \
--script-path data.ion \
--sample-count 5 \
rand-select-all-fw \
--tbl-flt-rand-min 2 \
--tbl-flt-rand-max 3 \
--pred-comparison
Enable All Predicates
# Use all available predicates for full complexity
beamline query basic \
--seed 1 \
--start-auto \
--script-path data.ion \
--sample-count 5 \
rand-select-all-fw \
--tbl-flt-rand-min 1 \
--tbl-flt-rand-max 5 \
--pred-all
Examples
Single Predicate Queries
From the README example:
beamline query basic \
--seed 1234 \
--start-auto \
--script-path simple_transactions.ion \
--sample-count 3 \
rand-select-all-fw \
--tbl-flt-rand-min 1 \
--tbl-flt-rand-max 1 \
--tbl-flt-path-depth-max 1 \
--tbl-flt-pathstep-internal-all \
--tbl-flt-pathstep-final-project \
--tbl-flt-type-final-scalar \
--pred-lt
Actual Output:
SELECT * FROM test_data AS test_data WHERE (test_data.marketplace_id < -5)
SELECT * FROM test_data AS test_data WHERE (test_data.price < 18.418581624952935)
SELECT * FROM test_data AS test_data WHERE (test_data.price < 15.495327785402296)
Multiple Predicate Queries
From the README example with more complex predicates:
beamline query basic \
--seed 1234 \
--start-auto \
--script-path simple_transactions.ion \
--sample-count 3 \
rand-select-all-fw \
--tbl-flt-rand-min 3 \
--tbl-flt-rand-max 10 \
--tbl-flt-path-depth-max 1 \
--tbl-flt-pathstep-internal-all \
--tbl-flt-pathstep-final-project \
--tbl-flt-type-final-all \
--pred-all
Actual Output:
SELECT * FROM test_data AS test_data WHERE (test_data.country_code IN [
'Graecos quidem legendos.',
'Possit et sine.'
] OR (NOT ((test_data.description IS MISSING)) OR
(test_data.description IS MISSING)))
SELECT * FROM test_data AS test_data WHERE (((test_data.transaction_id IS NULL)
AND (test_data.created_at IS NULL)) OR (((test_data.completed IN [
false,
false
] OR NOT ((test_data.completed IS NULL))) AND
((NOT ((test_data.price IS NULL)) OR
(test_data.transaction_id LIKE 'Vidisse.' AND
(test_data.country_code IS NULL))) AND
NOT ((test_data.description IS MISSING)))) OR
(test_data.description <> 'Nec vero.')))
SELECT * FROM test_data AS test_data
WHERE (((((test_data.country_code <> 'Qua maxime ceterorum.') AND
(NOT (test_data.completed IN [ false, true, true ]) OR
(test_data.description = 'Non faciant.'))) AND
(NOT ((test_data.price IS MISSING)) AND (test_data.price IS MISSING))) OR
test_data.price IN [
-47.936734585045905,
-0.8509689800217544,
24.263479438050297,
-48.953369038690255
]) OR ((test_data.created_at = UTCNOW()) OR
(NOT ((test_data.country_code IS MISSING)) AND
(test_data.description IS MISSING))))
Specific Predicate Types
Comparison Predicates
# Only numeric comparisons
beamline query basic \
--seed 100 \
--start-auto \
--script-path numeric_data.ion \
--sample-count 5 \
rand-select-all-fw \
--tbl-flt-rand-min 1 \
--tbl-flt-rand-max 1 \
--pred-comparison # <, <=, >, >=, =, <>
Example Results:
SELECT * FROM test_data WHERE (test_data.price >= 50.0)
SELECT * FROM test_data WHERE (test_data.count <= 500)
SELECT * FROM test_data WHERE (test_data.score <> 75.5)
String Pattern Matching
# Focus on LIKE predicates
beamline query basic \
--seed 200 \
--start-auto \
--script-path text_data.ion \
--sample-count 5 \
rand-select-all-fw \
--tbl-flt-rand-min 1 \
--tbl-flt-rand-max 1 \
--pred-like
Example Results:
SELECT * FROM test_data WHERE (test_data.description LIKE '%lorem%')
SELECT * FROM test_data WHERE (test_data.country_code LIKE 'U_')
Set Membership
# Use IN and NOT IN predicates
beamline query basic \
--seed 300 \
--start-auto \
--script-path categorical_data.ion \
--sample-count 5 \
rand-select-all-fw \
--tbl-flt-rand-min 1 \
--tbl-flt-rand-max 1 \
--pred-in
Example Results:
SELECT * FROM test_data WHERE (test_data.status IN ['active', 'pending'])
SELECT * FROM test_data WHERE (test_data.category IN ['electronics', 'books', 'clothing'])
Reproducible Query Testing
Test Suite Generation
Create consistent test suites:
#!/bin/bash
# Generate reproducible query test suite
SCRIPT="test_schema.ion"
BASE_SEED=12345
# Simple queries for basic functionality
beamline query basic \
--seed $BASE_SEED \
--start-auto \
--script-path $SCRIPT \
--sample-count 10 \
rand-select-all-fw \
--tbl-flt-rand-min 1 \
--tbl-flt-rand-max 1 \
--pred-eq > basic_equality.sql
# Comparison queries for numeric testing
beamline query basic \
--seed $((BASE_SEED + 1)) \
--start-auto \
--script-path $SCRIPT \
--sample-count 10 \
rand-select-all-fw \
--tbl-flt-rand-min 1 \
--tbl-flt-rand-max 1 \
--pred-comparison > numeric_comparisons.sql
# Complex queries for comprehensive testing
beamline query basic \
--seed $((BASE_SEED + 2)) \
--start-auto \
--script-path $SCRIPT \
--sample-count 15 \
rand-select-all-fw \
--tbl-flt-rand-min 2 \
--tbl-flt-rand-max 5 \
--pred-all > complex_queries.sql
echo "Generated test suite:"
echo "- basic_equality.sql: $(wc -l < basic_equality.sql) queries"
echo "- numeric_comparisons.sql: $(wc -l < numeric_comparisons.sql) queries"
echo "- complex_queries.sql: $(wc -l < complex_queries.sql) queries"
Regression Testing
# Generate baseline queries
beamline query basic \
--seed 999 \
--start-auto \
--script-path stable_schema.ion \
--sample-count 20 \
rand-select-all-fw \
--tbl-flt-rand-min 1 \
--tbl-flt-rand-max 3 \
--pred-all > baseline_queries.sql
# Later: regenerate with same seed to verify no regressions
beamline query basic \
--seed 999 \
--start-auto \
--script-path stable_schema.ion \
--sample-count 20 \
rand-select-all-fw \
--tbl-flt-rand-min 1 \
--tbl-flt-rand-max 3 \
--pred-all > regression_test_queries.sql
# Verify identical output
diff baseline_queries.sql regression_test_queries.sql
Common Query Patterns
Data Validation Queries
Generate queries that validate data constraints:
# Focus on range and constraint validation
beamline query basic \
--seed 600 \
--start-auto \
--script-path validation_schema.ion \
--sample-count 10 \
rand-select-all-fw \
--pred-comparison --pred-between
Example Validation Queries:
SELECT * FROM test_data WHERE (test_data.age BETWEEN 0 AND 120)
SELECT * FROM test_data WHERE (test_data.price > 0)
SELECT * FROM test_data WHERE (test_data.email IS NOT NULL)
Performance Baseline Queries
Create simple queries for performance baseline:
# Simple queries for baseline performance measurement
beamline query basic \
--seed 700 \
--start-auto \
--script-path performance_data.ion \
--sample-count 25 \
rand-select-all-fw \
--tbl-flt-rand-min 1 \
--tbl-flt-rand-max 1 \
--pred-eq --pred-lt --pred-gt
Integration with Data Generation
Complete Workflow
#!/bin/bash
# Complete data + query generation workflow
SCRIPT="customer_transactions.ion"
SEED=12345
echo "Generating test data..."
beamline gen data \
--seed $SEED \
--start-auto \
--script-path $SCRIPT \
--sample-count 1000 \
--output-format ion-pretty > test_data.ion
echo "Generating basic queries..."
beamline query basic \
--seed $((SEED + 1)) \
--start-auto \
--script-path $SCRIPT \
--sample-count 15 \
rand-select-all-fw \
--tbl-flt-rand-min 1 \
--tbl-flt-rand-max 2 \
--pred-all > basic_test_queries.sql
echo "Generated $(wc -l < test_data.ion) data records and $(wc -l < basic_test_queries.sql) test queries"
# Test first few queries (example)
head -5 basic_test_queries.sql | while IFS= read -r query; do
echo "Query: $query"
# your-partiql-engine --query "$query" --data test_data.ion
done
Best Practices
1. Start with Simple Configurations
# Begin testing with minimal complexity
beamline query basic \
--seed 1 \
--start-auto \
--script-path new_schema.ion \
--sample-count 3 \
rand-select-all-fw \
--tbl-flt-rand-min 1 \
--tbl-flt-rand-max 1 \
--pred-eq
2. Match Predicates to Data Types
# For numeric-heavy data
--pred-comparison --pred-between
# For string-heavy data
--pred-like --pred-eq --pred-in
# For boolean data
--pred-eq --pred-logical-not
3. Test Query Coverage
# Generate enough queries to cover different data patterns
beamline query basic \
--seed 100 \
--start-auto \
--script-path comprehensive_data.ion \
--sample-count 50 \
rand-select-all-fw \
--tbl-flt-rand-min 1 \
--tbl-flt-rand-max 4 \
--pred-all
4. Validate Against Real Data
# Always test generated queries work with generated data
beamline gen data --seed 1 --start-auto --script-path schema.ion --sample-count 100 > data.ion
beamline query basic --seed 2 --start-auto --script-path schema.ion --sample-count 5 rand-select-all-fw --pred-all > queries.sql
# Validate each query
# for query in $(cat queries.sql); do
# your-partiql-engine --validate "$query"
# done
Next Steps
Now that you understand basic query generation:
- Advanced Patterns - Complex queries with projections and exclusions
- Parameterization - Complete guide to all query generation options
- CLI Query Commands - Detailed CLI reference with all parameters