Shape Commands
The beamline infer-shape command analyzes an Ion script and infers the schema of each dataset it defines, without generating the data itself. This is useful for understanding data structures, creating database schemas, and validating script configurations.
Command Syntax
beamline infer-shape [OPTIONS]
Required Options
Shape inference uses the same core configuration as data generation:
Seed Configuration (Required - choose one)
--seed-auto # Generate random seed automatically
--seed <SEED> # Use specific numeric seed for reproducibility
Start Time Configuration (Required - choose one)
--start-auto # Generate random start time
--start-epoch-ms <EPOCH_MS> # Use Unix timestamp in milliseconds
--start-iso <ISO_8601> # Use ISO 8601 format (e.g., 2024-01-01T00:00:00Z)
Script Configuration (Required - choose one)
--script-path <PATH> # Path to Ion script file
--script <SCRIPT_DATA> # Inline Ion script content
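Each of the three groups above needs one option. As a minimal sketch, a wrapper script could check this before invoking the command. The helper names count_matches and check_required_opts are hypothetical and not part of beamline, and the check assumes the options are passed as separate words (for example --seed 1 rather than --seed=1) and that the options within a group are mutually exclusive:

```shell
# Hypothetical pre-flight helpers (not part of beamline).

# Usage: count_matches "<space-separated option names>" args...
# Prints how many of the args are exactly one of the listed option names.
count_matches() {
  local group="$1" n=0 arg opt
  shift
  for arg in "$@"; do
    for opt in $group; do
      if [ "$arg" = "$opt" ]; then
        n=$((n + 1))
      fi
    done
  done
  echo "$n"
}

# Succeeds only if args contain exactly one option from each required group.
check_required_opts() {
  local ok=0
  [ "$(count_matches '--seed-auto --seed' "$@")" -eq 1 ] ||
    { echo "need exactly one seed option" >&2; ok=1; }
  [ "$(count_matches '--start-auto --start-epoch-ms --start-iso' "$@")" -eq 1 ] ||
    { echo "need exactly one start option" >&2; ok=1; }
  [ "$(count_matches '--script-path --script' "$@")" -eq 1 ] ||
    { echo "need exactly one script option" >&2; ok=1; }
  return "$ok"
}
```

A wrapper could call check_required_opts "$@" first and run beamline infer-shape "$@" only when it succeeds.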
Optional Parameters
Output Format
--output-format <FORMAT> # Shape output format (default: text)
Available formats:
- text - Human-readable debug format (default)
- basic-ddl - SQL DDL format for database schemas
- beamline-json - Beamline JSON format for testing
Nullability and Optionality
--default-nullable <true|false> # Set default nullability behavior
--pct-null <PERCENTAGE> # Fraction of values that are NULL (0.0-1.0)
--default-optional <true|false> # Set default optionality behavior
--pct-optional <PERCENTAGE> # Fraction of values that are MISSING (0.0-1.0)
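Because --pct-null and --pct-optional take a value between 0.0 and 1.0 rather than a 0-100 percentage, a calling script may want to reject values such as 25 before passing them on. A minimal sketch follows; is_fraction is a hypothetical helper, and treating both endpoints as valid is an assumption based on the documented range:

```shell
# Hypothetical helper (not part of beamline): succeed only if $1 is a plain
# decimal number in the closed range [0.0, 1.0].
is_fraction() {
  case "$1" in
    # Reject empty input, any character other than digits and dots,
    # more than one dot, and a lone dot.
    ''|*[!0-9.]*|*.*.*|.) return 1 ;;
  esac
  # Compare numerically in awk, since the shell has no float arithmetic.
  awk -v v="$1" 'BEGIN { exit !(v + 0 >= 0 && v + 0 <= 1) }'
}
```

For example, is_fraction 0.25 succeeds, while is_fraction 25 fails, which catches the common mistake of passing 25 to mean 25%.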
Output Formats
Text Format (Default)
Provides detailed type information in Rust debug format:
$ beamline infer-shape --seed-auto --start-auto --script-path sensors.ion
Seed: 17685918364143248531
Start: 2022-12-12T19:52:29.000000000Z
{
"sensors": PartiqlType(
Bag(
BagType {
element_type: PartiqlType(
Struct(
StructType {
constraints: {
Fields(
{
StructField {
name: "d",
ty: PartiqlType(
DecimalP(2, 0),
),
},
StructField {
name: "f",
ty: PartiqlType(
Float64,
),
},
StructField {
name: "i8",
ty: PartiqlType(
Int64,
),
},
},
),
},
},
),
),
},
),
),
}
Use Cases:
- Development and debugging
- Understanding complex data structures
- Validating script configurations
Basic DDL Format
Generates SQL DDL statements for database schema creation:
$ beamline infer-shape \
--seed 7844265201457918498 \
--start-auto \
--script-path sensors-nested.ion \
--output-format basic-ddl
-- Seed: 7844265201457918498
-- Start: 2024-01-01T06:53:06.000000000Z
-- Syntax: partiql_datatype_syntax.0.1
-- Dataset: sensors
"f" DOUBLE,
"i8" INT8,
"id" INT,
"sub" STRUCT<"f": DOUBLE,"o": INT8>,
"tick" INT8
Use Cases:
- Creating database tables
- Database schema documentation
- SQL migration scripts
- Data warehouse setup
Beamline JSON Format
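Structured format used by PartiQL testing tools. Note that the sample below uses unquoted field names and a partiql::shape::v0 annotation, so a strict JSON parser will not accept it as-is: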
$ beamline infer-shape \
--seed-auto \
--start-auto \
--script-path sensors.ion \
--output-format beamline-json
{
seed: -3711181901898679775,
start: 2022-05-22T13:49:57.000000000+00:00,
shapes: {
sensors: partiql::shape::v0::{
type: "bag",
items: {
type: "struct",
constraints: [
ordered,
closed
],
fields: [
{
name: "d",
type: "decimal(2, 0)"
},
{
name: "f",
type: "double"
},
{
name: "i8",
type: "int8"
},
{
name: "tick",
type: "int8"
},
{
name: "w",
type: "decimal(5, 4)"
}
]
}
}
}
}
Use Cases:
- PartiQL conformance testing
- Tool integration
- Automated schema validation
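For quick tool integration, field names can be pulled out of this output with plain text processing. The sketch below is a line-based filter, not a real parser: list_field_names is a hypothetical helper, and it assumes every field name sits on its own line in the form name: "<field>", as in the sample above. Fields of nested structs, if any, would be listed as well.

```shell
# Hypothetical helper (not part of beamline): print the field names found in
# beamline-json output read from stdin, one per line.
# Assumes each field name appears on its own line as: name: "<field>"
list_field_names() {
  sed -n 's/^[[:space:]]*name:[[:space:]]*"\([^"]*\)".*$/\1/p'
}
```

It could be used as beamline infer-shape --seed 1 --start-auto --script-path sensors.ion --output-format beamline-json | list_field_names.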
Examples
Basic Shape Inference
# Get basic shape information
beamline infer-shape \
--seed-auto \
--start-auto \
--script-path my_data.ion
# Get reproducible shape with specific seed
beamline infer-shape \
--seed 12345 \
--start-auto \
--script-path my_data.ion \
--output-format text
Database Schema Generation
# Generate SQL DDL for database creation
beamline infer-shape \
--seed 100 \
--start-auto \
--script-path ecommerce.ion \
--output-format basic-ddl > schema.sql
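# Use in database creation. The basic-ddl output is a bare column list per
# dataset, so wrap each list in a CREATE TABLE statement before loading it.
psql -d mydb -f schema.sql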
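The basic-ddl samples in this document consist of a "-- Dataset: <name>" comment followed by column definitions, not complete statements. One way to turn that into loadable SQL is sketched below. The helper ddl_to_create_table is hypothetical, and it assumes the output layout matches the samples here. Type names such as INT8, DOUBLE, or STRUCT<...> may still need mapping to whatever your database accepts.

```shell
# Hypothetical helper (not part of beamline): read basic-ddl output on stdin
# and wrap each dataset column list in a CREATE TABLE statement.
ddl_to_create_table() {
  awk '
    function close_table() { if (in_table) { print ");"; in_table = 0 } }
    /^-- Dataset: / {
      close_table()                      # finish the previous dataset, if any
      name = $0
      sub(/^-- Dataset: /, "", name)     # keep only the dataset name
      printf "CREATE TABLE \"%s\" (\n", name
      in_table = 1
      next
    }
    /^--/ { print; next }                # pass other header comments through
    /^[[:space:]]*$/ { next }            # skip blank lines
    in_table { print "  " $0 }           # indent column definitions
    END { close_table() }
  '
}
```

Usage: beamline infer-shape --seed 100 --start-auto --script-path ecommerce.ion --output-format basic-ddl | ddl_to_create_table > schema.sql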
Multiple Dataset Schemas
# Infer shapes for complex multi-dataset scripts
beamline infer-shape \
--seed 42 \
--start-auto \
--script-path client-service.ion \
--output-format basic-ddl
This outputs schemas for all datasets defined in the script:
-- Dataset: service
"Account" VARCHAR,
"Operation" VARCHAR,
"Program" VARCHAR,
"Request" VARCHAR,
"StartTime" TIMESTAMP,
"client" VARCHAR,
"success" BOOL
-- Dataset: client_0
"id" VARCHAR,
"request_id" VARCHAR,
"request_time" TIMESTAMP,
"success" BOOL
-- Dataset: client_1
"id" VARCHAR,
"request_id" VARCHAR,
"request_time" TIMESTAMP,
"success" BOOL
Schema with Nullability and Optionality
Configure how NULL and MISSING values are reflected in the schema:
# Schema with all types nullable and optional
beamline infer-shape \
--seed 1 \
--start-auto \
--script-path data.ion \
--default-nullable true \
--default-optional true \
--output-format basic-ddl
# Output includes nullable/optional markers
"age" OPTIONAL TINYINT,
"name" OPTIONAL VARCHAR NULL,
"active" OPTIONAL BOOL
Schema Validation Workflow
Use shape inference to validate scripts before large data generation:
# 1. Validate script syntax and structure
beamline infer-shape \
--seed-auto \
--start-auto \
--script-path new_script.ion
# 2. Generate SQL schema
beamline infer-shape \
--seed 1 \
--start-auto \
--script-path new_script.ion \
--output-format basic-ddl > schema.sql
# 3. Generate small sample to verify
beamline gen data \
--seed 1 \
--start-auto \
--script-path new_script.ion \
--sample-count 5
# 4. Generate full dataset
beamline gen data \
--seed 1 \
--start-auto \
--script-path new_script.ion \
--sample-count 100000
Integration Patterns
Database Schema Creation
#!/bin/bash
# generate-database-schema.sh
SCRIPT="$1"
OUTPUT_DIR="./schemas"
if [ -z "$SCRIPT" ]; then
echo "Usage: $0 <script.ion>"
exit 1
fi
mkdir -p "$OUTPUT_DIR"
# Generate DDL schema
echo "Generating database schema for $SCRIPT..."
beamline infer-shape \
--seed 1 \
--start-auto \
--script-path "$SCRIPT" \
--output-format basic-ddl > "$OUTPUT_DIR/$(basename "$SCRIPT" .ion).sql"
# Generate Beamline JSON for testing
beamline infer-shape \
--seed 1 \
--start-auto \
--script-path "$SCRIPT" \
--output-format beamline-json > "$OUTPUT_DIR/$(basename "$SCRIPT" .ion).json"
echo "Schemas generated in $OUTPUT_DIR/"
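The script above writes one .sql file per Ion script, so a script that defines several datasets produces a single file containing all of them. If one file per dataset is preferred, the output can be split on its "-- Dataset:" comment lines. This is a sketch under assumptions: split_datasets and the .ddl suffix are hypothetical, and dataset names are assumed to be safe to use as file names.

```shell
# Hypothetical helper (not part of beamline).
# Usage: split_datasets <out_dir> < basic-ddl-output
# Writes the column list of each dataset to <out_dir>/<dataset>.ddl.
split_datasets() {
  local out_dir="$1"
  mkdir -p "$out_dir"
  awk -v dir="$out_dir" '
    /^-- Dataset: / {
      name = $0
      sub(/^-- Dataset: /, "", name)
      file = dir "/" name ".ddl"
      printf "" > file                   # create or truncate the target file
      next
    }
    /^--/ { next }                       # ignore Seed/Start/Syntax headers
    file != "" && NF { print > file }    # awk keeps the file open, so this appends
  '
}
```

For example: split_datasets ./schemas/client-service < ./schemas/client-service.sql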
CI/CD Schema Validation
#!/bin/bash
# validate-schemas.sh - CI pipeline script
set -e
echo "Validating Ion scripts..."
for script in scripts/*.ion; do
echo "Checking $script..."
# Validate script can generate valid schema
if ! beamline infer-shape \
--seed 1 \
--start-auto \
--script-path "$script" \
--output-format text > /dev/null; then
echo "ERROR: Invalid script $script"
exit 1
fi
echo "✓ $script is valid"
done
echo "All scripts validated successfully!"
Documentation Generation
# Generate documentation for all data scripts
: > SCHEMAS.md # Start from an empty file so reruns do not append duplicate sections
for script in data_scripts/*.ion; do
name=$(basename "$script" .ion)
echo "## $name Dataset" >> SCHEMAS.md
echo '```sql' >> SCHEMAS.md
beamline infer-shape \
--seed 1 \
--start-auto \
--script-path "$script" \
--output-format basic-ddl >> SCHEMAS.md
echo '```' >> SCHEMAS.md
echo "" >> SCHEMAS.md
done
Error Handling
Common Errors
Script Syntax Errors
$ beamline infer-shape --seed-auto --start-auto --script-path invalid.ion
Error: Failed to parse Ion script: Invalid Ion syntax at line 5, column 10
Missing Required Options
$ beamline infer-shape --script-path data.ion
Error: One of --seed-auto or --seed is required
Error: One of --start-auto, --start-epoch-ms, or --start-iso is required
Invalid Output Format
$ beamline infer-shape --seed-auto --start-auto --script-path data.ion --output-format invalid
Error: 'invalid' isn't a valid value for '--output-format <OUTPUT_FORMAT>'
Performance Considerations
Shape inference is fast because it only parses the script and infers types; it does not generate any data:
- Script Parsing: Milliseconds for typical scripts
- Type Inference: Nearly instantaneous
- Output Generation: Minimal overhead
This makes shape inference ideal for:
- Quick script validation
- CI/CD pipeline checks
- Interactive development workflows
- Documentation generation
Best Practices
1. Validate Scripts Early
# Always infer shape before generating large datasets
beamline infer-shape --seed 1 --start-auto --script-path new_script.ion
2. Use Appropriate Output Formats
# DDL for database work
beamline infer-shape --seed 1 --start-auto --script-path data.ion --output-format basic-ddl
# Text for debugging
beamline infer-shape --seed 1 --start-auto --script-path data.ion --output-format text
# JSON for automation
beamline infer-shape --seed 1 --start-auto --script-path data.ion --output-format beamline-json
3. Document Your Schemas
Save schema outputs for reference and version control:
beamline infer-shape \
--seed 1 \
--start-auto \
--script-path production_data.ion \
--output-format basic-ddl > docs/production_schema.sql
4. Use Consistent Seeds
For reproducible schema documentation:
# Always use seed 1 for schema documentation
beamline infer-shape --seed 1 --start-auto --script-path data.ion --output-format basic-ddl
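The basic-ddl output begins with "-- Seed:" and "-- Start:" comment lines. Since --start-auto is described as generating a random start time, the Start line may differ between runs even when the seed is fixed, which would make a saved schema file look changed. A minimal sketch for comparing a saved schema against a fresh one while ignoring those two header lines is shown below. The helpers strip_run_headers and schema_matches are hypothetical; pinning the start time with --start-iso or --start-epoch-ms is an alternative that avoids the issue at the source.

```shell
# Hypothetical helpers (not part of beamline).

# Drop the per-run "-- Seed:" and "-- Start:" header lines from stdin.
strip_run_headers() {
  grep -v -e '^-- Seed: ' -e '^-- Start: '
}

# Usage: schema_matches <saved.sql> <fresh.sql>
# Succeeds when both files are identical apart from the run headers.
schema_matches() {
  [ "$(strip_run_headers < "$1")" = "$(strip_run_headers < "$2")" ]
}
```

In a CI job this could be used as schema_matches docs/production_schema.sql fresh.sql || echo "schema changed".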
Next Steps
- Database Commands - Create complete databases with schemas
- Schema Guide - Learn about PartiQL type system
- Data Generation - Generate data matching your schemas