Partitioning

How data is partitioned into Hive-style paths within the all.parquet/ directory.

Partitioning controls how data is organized within the all.parquet/ directory for full_load_append and append strategies. It determines the subdirectory structure using Hive-style key=value paths.

Note: partitioning applies only to the full_load_append and append strategies. With full_load, the partitioning setting is ignored.

Partitioning modes

The partitioning config value determines the mode:

Value                  Mode       Description
true (default)         Timestamp  Partitions by write timestamp
false                  Disabled   Flat file layout (UUID-only paths)
"fieldName"            Field      Partitions by a record field's date value
"field:component/..."  Custom     Multi-segment partitioning with date component extraction

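The mapping in the table above can be sketched as a small classifier. This is a minimal illustration, not LakeQL's implementation: resolvePartitionMode is a hypothetical helper, and treating any string that contains "/" or ":" as a custom format string is an assumption based on the examples on this page.

```typescript
// Hypothetical helper: not part of LakeQL's documented API.
type PartitionMode = "disabled" | "timestamp" | "field" | "custom";

function resolvePartitionMode(partitioning?: boolean | string): PartitionMode {
  // Omitted or true selects timestamp mode (the documented default).
  if (partitioning === undefined || partitioning === true) return "timestamp";
  // false selects the flat, UUID-only layout.
  if (partitioning === false) return "disabled";
  // Assumption: "/" or ":" marks a custom format string;
  // any other string is treated as a plain field name.
  return /[/:]/.test(partitioning) ? "custom" : "field";
}
```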
Timestamp mode

When partitioning: true (the default), each write is partitioned by the current timestamp. The pipeline automatically injects load_timestamp, load_timestamp_year, and load_timestamp_month columns into the records and schema.

all.parquet/year=2025/month=01/day=15/<uuid>.parquet
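A path like the one above could be derived from the write time roughly as follows. This is a sketch under stated assumptions: timestampPartitionPath is a hypothetical name, the three granularities are the partitioningFormat values "year", "year/month", and "year/month/day", and the use of UTC is a guess, since this page does not say which time zone the pipeline uses.

```typescript
// Hypothetical helper: illustrates Hive-style key=value paths for a write time.
type PartitioningFormat = "year" | "year/month" | "year/month/day";

function timestampPartitionPath(now: Date, format: PartitioningFormat): string {
  // Assumption: UTC components; month and day are zero-padded to two digits.
  const values: Record<string, string> = {
    year: String(now.getUTCFullYear()),
    month: String(now.getUTCMonth() + 1).padStart(2, "0"),
    day: String(now.getUTCDate()).padStart(2, "0"),
  };
  // Each part of the format becomes one key=value path segment.
  return format
    .split("/")
    .map((part) => `${part}=${values[part]}`)
    .join("/");
}
```

For a write at 2025-01-15 UTC, calling this with "year/month/day" yields year=2025/month=01/day=15, matching the example path.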

The partitioningFormat controls granularity:

Format            Example path
"year"            year=2025/<uuid>.parquet
"year/month"      year=2025/month=01/<uuid>.parquet
"year/month/day"  year=2025/month=01/day=15/<uuid>.parquet

Disabled mode

When partitioning: false, files are placed directly in all.parquet/ with a flat UUID-based path:

all.parquet/<uuid>.parquet

Field mode

When partitioning is a simple field name (e.g. "event_date"), the pipeline reads the date value from that field in each record and partitions accordingly:

{ "partitioning": "event_date", "partitioningFormat": "year/month" }

For a record with event_date: "2025-03-20":

all.parquet/year=2025/month=03/<uuid>.parquet

Records are grouped by partition key: all records that resolve to the same partition path are written to the same Parquet file.

If a record is missing the partition field, has a null value, or contains an unparseable date, the pipeline throws a PartitionFieldError.
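The grouping and validation described above can be sketched as follows. This is an illustration only: groupByPartitionPath is a hypothetical function, the PartitionFieldError class here is a local stand-in that mirrors the constructor documented under Related types, the "year/month" granularity is hard-coded, and parsing with the built-in Date in UTC is an assumption.

```typescript
type PartitionFieldReason = "missing" | "null" | "invalid_date";

// Local stand-in mirroring the documented constructor signature.
class PartitionFieldError extends Error {
  readonly fieldName: string;
  readonly reason: PartitionFieldReason;
  readonly recordIndex: number;
  readonly value?: unknown;

  constructor(fieldName: string, reason: PartitionFieldReason, recordIndex: number, value?: unknown) {
    super(`Partition field "${fieldName}" is ${reason} (record ${recordIndex})`);
    // Keeps instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, PartitionFieldError.prototype);
    this.name = "PartitionFieldError";
    this.fieldName = fieldName;
    this.reason = reason;
    this.recordIndex = recordIndex;
    this.value = value;
  }
}

type Row = Record<string, unknown>;

// Hypothetical helper: groups records by a year/month partition path.
function groupByPartitionPath(records: Row[], field: string): Map<string, Row[]> {
  const groups = new Map<string, Row[]>();
  records.forEach((record, index) => {
    const value = record[field];
    if (value === undefined) throw new PartitionFieldError(field, "missing", index);
    if (value === null) throw new PartitionFieldError(field, "null", index, value);
    const date = new Date(String(value));
    if (Number.isNaN(date.getTime())) {
      throw new PartitionFieldError(field, "invalid_date", index, value);
    }
    const month = String(date.getUTCMonth() + 1).padStart(2, "0");
    const key = `year=${date.getUTCFullYear()}/month=${month}`;
    // Records sharing a key end up in the same group (one Parquet file each).
    const group = groups.get(key) ?? [];
    group.push(record);
    groups.set(key, group);
  });
  return groups;
}
```

A caller can catch the error and inspect fieldName, reason, and recordIndex to report which record broke the write.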

Custom mode

Custom partitioning allows multi-segment paths with field extraction and date component parsing. The format uses / to separate segments and : to specify a date component:

{ "partitioning": "region/event_date:year/event_date:month" }

For a record with region: "eu-west-1" and event_date: "2025-03-20":

all.parquet/region=eu-west-1/year=2025/month=03/<uuid>.parquet
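The format string syntax can be parsed with a few lines of code. The sketch below is illustrative: parseCustomFormat and Segment are hypothetical names rather than LakeQL exports, and the validation rules (rejecting empty field names, extra colons, and unknown components) are assumptions.

```typescript
type DateComponent = "year" | "month" | "day" | "hour" | "minute" | "second";

// Hypothetical shape for one parsed segment of the format string.
interface Segment {
  fieldName: string;
  component?: DateComponent;
}

const DATE_COMPONENTS: readonly string[] = ["year", "month", "day", "hour", "minute", "second"];

function parseCustomFormat(format: string): Segment[] {
  // "/" separates segments; ":" separates a field name from its date component.
  return format.split("/").map((raw): Segment => {
    const [fieldName, component, ...rest] = raw.split(":");
    if (!fieldName || rest.length > 0) {
      throw new Error(`Invalid segment: "${raw}"`);
    }
    if (component === undefined) return { fieldName };
    if (DATE_COMPONENTS.indexOf(component) === -1) {
      throw new Error(`Unknown date component: "${component}"`);
    }
    return { fieldName, component: component as DateComponent };
  });
}
```

Parsing "region/event_date:year/event_date:month" gives three segments: a raw region field, then the year and month of event_date.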

Segment types

Segment           Description               Example output
fieldName         Raw field value           customer_id=acme
fieldName:year    Year from ISO date        year=2025
fieldName:month   Month from ISO date       month=03
fieldName:day     Day from ISO date         day=20
fieldName:hour    Hour from ISO datetime    hour=14
fieldName:minute  Minute from ISO datetime  minute=30
fieldName:second  Second from ISO datetime  second=45
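To show how the segment types above could turn into path parts, here is a minimal sketch. The names Segment, segmentToPathPart, and buildPartitionPath are hypothetical, error handling is omitted, and the code assumes date values are either date-only ISO strings or carry an explicit UTC offset, so that reading UTC components is safe.

```typescript
type DateComponent = "year" | "month" | "day" | "hour" | "minute" | "second";

// Hypothetical shape for one parsed segment.
interface Segment {
  fieldName: string;
  component?: DateComponent;
}

function segmentToPathPart(segment: Segment, record: Record<string, unknown>): string {
  const raw = String(record[segment.fieldName]);
  // Without a component, the raw field value is used as-is.
  if (!segment.component) return `${segment.fieldName}=${raw}`;
  const date = new Date(raw);
  const parts: Record<DateComponent, number> = {
    year: date.getUTCFullYear(),
    month: date.getUTCMonth() + 1,
    day: date.getUTCDate(),
    hour: date.getUTCHours(),
    minute: date.getUTCMinutes(),
    second: date.getUTCSeconds(),
  };
  // Two-digit padding leaves the four-digit year unchanged.
  const value = String(parts[segment.component]).padStart(2, "0");
  return `${segment.component}=${value}`;
}

function buildPartitionPath(segments: Segment[], record: Record<string, unknown>): string {
  return segments.map((segment) => segmentToPathPart(segment, record)).join("/");
}
```

With segments for region, event_date:year, and event_date:month, a record with region "eu-west-1" and event_date "2025-03-20" produces region=eu-west-1/year=2025/month=03.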

Related types

Property       Type
mode           PartitionMode
format         PartitioningFormat
fieldName?     string
formatString?  string

PartitionMode

Type: "disabled" | "timestamp" | "field" | "custom"

PartitionFieldError

Error thrown when a record’s partition field is missing, null, or invalid.

Constructor

new PartitionFieldError(fieldName, reason, recordIndex, value?)

Parameters

Parameter    Type                                 Default value
fieldName    string                               none
reason       "missing" | "null" | "invalid_date"  none
recordIndex  number                               none
value?       unknown                              none

Properties

Property     Type                                 Default value  Modifiers
fieldName    string                               none           readonly
reason       "missing" | "null" | "invalid_date"  none           readonly
recordIndex  number                               none           readonly
value?       unknown                              none           readonly

Extends

Error

A parsed segment from a custom partition format string.

Properties

Property    Type                                                     Modifiers
fieldName   string                                                   none
component?  "year" | "month" | "day" | "hour" | "minute" | "second"  none

fieldName: The field name to extract from the record.

component: If set, extract this date component from the field’s ISO date value.
