Partitioning controls how data is organized within the all.parquet/ directory for full_load_append and append strategies. It determines the subdirectory structure using Hive-style key=value paths.
For full_load, partitioning is ignored.
Partitioning modes #
The partitioning config value determines the mode:
| Value | Mode | Description |
|---|---|---|
| true (default) | Timestamp | Partitions by write timestamp |
| false | Disabled | Flat file layout (UUID-only paths) |
| "fieldName" | Field | Partitions by a record field's date value |
| "field:component/..." | Custom | Multi-segment partitioning with date component extraction |
Timestamp mode #
When partitioning: true (the default), each write is partitioned by the current timestamp. The pipeline automatically injects load_timestamp, load_timestamp_year, and load_timestamp_month columns into the records and schema.
all.parquet/year=2025/month=01/day=15/<uuid>.parquet
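As a rough illustration of how a timestamp path like the one above could be assembled, here is a minimal sketch. The function name `timestampPartitionPath`, the union used for `PartitioningFormat`, and the choice of UTC components are assumptions for illustration, not the pipeline's actual implementation.

```typescript
// Assumed definition: the three granularities that partitioningFormat accepts.
type PartitioningFormat = "year" | "year/month" | "year/month/day";

// Hypothetical helper. Assumes UTC and zero-padded month/day values.
function timestampPartitionPath(
  now: Date,
  format: PartitioningFormat,
  uuid: string,
): string {
  const values: Record<string, string> = {
    year: String(now.getUTCFullYear()),
    month: String(now.getUTCMonth() + 1).padStart(2, "0"),
    day: String(now.getUTCDate()).padStart(2, "0"),
  };
  // Each format part becomes one Hive-style key=value directory.
  const segments = format.split("/").map((part) => `${part}=${values[part]}`);
  return ["all.parquet", ...segments, `${uuid}.parquet`].join("/");
}
```

Passing the UUID in as an argument keeps the sketch deterministic; a real writer would generate it per file.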
The partitioningFormat controls granularity:
| Format | Example path |
|---|---|
| "year" | year=2025/<uuid>.parquet |
| "year/month" | year=2025/month=01/<uuid>.parquet |
| "year/month/day" | year=2025/month=01/day=15/<uuid>.parquet |
Disabled mode #
When partitioning: false, files are placed directly in all.parquet/ with a flat UUID-based path:
all.parquet/<uuid>.parquet
Field mode #
When partitioning is a simple field name (e.g. "event_date"), the pipeline extracts the date value from each record's field and partitions accordingly:
{ "partitioning": "event_date", "partitioningFormat": "year/month" }
For a record with event_date: "2025-03-20":
all.parquet/year=2025/month=03/<uuid>.parquet
Records are grouped by partition key: all records that resolve to the same partition path are written to the same Parquet file.
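The grouping step can be pictured with a small sketch. The names `Row`, `partitionKey`, and `groupByPartition` are hypothetical, the format is fixed to "year/month", and no validation is performed; the real pipeline may differ on all of these points.

```typescript
type Row = Record<string, unknown>;

// Hypothetical helper: derives "year=YYYY/month=MM" from an ISO date string
// by position, assuming the value starts with "YYYY-MM-DD".
function partitionKey(value: string): string {
  const datePart = value.slice(0, 10);
  return `year=${datePart.slice(0, 4)}/month=${datePart.slice(5, 7)}`;
}

// Buckets rows by partition key; each bucket would become one Parquet file.
function groupByPartition(rows: Row[], field: string): Map<string, Row[]> {
  const groups = new Map<string, Row[]>();
  for (const row of rows) {
    const key = partitionKey(String(row[field]));
    const bucket = groups.get(key) ?? [];
    bucket.push(row);
    groups.set(key, bucket);
  }
  return groups;
}
```

With this sketch, two records dated in March 2025 land in the same bucket, while one dated in April 2025 gets its own.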
If a record's partition field is missing, null, or invalid, a PartitionFieldError is thrown.
Custom mode #
Custom partitioning allows multi-segment paths with field extraction and date component parsing. The format uses / to separate segments and : to specify a date component:
{ "partitioning": "region/event_date:year/event_date:month" }
For a record with region: "eu-west-1" and event_date: "2025-03-20":
all.parquet/region=eu-west-1/year=2025/month=03/<uuid>.parquet
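To make the mapping from format string to path concrete, the following sketch builds the partition directory for one record. `customPartitionPath` and `DATE_SLICES` are hypothetical names, and only the year, month, and day components are handled, read by position from an ISO date string.

```typescript
// Character ranges of each component within "YYYY-MM-DD" (assumed input shape).
const DATE_SLICES: Record<string, [number, number]> = {
  year: [0, 4],
  month: [5, 7],
  day: [8, 10],
};

// Hypothetical helper: turns "region/event_date:year" into "region=.../year=...".
function customPartitionPath(
  record: Record<string, unknown>,
  format: string,
): string {
  const parts = format.split("/").map((segment) => {
    const colon = segment.indexOf(":");
    if (colon === -1) {
      // Plain segment: use the raw field value.
      return `${segment}=${String(record[segment])}`;
    }
    const fieldName = segment.slice(0, colon);
    const component = segment.slice(colon + 1);
    const range = DATE_SLICES[component];
    if (range === undefined) {
      throw new Error(`Unsupported component: ${component}`);
    }
    return `${component}=${String(record[fieldName]).slice(range[0], range[1])}`;
  });
  return parts.join("/");
}
```

Note that the directory key for a date segment is the component name (year, month), not the field name, which matches the example path above.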
Segment types #
| Segment | Description | Example output |
|---|---|---|
| fieldName | Raw field value | customer_id=acme |
| fieldName:year | Year from ISO date | year=2025 |
| fieldName:month | Month from ISO date | month=03 |
| fieldName:day | Day from ISO date | day=20 |
| fieldName:hour | Hour from ISO datetime | hour=14 |
| fieldName:minute | Minute from ISO datetime | minute=30 |
| fieldName:second | Second from ISO datetime | second=45 |
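One way to extract all six components listed above is a single regular expression over the ISO string. This is a sketch under the assumption that values look like "YYYY-MM-DD" or "YYYY-MM-DDTHH:mm:ss"; `extractComponent`, `ISO_PATTERN`, and `ORDER` are illustrative names, and the library's own parsing rules are not documented here.

```typescript
type DateComponent = "year" | "month" | "day" | "hour" | "minute" | "second";

// The time part is optional, so date-only values still match.
const ISO_PATTERN = /^(\d{4})-(\d{2})-(\d{2})(?:[T ](\d{2}):(\d{2}):(\d{2}))?/;
const ORDER: DateComponent[] = ["year", "month", "day", "hour", "minute", "second"];

// Returns the zero-padded component, or undefined when it cannot be read.
function extractComponent(
  iso: string,
  component: DateComponent,
): string | undefined {
  const match = ISO_PATTERN.exec(iso);
  if (match === null) return undefined;
  // Capture groups 1..6 line up with ORDER; time groups are undefined
  // for date-only input.
  return match[ORDER.indexOf(component) + 1];
}
```

Returning the matched text rather than a number preserves the zero padding seen in outputs such as month=03.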
Related types #
| Property | Type |
|---|---|
| mode | PartitionMode |
| format | PartitioningFormat |
| fieldName? | string |
| formatString? | string |
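The relationship between the partitioning config value and these properties might look like the sketch below. `ResolvedPartitioning` and `resolvePartitioning` are hypothetical names, the format property is left out because its type definition is not shown here, and the rule that a "/" or ":" marks a custom format string is an assumption.

```typescript
// Hypothetical shape: property names follow the table above.
interface ResolvedPartitioning {
  mode: "disabled" | "timestamp" | "field" | "custom";
  fieldName?: string;
  formatString?: string;
}

function resolvePartitioning(
  partitioning: boolean | string = true,
): ResolvedPartitioning {
  if (typeof partitioning === "boolean") {
    return { mode: partitioning ? "timestamp" : "disabled" };
  }
  // Assumption: any "/" or ":" means a custom format string.
  if (partitioning.indexOf("/") !== -1 || partitioning.indexOf(":") !== -1) {
    return { mode: "custom", formatString: partitioning };
  }
  // Otherwise the string is treated as a simple field name.
  return { mode: "field", fieldName: partitioning };
}
```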
Type
"disabled" | "timestamp" | "field" | "custom"
PartitionFieldError is the error thrown when a record’s partition field is missing, null, or invalid.
Constructor
new PartitionFieldError(fieldName, reason, recordIndex, value?)
Parameters
| Parameter | Type | Default Value |
|---|---|---|
| fieldName | string | — |
| reason | "missing" \| "null" \| "invalid_date" | — |
| recordIndex | number | — |
| value? | unknown | — |
Properties
| Property | Type | Default Value | Modifiers |
|---|---|---|---|
| fieldName | string | — | readonly |
| reason | "missing" \| "null" \| "invalid_date" | — | readonly |
| recordIndex | number | — | readonly |
| value? | unknown | — | readonly |
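The sketch below shows how the three reasons might arise during validation. `PartitionFieldErrorSketch` is a local stand-in that mirrors the documented constructor and properties; it is not the library's class, and `requireDateField`, along with the exact conditions mapped to each reason, is an assumption.

```typescript
type Reason = "missing" | "null" | "invalid_date";

// Local stand-in mirroring the documented constructor signature.
class PartitionFieldErrorSketch extends Error {
  readonly fieldName: string;
  readonly reason: Reason;
  readonly recordIndex: number;
  readonly value?: unknown;

  constructor(fieldName: string, reason: Reason, recordIndex: number, value?: unknown) {
    super(`Record ${recordIndex}: partition field "${fieldName}" is ${reason}`);
    this.name = "PartitionFieldErrorSketch";
    this.fieldName = fieldName;
    this.reason = reason;
    this.recordIndex = recordIndex;
    this.value = value;
  }
}

// Hypothetical check: returns the field value if it is a parseable date string.
function requireDateField(
  record: Record<string, unknown>,
  fieldName: string,
  recordIndex: number,
): string {
  const value = record[fieldName];
  if (value === undefined) {
    throw new PartitionFieldErrorSketch(fieldName, "missing", recordIndex);
  }
  if (value === null) {
    throw new PartitionFieldErrorSketch(fieldName, "null", recordIndex, value);
  }
  if (typeof value !== "string" || isNaN(Date.parse(value))) {
    throw new PartitionFieldErrorSketch(fieldName, "invalid_date", recordIndex, value);
  }
  return value;
}
```

Carrying recordIndex and value on the error lets a caller report exactly which record failed and why.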
Extends
Error
A parsed segment from a custom partition format string.
Properties
| Property | Type | Modifiers | Description |
|---|---|---|---|
| fieldName | string | — | The field name to extract from the record. |
| component? | "year" \| "month" \| "day" \| "hour" \| "minute" \| "second" | — | If set, extract this date component from the field’s ISO date value. |
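Parsing a custom format string into segments of this shape could be sketched as follows. The names `Segment`, `SegmentComponent`, and `parseFormat` are hypothetical, and rejecting unknown components with a plain Error is an assumption about behavior the document does not describe.

```typescript
type SegmentComponent = "year" | "month" | "day" | "hour" | "minute" | "second";

// Mirrors the two documented properties of a parsed segment.
interface Segment {
  fieldName: string;
  component?: SegmentComponent;
}

const COMPONENTS: readonly string[] = ["year", "month", "day", "hour", "minute", "second"];

// Splits on "/" for segments and on the first ":" for the date component.
function parseFormat(format: string): Segment[] {
  return format.split("/").map((part): Segment => {
    const colon = part.indexOf(":");
    if (colon === -1) return { fieldName: part };
    const component = part.slice(colon + 1);
    if (COMPONENTS.indexOf(component) === -1) {
      throw new Error(`Unknown date component: ${component}`);
    }
    return {
      fieldName: part.slice(0, colon),
      component: component as SegmentComponent,
    };
  });
}
```

For "region/event_date:year/event_date:month" this yields three segments: one raw field and two date components taken from the same field.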