LakeQL
On this page

  1. API Reference
    1. @lakeql/adapters
      1. HiveTableManagerConfig (interface)
      2. HiveTableDefinition (interface)
      3. HiveTableManager (interface)
      4. createHiveTableManager (function)
      5. StorageType (type)
      6. StorageConfig (interface)
      7. S3Config (type)
      8. StorageOperations (interface)
      9. StorageError (class)
      10. createStorageOperations (function)
      11. TableDefinition (interface)
      12. AdapterConfig (interface)
      13. StorageAdapter (interface)
      14. LoadStrategy (type)
      15. WritePipelineConfig (interface)
      16. WritePipelineInput (interface)
      17. generatePartitionPath (function)
      18. generateFlatPath (function)
      19. PartitionMode (type)
      20. ResolvedPartitioning (interface)
      21. resolvePartitioningConfig (function)
      22. enrichJsonSchemaWithTimestamp (function)
      23. injectLoadTimestamp (function)
      24. PartitionFieldError (class)
      25. PartitionSegment (interface)
      26. parsePartitioningFormat (function)
      27. generateCustomPartitionPath (function)
      28. groupRecordsByCustomPartition (function)
      29. executeWritePipeline (function)
    2. @lakeql/adapters/trino-hive-s3
      1. TrinoHiveS3Config (interface)
      2. createTrinoHiveS3Adapter (function)

API Reference

Auto-generated reference for all exported types, interfaces, and functions from @lakeql/adapters.

@lakeql/adapters

HiveTableManagerConfig

Configuration for the Hive Table Manager.

Properties

Property   Type   Modifiers
client   TrinoClient   —

The Trino client instance to use for DDL operations.

bucket   string   —

S3 bucket name for external table locations.

HiveTableDefinition

Definition of a Hive external table to create.

Properties

Property   Type   Modifiers
catalog   string   —

The catalog name.

schema   string   —

The schema name.

tableName   string   —

The table name.

externalLocation   string   —

S3 location for the external table.

columns   Array<{ name: string; type: string; }>   —

SQL column definitions derived from the JSON Schema.

HiveTableManager

Manages Hive external table DDL operations (DROP + CREATE) for the mutation write pipeline.

Properties

Property   Type   Modifiers
recreateTable   (definition: HiveTableDefinition) => Promise<void>   —

Drops and recreates a single Hive external table. Executes DROP TABLE IF EXISTS followed by CREATE TABLE.

recreateTablePair   (latestDefinition: HiveTableDefinition, allDefinition: HiveTableDefinition) => Promise<void>   —

For full_load_append: manages both the _latest and _all tables. Creates both tables; if either creation fails, attempts a best-effort rollback by dropping both.

buildExternalLocation   (path: string) => string   —

Builds a properly formatted external location URI for Hive tables. Uses the s3a:// scheme required by Hive’s Hadoop FileSystem.
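
As an illustration of buildExternalLocation, here is a minimal sketch that assumes the URI is simply the configured bucket joined with the given path. The name buildExternalLocationSketch, the explicit bucket argument, and the slash trimming are assumptions made for this example, not the library's implementation.

```typescript
// Hypothetical sketch: joins a bucket and a path into an s3a:// URI.
// The slash trimming is an assumption, not documented LakeQL behavior.
function buildExternalLocationSketch(bucket: string, path: string): string {
  // Drop leading and trailing slashes so the URI never contains "//" after the bucket.
  const trimmed = path.replace(/^\/+|\/+$/g, "");
  return trimmed === "" ? `s3a://${bucket}` : `s3a://${bucket}/${trimmed}`;
}

console.log(buildExternalLocationSketch("analytics-lake", "/sales/orders/"));
// s3a://analytics-lake/sales/orders
```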

createHiveTableManager

Creates a HiveTableManager that executes DDL via the Trino client.

Parameters

Parameter   Type   Default Value
config   HiveTableManagerConfig   —

Returns

HiveTableManager

StorageType

Type

"s3" | "minio"

StorageConfig

Storage configuration for S3 or MinIO adapters.

Credentials are read internally from environment variables:

  • S3: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION, AWS_ENDPOINT_URL
  • MinIO: MINIO_ACCESS_KEY_ID, MINIO_SECRET_ACCESS_KEY (an explicit endpoint is required)

Properties

Property   Type   Modifiers
type   StorageType   —

Storage adapter type.

bucket   string   —

Bucket name.

region?   string   —

Region override. Falls back to the AWS_DEFAULT_REGION environment variable for S3.

endpoint?   string   —

Custom endpoint. Required for MinIO, optional for S3 (falls back to the AWS_ENDPOINT_URL environment variable).
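
The fallback rules for region and endpoint can be sketched as follows. This is an illustration under stated assumptions: resolveStorageSettings and ResolvedStorageSettings are hypothetical names, the environment is passed in explicitly instead of being read from the process, and the thrown message is invented for the example.

```typescript
type StorageType = "s3" | "minio";

interface StorageConfig {
  type: StorageType;
  bucket: string;
  region?: string;
  endpoint?: string;
}

// Hypothetical result shape, used only in this sketch.
interface ResolvedStorageSettings {
  region?: string;
  endpoint?: string;
}

// Hypothetical helper: mirrors the documented fallbacks, not LakeQL internals.
function resolveStorageSettings(
  config: StorageConfig,
  env: Record<string, string | undefined>,
): ResolvedStorageSettings {
  if (config.type === "minio") {
    // MinIO has no environment fallback for the endpoint, so it must be explicit.
    if (!config.endpoint) {
      throw new Error("MinIO storage requires an explicit endpoint");
    }
    return { region: config.region, endpoint: config.endpoint };
  }
  // S3: explicit values win, then the AWS_* environment variables.
  return {
    region: config.region ?? env["AWS_DEFAULT_REGION"],
    endpoint: config.endpoint ?? env["AWS_ENDPOINT_URL"],
  };
}

const settings = resolveStorageSettings(
  { type: "s3", bucket: "analytics-lake" },
  { AWS_DEFAULT_REGION: "eu-west-1" },
);
console.log(settings.region); // eu-west-1
```

In a Node.js application the second argument would typically be process.env.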

S3Config

Type

StorageConfig

StorageOperations

Storage operations interface for S3-compatible object stores.

Properties

Property   Type   Modifiers
upload   (buffer: Uint8Array, targetPath: string) => Promise<void>   —

Uploads a file buffer to the specified S3 path.

deletePrefix   (prefix: string) => Promise<void>   —

Deletes all objects under the given prefix.

StorageError

Error class for storage operation failures, providing path and operation context.

Constructor

new StorageError(message, path, operation, options?)

Parameters

Parameter   Type   Default Value
message   string   —
path   string   —
operation   "upload" | "delete"   —
options?   ErrorOptions | undefined   —

Properties

Property   Type   Default Value
path   string   —
Modifiers: public, readonly
operation   "upload" | "delete"   —
Modifiers: public, readonly

Extends

Error
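
To show how the path and operation context might be used when handling a failure, the sketch below defines a local stand-in with the documented constructor shape. StorageErrorSketch and describeFailure are hypothetical names; the real class is exported by the package and its internals may differ.

```typescript
type StorageOperation = "upload" | "delete";

// Local stand-in shaped like the documented StorageError (an assumption-based sketch).
class StorageErrorSketch extends Error {
  public readonly path: string;
  public readonly operation: StorageOperation;

  constructor(
    message: string,
    path: string,
    operation: StorageOperation,
    options?: { cause?: unknown },
  ) {
    super(message);
    this.name = "StorageError";
    this.path = path;
    this.operation = operation;
    if (options?.cause !== undefined) {
      // Attach the underlying error without relying on ES2022 typings.
      Object.defineProperty(this, "cause", { value: options.cause, enumerable: false });
    }
  }
}

// Hypothetical helper: turns any thrown value into a single log line.
function describeFailure(err: unknown): string {
  if (err instanceof StorageErrorSketch) {
    return `${err.operation} failed for ${err.path}: ${err.message}`;
  }
  return err instanceof Error ? err.message : String(err);
}

const failure = new StorageErrorSketch("access denied", "raw/orders/part-0.parquet", "upload");
console.log(describeFailure(failure));
// upload failed for raw/orders/part-0.parquet: access denied
```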

createStorageOperations

Creates storage operations backed by files-sdk. Adapter selection is based on config.type:

  • "s3": reads credentials from AWS_* environment variables
  • "minio": reads credentials from MINIO_* environment variables

Parameters

Parameter   Type   Default Value
config   StorageConfig   —

Returns

StorageOperations

TableDefinition

A table definition describing what to create in the storage backend.

Properties

Property   Type   Modifiers
catalog   string   —

The catalog name.

schema   string   —

The schema name.

table   string   —

The table name.

columns   Array<ColumnDefinition>   —

Column definitions (name + backend-specific type).

AdapterConfig

Base configuration shared by all adapters.

Properties

Property   Type   Modifiers
type   string   —

Unique identifier for this adapter type.

StorageAdapter

The storage adapter interface.

Each adapter implements these methods to provide table management for a specific storage backend (e.g., Hive/S3, Iceberg, ClickHouse).

Type Parameters

Parameter   Constraint   Default
TConfig   AdapterConfig   AdapterConfig

Properties

Property   Type   Modifiers
type   TConfig["type"]   readonly

The adapter type identifier.

createTable   (definition: TableDefinition) => Promise<void>   —

Creates a table in the storage backend. Uses IF NOT EXISTS by default.

dropTable   (catalog: string, schema: string, table: string) => Promise<void>   —

Drops a table from the storage backend. Uses IF EXISTS, so it does not throw if the table doesn’t exist.

replaceTable   (definition: TableDefinition) => Promise<void>   —

Drops and recreates a table. Useful for replacing external tables with updated schemas.

LoadStrategy

Type

"full_load" | "full_load_append" | "append"

WritePipelineConfig

Configuration for the write pipeline.

Properties

Property   Type   Modifiers
loadStrategy?   LoadStrategy   —

The load strategy for this endpoint.

type?   StorageType   —

Storage adapter type.

bucket   string   —

Bucket name for storing Parquet files.

basePath   string   —

The base path for storing Parquet files.

region?   string   —

Optional region override.

endpoint?   string   —

Optional custom endpoint for S3-compatible storage.

table   { catalog: string; schema: string; tableName: string; }   —

Hive table definition for DDL management.

trinoClient   TrinoClient   —

The Trino client instance for DDL operations.

partitioning?   PartitioningValue   —

Partitioning mode: true partitions by write timestamp, false disables partitioning, and a string selects field-based or custom partitioning.

partitioningFormat?   "year" | "year/month" | "year/month/day"   —

Partition format granularity.

WritePipelineInput

Input for the write pipeline execution.

Properties

Property   Type   Modifiers
records   Array<Record<string, any>>   —

The records to persist. Accepts any array of objects with string keys.

jsonSchema   JsonSchema   —

The JSON Schema describing the record structure.

config   WritePipelineConfig   —

Pipeline configuration.

generatePartitionPath

Generates a Hive-style partition path based on date and format. Format options:

  • "year": year=YYYY/<uuid>.parquet
  • "year/month": year=YYYY/month=MM/<uuid>.parquet
  • "year/month/day": year=YYYY/month=MM/day=DD/<uuid>.parquet

Parameters

Parameter   Type   Default Value
date   Date   —
format?   "year" | "year/month" | "year/month/day"   year/month/day

Returns

string
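
The three format options above can be sketched as follows. This is not the library's code: partitionPathSketch is a hypothetical name, the file id is passed in explicitly (the documented function generates a UUID itself), and the use of UTC date components is an assumption.

```typescript
type PartitionFormat = "year" | "year/month" | "year/month/day";

// Sketch only: builds year=/month=/day= segments and appends <fileId>.parquet.
function partitionPathSketch(
  date: Date,
  fileId: string,
  format: PartitionFormat = "year/month/day",
): string {
  const pad = (n: number): string => String(n).padStart(2, "0");
  const segments = [`year=${date.getUTCFullYear()}`];
  if (format !== "year") segments.push(`month=${pad(date.getUTCMonth() + 1)}`);
  if (format === "year/month/day") segments.push(`day=${pad(date.getUTCDate())}`);
  return `${segments.join("/")}/${fileId}.parquet`;
}

const when = new Date(Date.UTC(2024, 2, 9)); // 9 March 2024
console.log(partitionPathSketch(when, "file-1"));
// year=2024/month=03/day=09/file-1.parquet
console.log(partitionPathSketch(when, "file-1", "year/month"));
// year=2024/month=03/file-1.parquet
```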

generateFlatPath

Generates a flat file path (no partitioning).

Returns

string

PartitionMode

Type

"disabled" | "timestamp" | "field" | "custom"

ResolvedPartitioning

The normalized partitioning configuration used internally by the pipeline.

Properties

Property   Type   Modifiers
mode   PartitionMode   —
format   "year" | "year/month" | "year/month/day"   —
fieldName?   string   —
formatString?   string   —

resolvePartitioningConfig

Normalizes the raw partitioning config into a resolved structure.

Parameters

Parameter   Type   Default Value
partitioning?   PartitioningValue   true
partitioningFormat?   "year" | "year/month" | "year/month/day"   year/month/day

Returns

ResolvedPartitioning
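
One way such a normalization could work is sketched below. Several points are assumptions rather than documented behavior: PartitioningValue is taken to be boolean | string, and a string containing "/" or ":" is treated as a custom format while any other string is treated as a field name. The name resolvePartitioningSketch is hypothetical.

```typescript
type PartitionMode = "disabled" | "timestamp" | "field" | "custom";
type PartitionFormat = "year" | "year/month" | "year/month/day";
type PartitioningValue = boolean | string; // assumed from the partitioning property description

interface ResolvedPartitioning {
  mode: PartitionMode;
  format: PartitionFormat;
  fieldName?: string;
  formatString?: string;
}

function resolvePartitioningSketch(
  partitioning: PartitioningValue = true,
  partitioningFormat: PartitionFormat = "year/month/day",
): ResolvedPartitioning {
  if (partitioning === false) return { mode: "disabled", format: partitioningFormat };
  if (partitioning === true) return { mode: "timestamp", format: partitioningFormat };
  // Assumption: "/" or ":" marks a custom format string; anything else is a field name.
  if (/[\/:]/.test(partitioning)) {
    return { mode: "custom", format: partitioningFormat, formatString: partitioning };
  }
  return { mode: "field", format: partitioningFormat, fieldName: partitioning };
}

console.log(resolvePartitioningSketch().mode); // timestamp
console.log(resolvePartitioningSketch("created_at").mode); // field
console.log(resolvePartitioningSketch("customer_id/event_date:year").mode); // custom
```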

enrichJsonSchemaWithTimestamp

Adds load_timestamp, load_timestamp_year, and load_timestamp_month to the JSON Schema for consistent Parquet + Hive DDL derivation. Returns a new schema object (does not mutate the input).

Parameters

Parameter   Type   Default Value
jsonSchema   JsonSchema   —

Returns

JsonSchema
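
A non-mutating schema enrichment of this kind might look like the sketch below. The simplified JsonSchemaSketch type and the property types chosen for the three fields (a date-time string and two integers) are assumptions; the source does not state which types the library emits.

```typescript
// Simplified schema shape, used only in this sketch.
interface JsonSchemaSketch {
  type: "object";
  properties: Record<string, { type: string; format?: string }>;
  required?: string[];
}

// Returns a copy with the three load-timestamp properties added.
// The chosen property types are assumptions.
function enrichSchemaSketch(schema: JsonSchemaSketch): JsonSchemaSketch {
  return {
    ...schema,
    properties: {
      ...schema.properties,
      load_timestamp: { type: "string", format: "date-time" },
      load_timestamp_year: { type: "integer" },
      load_timestamp_month: { type: "integer" },
    },
  };
}

const original: JsonSchemaSketch = { type: "object", properties: { id: { type: "integer" } } };
const enriched = enrichSchemaSketch(original);
console.log(Object.keys(enriched.properties).join(","));
// id,load_timestamp,load_timestamp_year,load_timestamp_month
console.log(Object.keys(original.properties).join(",")); // id
```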

injectLoadTimestamp

Injects load_timestamp, load_timestamp_year, and load_timestamp_month into each record. Returns a new record array (does not mutate the input).

Parameters

Parameter   Type   Default Value
records   Array<Record<string, unknown>>   —
timestamp   Date   —

Returns

Array<Record<string, unknown>>
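
The copy-and-extend behavior can be sketched as follows. The value formats are assumptions: an ISO 8601 string for load_timestamp and UTC-based numbers for the year and month. injectLoadTimestampSketch is a hypothetical name.

```typescript
type DataRecord = Record<string, unknown>;

// Returns new record objects; the input array and its records are left untouched.
function injectLoadTimestampSketch(records: DataRecord[], timestamp: Date): DataRecord[] {
  const extra = {
    load_timestamp: timestamp.toISOString(), // assumed representation
    load_timestamp_year: timestamp.getUTCFullYear(),
    load_timestamp_month: timestamp.getUTCMonth() + 1,
  };
  return records.map((record) => ({ ...record, ...extra }));
}

const input: DataRecord[] = [{ id: 1 }];
const output = injectLoadTimestampSketch(input, new Date("2024-03-09T12:00:00Z"));
console.log(JSON.stringify(output));
// [{"id":1,"load_timestamp":"2024-03-09T12:00:00.000Z","load_timestamp_year":2024,"load_timestamp_month":3}]
console.log(JSON.stringify(input)); // [{"id":1}]
```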

PartitionFieldError

Error thrown when a record’s partition field is missing, null, or invalid.

Constructor

new PartitionFieldError(fieldName, reason, recordIndex, value?)

Parameters

Parameter   Type   Default Value
fieldName   string   —
reason   "missing" | "null" | "invalid_date"   —
recordIndex   number   —
value?   unknown   —

Properties

Property   Type   Default Value
fieldName   string   —
Modifiers: readonly
reason   "missing" | "null" | "invalid_date"   —
Modifiers: readonly
recordIndex   number   —
Modifiers: readonly
value?   unknown   —
Modifiers: readonly

Extends

Error
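
The three reasons map naturally onto three validation checks. The sketch below uses a local stand-in with the documented constructor shape; PartitionFieldErrorSketch, readPartitionDate, and the message wording are hypothetical, and the library's own checks may differ.

```typescript
type PartitionFieldReason = "missing" | "null" | "invalid_date";

// Local stand-in shaped like the documented PartitionFieldError.
class PartitionFieldErrorSketch extends Error {
  readonly fieldName: string;
  readonly reason: PartitionFieldReason;
  readonly recordIndex: number;
  readonly value?: unknown;

  constructor(fieldName: string, reason: PartitionFieldReason, recordIndex: number, value?: unknown) {
    // The message wording is an assumption.
    super(`Record ${recordIndex}: partition field "${fieldName}" is ${reason}`);
    this.name = "PartitionFieldError";
    this.fieldName = fieldName;
    this.reason = reason;
    this.recordIndex = recordIndex;
    this.value = value;
  }
}

// Hypothetical validator: reads a date-valued partition field or throws with a reason.
function readPartitionDate(record: Record<string, unknown>, fieldName: string, recordIndex: number): Date {
  if (!(fieldName in record)) {
    throw new PartitionFieldErrorSketch(fieldName, "missing", recordIndex);
  }
  const value = record[fieldName];
  if (value === null) {
    throw new PartitionFieldErrorSketch(fieldName, "null", recordIndex, value);
  }
  const date = new Date(String(value));
  if (Number.isNaN(date.getTime())) {
    throw new PartitionFieldErrorSketch(fieldName, "invalid_date", recordIndex, value);
  }
  return date;
}

try {
  readPartitionDate({ event_date: "not-a-date" }, "event_date", 0);
} catch (err) {
  if (err instanceof PartitionFieldErrorSketch) console.log(err.reason); // invalid_date
}
```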

PartitionSegment

A parsed segment from a custom partition format string.

Properties

Property   Type   Modifiers
fieldName   string   —

The field name to extract from the record.

component?   "year" | "month" | "day" | "hour" | "minute" | "second"   —

If set, extract this date component from the field’s ISO date value.

parsePartitioningFormat

Parses a custom partition format string into an array of segments.

Format: "segment/segment/…" where each segment is either:

  • A plain field name: "customer_id" → extracts raw value
  • A field with date component: "event_date:year" → extracts year from ISO date

Parameters

Parameter   Type   Default Value
format   string   —

Returns

Array<PartitionSegment>
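
The segment grammar described above can be sketched as a small parser. parseFormatSketch is a hypothetical name, and the error handling for empty or unknown parts is an assumption, since the source does not say how invalid format strings are treated.

```typescript
type DateComponent = "year" | "month" | "day" | "hour" | "minute" | "second";

interface PartitionSegment {
  fieldName: string;
  component?: DateComponent;
}

const DATE_COMPONENTS: readonly string[] = ["year", "month", "day", "hour", "minute", "second"];

// Splits on "/" into segments, then on ":" into field name and optional date component.
function parseFormatSketch(format: string): PartitionSegment[] {
  return format.split("/").map((part): PartitionSegment => {
    const [fieldName, component] = part.split(":");
    if (!fieldName) {
      throw new Error(`Empty field name in segment "${part}"`); // assumed behavior
    }
    if (component === undefined) return { fieldName };
    if (!DATE_COMPONENTS.includes(component)) {
      throw new Error(`Unknown date component "${component}"`); // assumed behavior
    }
    return { fieldName, component: component as DateComponent };
  });
}

console.log(JSON.stringify(parseFormatSketch("customer_id/event_date:year")));
// [{"fieldName":"customer_id"},{"fieldName":"event_date","component":"year"}]
```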

generateCustomPartitionPath

Generates a custom partition path from a record and parsed segments. For each segment:

  • Without component: formats as fieldName=<value>
  • With component: parses the field as ISO date, extracts the component, formats as component=<value>

Appends /<uuid>.parquet at the end.

Parameters

Parameter   Type   Default Value
record   Record<string, unknown>   —
segments   Array<PartitionSegment>   —
recordIndex?   number   0

Returns

string
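
The per-segment rules above can be sketched as follows. Assumptions are labeled in the comments: the file id is passed in explicitly (the documented function appends a UUID), date components are read in UTC and left unpadded, and plain Error is thrown where the library would throw PartitionFieldError. customPartitionPathSketch and extractComponent are hypothetical names.

```typescript
type DateComponent = "year" | "month" | "day" | "hour" | "minute" | "second";

interface PartitionSegment {
  fieldName: string;
  component?: DateComponent;
}

// Reads one date component in UTC (the time zone is an assumption).
function extractComponent(date: Date, component: DateComponent): number {
  switch (component) {
    case "year": return date.getUTCFullYear();
    case "month": return date.getUTCMonth() + 1;
    case "day": return date.getUTCDate();
    case "hour": return date.getUTCHours();
    case "minute": return date.getUTCMinutes();
    case "second": return date.getUTCSeconds();
  }
}

function customPartitionPathSketch(
  record: Record<string, unknown>,
  segments: PartitionSegment[],
  fileId: string,
): string {
  const parts = segments.map(({ fieldName, component }) => {
    const value = record[fieldName];
    if (value === undefined || value === null) {
      throw new Error(`Partition field "${fieldName}" is missing or null`);
    }
    // Plain field: use the raw value.
    if (component === undefined) return `${fieldName}=${String(value)}`;
    // Date component: parse the value as an ISO date first.
    const date = new Date(String(value));
    if (Number.isNaN(date.getTime())) {
      throw new Error(`Partition field "${fieldName}" is not a valid date`);
    }
    return `${component}=${extractComponent(date, component)}`;
  });
  return `${parts.join("/")}/${fileId}.parquet`;
}

const customPath = customPartitionPathSketch(
  { customer_id: "c42", event_date: "2024-03-09T12:00:00Z" },
  [{ fieldName: "customer_id" }, { fieldName: "event_date", component: "year" }],
  "file-1",
);
console.log(customPath); // customer_id=c42/year=2024/file-1.parquet
```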

groupRecordsByCustomPartition

Groups records by their custom partition path. Records with the same partition key go to the same file (one UUID per unique partition key).

Parameters

Parameter   Type   Default Value
records   Array<Record<string, unknown>>   —
segments   Array<PartitionSegment>   —

Returns

Map<string, Array<Record<string, unknown>>>
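
The "one file per unique partition key" idea can be sketched as below. To keep the example short it handles plain field segments only, and it takes a hypothetical newId callback in place of the UUID generation the library performs internally. groupByPartitionSketch is an illustrative name.

```typescript
type DataRecord = Record<string, unknown>;

// Simplified segment: plain field names only (no date components) in this sketch.
interface FieldSegment {
  fieldName: string;
}

function groupByPartitionSketch(
  records: DataRecord[],
  segments: FieldSegment[],
  newId: () => string, // hypothetical stand-in for UUID generation
): Map<string, DataRecord[]> {
  // One bucket per partition key, so each key gets exactly one file id.
  const buckets = new Map<string, { path: string; records: DataRecord[] }>();
  for (const record of records) {
    const key = segments
      .map(({ fieldName }) => `${fieldName}=${String(record[fieldName])}`)
      .join("/");
    let bucket = buckets.get(key);
    if (bucket === undefined) {
      bucket = { path: `${key}/${newId()}.parquet`, records: [] };
      buckets.set(key, bucket);
    }
    bucket.records.push(record);
  }
  // Re-key the result by full file path.
  const result = new Map<string, DataRecord[]>();
  buckets.forEach((bucket) => result.set(bucket.path, bucket.records));
  return result;
}

let counter = 0;
const grouped = groupByPartitionSketch(
  [{ region: "eu", id: 1 }, { region: "us", id: 2 }, { region: "eu", id: 3 }],
  [{ fieldName: "region" }],
  () => `file-${++counter}`,
);
grouped.forEach((group, filePath) => console.log(filePath, group.length));
// region=eu/file-1.parquet 2
// region=us/file-2.parquet 1
```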

executeWritePipeline

Executes the write pipeline:

  1. Convert records to Parquet via

Parameters

Parameter   Type   Default Value
input   WritePipelineInput   —

Returns

Promise<void>
Modifiers: async

@lakeql/adapters/trino-hive-s3

TrinoHiveS3Config

Configuration for the Trino + Hive + S3 adapter.

Extends

AdapterConfig

Properties

Property   Type   Modifiers
type   "trino-hive-s3"   —
client   TrinoClient   —

The Trino client instance to use for DDL operations.

bucket   string   —

S3 bucket name for external table locations.

prefix?   string   —

Optional base prefix within the bucket (default: "").

format?   "PARQUET" | "ORC" | "AVRO" | "JSON"   —

Storage format (default: "PARQUET").

createTrinoHiveS3Adapter

Adapter for Hive external tables stored on S3.

Generates CREATE TABLE statements with:

  • external_location pointing to s3://<bucket>/<prefix>/<schema>/<table>
  • format set to the configured format (default: PARQUET)

Parameters

Parameter   Type   Default Value
config?   Omit<TrinoHiveS3Config, "type">   {"prefix":"","format":"PARQUET"}

Returns

StorageAdapter<TrinoHiveS3Config>
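
The shape of the generated statement can be sketched as follows, assuming the external_location and format table properties of the Trino Hive connector. createTableStatementSketch and AdapterOptionsSketch are hypothetical names, the handling of an empty prefix is an assumption, and identifiers are not quoted or escaped here, which real DDL generation would need to address.

```typescript
interface ColumnDefinition {
  name: string;
  type: string;
}

interface TableDefinition {
  catalog: string;
  schema: string;
  table: string;
  columns: ColumnDefinition[];
}

type TableFormat = "PARQUET" | "ORC" | "AVRO" | "JSON";

// Hypothetical subset of the adapter config used by this sketch.
interface AdapterOptionsSketch {
  bucket: string;
  prefix?: string;
  format?: TableFormat;
}

function createTableStatementSketch(definition: TableDefinition, options: AdapterOptionsSketch): string {
  const { bucket, prefix = "", format = "PARQUET" } = options;
  const cleanPrefix = prefix.replace(/^\/+|\/+$/g, "");
  // Skip empty segments so an empty prefix does not produce "//" in the location.
  const target = [bucket, cleanPrefix, definition.schema, definition.table]
    .filter((segment) => segment !== "")
    .join("/");
  const columns = definition.columns.map((c) => `${c.name} ${c.type}`).join(", ");
  return (
    `CREATE TABLE IF NOT EXISTS ${definition.catalog}.${definition.schema}.${definition.table} (${columns}) ` +
    `WITH (external_location = 's3://${target}', format = '${format}')`
  );
}

const ddl = createTableStatementSketch(
  {
    catalog: "hive",
    schema: "sales",
    table: "orders",
    columns: [{ name: "id", type: "bigint" }, { name: "total", type: "double" }],
  },
  { bucket: "analytics-lake" },
);
console.log(ddl);
// CREATE TABLE IF NOT EXISTS hive.sales.orders (id bigint, total double) WITH (external_location = 's3://analytics-lake/sales/orders', format = 'PARQUET')
```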
