SQL Integration Ingestion
Pull data from existing SQL tables
Formatting
Currently, all columns from the supplied table will be ingested and flattened into Columns as part of the Data Pipeline. Any columns corresponding to the DBNL Semantic Convention will be mapped accordingly.
The following fields are required regardless of which ingestion method you are using:
input: The text input to the LLM, as a string.
output: The text response from the LLM, as a string.
timestamp: The UTC timestamp associated with the LLM call, as a timestamptz.
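The required fields above can be checked before ingestion with a small sketch like the following. The `validate_row` helper and the row-dict shape are illustrative assumptions, not part of the DBNL API; the only grounded facts are the three required fields and their types.

```python
from datetime import datetime, timezone

# Required fields and their expected Python-side types (hypothetical
# mapping: string -> str, timestamptz -> timezone-aware datetime).
REQUIRED_FIELDS = {"input": str, "output": str, "timestamp": datetime}

def validate_row(row: dict) -> list[str]:
    """Return a list of problems with a row destined for ingestion."""
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in row:
            problems.append(f"missing required field: {name}")
        elif not isinstance(row[name], expected):
            problems.append(f"{name} must be {expected.__name__}")
    # A timestamptz column implies a timezone-aware timestamp.
    ts = row.get("timestamp")
    if isinstance(ts, datetime) and ts.tzinfo is None:
        problems.append("timestamp must carry a UTC offset")
    return problems

row = {
    "input": "What is the capital of France?",
    "output": "Paris.",
    "timestamp": datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc),
}
assert validate_row(row) == []
```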
Creating a new SQL Data Connection
From the Namespace landing page, click "Data Connections" in the left panel. On the Data Connections landing page, click "+ Add Data Connection" in the upper right. Provide a required name and an optional description for the Data Connection. All Data Connections are available to any User creating a Project in the Namespace.
Debugging
All Data Pipeline Runs for a Project can be inspected and restarted in the Project Status page.
Supported Integrations
Google BigQuery
Required configuration at Namespace level
Google Application Credentials JSON: The JSON string content of the service account credentials for BigQuery authentication
Google Cloud Project ID: The Google Cloud project ID where BigQuery is enabled. This will be used in the SQLAlchemy connection URL
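As a rough sketch of how the project ID feeds into the SQLAlchemy connection URL (the URL shape below follows the sqlalchemy-bigquery dialect and is an assumption about DBNL's internals; the project ID is a placeholder, and the credentials JSON is supplied separately):

```python
# Placeholder project ID; the credentials JSON from the Namespace
# configuration is passed to the engine out of band, not in the URL.
google_cloud_project_id = "my-gcp-project"
connection_url = f"bigquery://{google_cloud_project_id}"
```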
Required configuration at Project level
Table Name: The name of the table to ingest data from
Ingestion Delay: How long to wait after UTC midnight to begin ingesting data. It is recommended to wait at least 10 minutes for all data to be loaded into the table.
Optional configuration at Project level
Backfill To: How far back in the table to load data.
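The Ingestion Delay setting can be illustrated with a small scheduling sketch. The `next_ingestion_start` helper is hypothetical, not a DBNL function; the grounded facts are that ingestion begins a configurable delay after UTC midnight and that at least 10 minutes is recommended.

```python
from datetime import datetime, timedelta, timezone

def next_ingestion_start(now: datetime, delay: timedelta) -> datetime:
    """Return the next UTC time at which the daily ingestion would begin."""
    midnight = now.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    start = midnight + delay
    if now >= start:  # today's run has already started; schedule tomorrow's
        start += timedelta(days=1)
    return start

# With the recommended minimum delay of 10 minutes:
now = datetime(2024, 5, 1, 0, 3, tzinfo=timezone.utc)
start = next_ingestion_start(now, timedelta(minutes=10))
# start == 2024-05-01 00:10 UTC
```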
Databricks
Required configuration at Namespace level
Databricks Host: The Databricks host URL (e.g., 'https://adb-1234567890123456.7.azuredatabricks.net')
HTTP Path: The HTTP path to the Databricks SQL endpoint (e.g., 'sql/protocolv1/o/1234567890123456/1234-567890-abcdefg')
Access Token: Databricks application token for authentication
Catalog: The catalog to use in the SQLAlchemy connection URL
Schema: The schema to use in the SQLAlchemy connection URL
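A rough sketch of how these settings might combine into a SQLAlchemy connection URL (the URL shape follows the Databricks SQLAlchemy dialect and is an assumption about DBNL's internals; every value below is a placeholder):

```python
# All values are placeholders mirroring the Namespace-level settings above.
host = "adb-1234567890123456.7.azuredatabricks.net"
http_path = "sql/protocolv1/o/1234567890123456/1234-567890-abcdefg"
access_token = "dapiXXXXXXXX"  # placeholder access token
catalog = "main"
schema = "default"

connection_url = (
    f"databricks://token:{access_token}@{host}"
    f"?http_path={http_path}&catalog={catalog}&schema={schema}"
)
```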
Required configuration at Project level
Table Name: The name of the table to ingest data from
Ingestion Delay: How long to wait after UTC midnight to begin ingesting data. It is recommended to wait at least 10 minutes for all data to be loaded into the table.
Optional configuration at Project level
Backfill To: How far back in the table to load data.
Snowflake
Snowflake integrations are coming soon!
AWS Redshift
AWS Redshift integrations are coming soon!