Python SDK
Reference documentation for the Distributional Python SDK
The Python SDK can be used for programmatically creating projects and uploading data to them.
See SDK Log Ingestion for more information and examples on using the SDK to upload log data to your deployment.
Installation
To install the latest SDK, run:
pip install --upgrade dbnl
SDK Functions
login
dbnl.login(*,
    api_token: str | None = None,
    namespace_id: str | None = None,
    api_url: str | None = None,
    app_url: str | None = None
) → None
Create a secure connection to make authenticated requests. After login runs successfully, you can issue secure, authenticated requests against the hosted endpoints of your DBNL Deployment.
Parameters:
api_token – DBNL API token for authentication. If None is provided, the environment variable DBNL_API_TOKEN will be used by default.
namespace_id – The Namespace ID to use for the session. If None is provided, this will default to the value default.
api_url – The base URL of the API of your DBNL Deployment, set at installation. If None is provided, the environment variable DBNL_API_URL will be used by default.
app_url – An optional base URL of the DBNL Deployment. If None is provided, the app URL is inferred from the DBNL_API_URL variable. Please contact your sys admin if you cannot reach the Distributional UI.
Returns:
None
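Because login falls back to DBNL_API_TOKEN and DBNL_API_URL when the corresponding arguments are None, a script can check the environment before logging in. The following is a minimal sketch under that assumption. The names missing_login_vars and login_from_env are hypothetical helpers, not part of the SDK, and the import of dbnl is deferred so that the check itself runs without the package installed.

```python
import os
from collections.abc import Mapping

# Environment variables that the login documentation names as fallbacks.
LOGIN_ENV_VARS = ("DBNL_API_TOKEN", "DBNL_API_URL")


def missing_login_vars(environ: Mapping[str, str]) -> list[str]:
    """Return the fallback variables that are unset or empty (hypothetical helper)."""
    return [name for name in LOGIN_ENV_VARS if not environ.get(name)]


def login_from_env() -> None:
    """Log in without explicit arguments, failing early if a variable is missing."""
    missing = missing_login_vars(os.environ)
    if missing:
        raise RuntimeError(
            f"Set these environment variables first: {', '.join(missing)}"
        )

    import dbnl  # deferred so the check above works without the SDK installed

    # With no arguments, login reads the token and API URL from the environment.
    dbnl.login()
```

Checking up front gives a clearer error message than a failed request later in the script. The SDK's own behavior when a variable is missing is not described here.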
get_or_create_project
dbnl.get_or_create_project(*,
    name: str,
    description: str | None = None
) → Project
Get the Project with the specified name, or create a new one if it does not exist.
Parameters:
name – Name for the Project.
description – Description for the Project, defaults to None.
Raises:
DBNLNotLoggedInError – The client is not logged in. See login.
DBNLAPIValidationError – The API failed to validate the request.
Returns: Newly created or matching existing Project.
report_run_with_results
dbnl.report_run_with_results(*,
    project: Project,
    column_data: DataFrame,
    scalar_data: dict[str, Any] | DataFrame | None = None,
    display_name: str | None = None,
    index: list[str] | None = None,
    run_schema: RunSchema | None = None,
    metadata: dict[str, str] | None = None,
    wait_for_close: bool = True
) → Run
Create a new Run, report results to it, and close it.
Parameters:
project – The Project to create the Run in. To create a project, see get_or_create_project.
column_data – A pandas DataFrame with the results for the columns.
scalar_data – An optional dictionary or DataFrame with the results for the scalars, if any.
display_name – An optional display name for the Run.
index – An optional list of column names to use as the unique identifier for rows in the column data.
run_schema – An optional RunSchema to use for the Run. Will be inferred from the data if not provided.
metadata – Any additional key:value pairs you want to track.
wait_for_close – If True, the function will block for up to 3 minutes until the Run is closed. Defaults to True.
Raises:
DBNLNotLoggedInError – The client is not logged in. See login.
DBNLInputValidationError – Input does not conform to expected format. See the DBNL Semantic Convention.
Returns: The closed Run with the uploaded data.
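The parameter types above suggest a few checks that can be made before reporting a Run: index columns should uniquely identify rows, and metadata values must be strings. The sketch below performs those checks with the standard library only. The helpers duplicate_index_keys and stringify_metadata, the scalar name even_fraction, and the sample rows are all hypothetical. The sketch does not check the requirements of the DBNL Semantic Convention.

```python
from collections import Counter
from typing import Any


def duplicate_index_keys(rows: list[dict[str, Any]], index: list[str]) -> list[tuple]:
    """Return index-key tuples that occur in more than one row."""
    counts = Counter(tuple(row[col] for col in index) for row in rows)
    return [key for key, n in counts.items() if n > 1]


def stringify_metadata(metadata: dict[str, Any]) -> dict[str, str]:
    """Coerce keys and values to str, matching the dict[str, str] annotation."""
    return {str(key): str(value) for key, value in metadata.items()}


rows = [
    {"id": 1, "input": "Is 1 an even or odd number?", "output": "odd"},
    {"id": 2, "input": "Is 2 an even or odd number?", "output": "even"},
    {"id": 3, "input": "Is 3 an even or odd number?", "output": "even"},
]
index = ["id"]

duplicates = duplicate_index_keys(rows, index)
scalar_data = {"even_fraction": sum(r["output"] == "even" for r in rows) / len(rows)}
metadata = stringify_metadata({"model_version": 3, "shadow": False})

# With a logged-in client and an existing Project, the prepared values
# could then be passed along (illustrative call, left commented out):
# dbnl.report_run_with_results(
#     project=project,
#     column_data=pd.DataFrame(rows),
#     scalar_data=scalar_data,
#     index=index,
#     metadata=metadata,
# )
```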
SDK Quick Example
Below is a basic working example that highlights the SDK workflow.
import random
from datetime import UTC, datetime, timedelta  # datetime.UTC requires Python 3.11+

import dbnl
import pandas as pd

DBNL_API_URL = "http://localhost:8080/api"
DBNL_API_TOKEN = ""  # Set this to your DBNL API token.

# Log in to dbnl.
dbnl.login(api_url=DBNL_API_URL, api_token=DBNL_API_TOKEN)

# Use current time as reference point.
now = datetime.now(tz=UTC)

# Get or create a new project.
project = dbnl.get_or_create_project(
    name=f"quickstart-{now.isoformat()}",
    schedule="daily",
)

# Backfill one run per day. pd.date_range includes both endpoints,
# so this covers 9 days (9 days ago through yesterday).
now_date = now.replace(hour=0, minute=0, second=0, microsecond=0)
start_date = now_date - timedelta(days=9)
end_date = now_date - timedelta(days=1)
for dt in pd.date_range(start_date, end_date):
    dbnl.report_run_with_results(
        project=project,
        data_start_time=dt,
        data_end_time=dt + timedelta(days=1),
        # 20 rows per day, spaced 30 minutes apart, with explicit dtypes.
        column_data=pd.DataFrame([
            {
                "timestamp": dt + timedelta(minutes=30 * i),
                "input": f"Is {i} an even or odd number?",
                "output": random.choice(["even", "odd"]),
            }
            for i in range(20)
        ]).astype({
            "timestamp": "datetime64[us, UTC]",
            "input": "string",
            "output": "category",
        }),
    )
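The loop above reports one Run per calendar day, each covering a window from midnight to the following midnight. The sketch below reproduces that window arithmetic with only the standard library so the day boundaries are easy to inspect. daily_windows is a hypothetical helper, not an SDK function, and like pd.date_range it treats both endpoints as inclusive.

```python
from datetime import datetime, timedelta, timezone


def daily_windows(start: datetime, end: datetime) -> list[tuple[datetime, datetime]]:
    """Return (window_start, window_end) pairs, one per day from start through end."""
    windows = []
    day = start
    while day <= end:
        windows.append((day, day + timedelta(days=1)))
        day += timedelta(days=1)
    return windows


# Fixed reference midnight so the result is reproducible.
midnight = datetime(2024, 1, 10, tzinfo=timezone.utc)
windows = daily_windows(midnight - timedelta(days=3), midnight - timedelta(days=1))
for window_start, window_end in windows:
    print(window_start.date(), "to", window_end.date())
```

Because both endpoints are included, a range from N days ago through yesterday yields N windows, and the last window ends at the reference midnight.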