Python SDK

Reference documentation for the Distributional Python SDK

Use the Python SDK to create projects and upload data to them programmatically.

For a normal workflow, we recommend creating projects in the UI, which provides the best experience and the most options for project setup. For more information, see Projects.

See SDK Log Ingestion for more information and examples on using the SDK to upload log data to your deployment.

Installation

To install the latest SDK, run:

pip install --upgrade dbnl

SDK Functions

login

dbnl.login(*,
    api_token: str | None = None,
    namespace_id: str | None = None,
    api_url: str | None = None,
    app_url: str | None = None
    ) → None

Create a secure connection to make authenticated requests. After login succeeds, you can issue secure, authenticated requests against the hosted endpoints of your DBNL Deployment.

  • Parameters:

    • api_token – DBNL API token for Authentication. If None is provided, the environment variable DBNL_API_TOKEN will be used by default.

    • namespace_id – The Namespace ID to use for the session. If None is provided, the default Namespace is used.

    • api_url – The base url of the API of your DBNL Deployment, set at installation. If None is provided, the environment variable DBNL_API_URL will be used by default.

    • app_url – An optional base url of the DBNL Deployment. If None is provided, the app url is inferred from the DBNL_API_URL variable. Please contact your system administrator if you cannot reach the Distributional UI.

  • Returns: None

For a Sandbox deployment, your api_url is set to http://[DEPLOYMENT_LOCATION]/api (e.g. http://localhost:8080/api).
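The fallback to environment variables described above can be mirrored in a small helper, for instance to fail early with a clear message before calling login. This is a minimal sketch: resolve_login_settings and login_from_env are hypothetical helper names, not part of the SDK, and the SDK already reads DBNL_API_TOKEN and DBNL_API_URL itself when the arguments are None.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def resolve_login_settings(
    api_token: str | None = None,
    api_url: str | None = None,
    env: Mapping[str, str] | None = None,
) -> dict[str, str]:
    """Return keyword arguments for dbnl.login, falling back to the environment.

    Hypothetical helper: explicit arguments win, otherwise DBNL_API_TOKEN and
    DBNL_API_URL are read from `env` (os.environ by default).
    """
    env = os.environ if env is None else env
    token = api_token or env.get("DBNL_API_TOKEN")
    url = api_url or env.get("DBNL_API_URL")
    missing = [
        name
        for name, value in (("DBNL_API_TOKEN", token), ("DBNL_API_URL", url))
        if not value
    ]
    if missing:
        raise ValueError(f"Missing login settings: {', '.join(missing)}")
    return {"api_token": token, "api_url": url}


def login_from_env() -> None:
    """Log in with settings resolved by the helper above."""
    import dbnl  # imported lazily so the helper above works without the SDK

    dbnl.login(**resolve_login_settings())
```

For a Sandbox deployment, setting DBNL_API_URL to http://localhost:8080/api and DBNL_API_TOKEN to your token before calling login_from_env() should be enough.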

get_or_create_project

dbnl.get_or_create_project(*,
    name: str,
    description: str | None = None
    ) → Project

Get the Project with the specified name, or create a new one if it does not exist.

  • Parameters:

    • name – Name for the Project

    • description – Description for the Project, defaults to None

  • Raises:

    • DBNLNotLoggedInError – The client is not logged in. See login.

    • DBNLAPIValidationError – The API failed to validate the request.

  • Returns: Newly created or matching existing Project
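Because the lookup is by name, a deterministic name lets repeated calls resolve to the same Project instead of creating a new one each time. The sketch below assumes login has already been called; project_name and get_project are hypothetical helpers, and the naming scheme is only an illustration.

```python
from __future__ import annotations


def project_name(service: str, environment: str) -> str:
    """Build a stable, lowercase name so repeated calls reuse one Project."""
    return f"{service}-{environment}".lower()


def get_project(service: str, environment: str):
    """Fetch the Project for a service, creating it on first use."""
    import dbnl  # lazy import; assumes dbnl.login(...) was called earlier

    return dbnl.get_or_create_project(
        name=project_name(service, environment),
        description=f"Logs for {service} ({environment})",
    )
```

Calling get_project("chatbot", "prod") twice is expected to return the matching existing Project on the second call, per the Returns description above.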

report_run_with_results

dbnl.report_run_with_results(*,
    project: Project,
    column_data: DataFrame,
    scalar_data: dict[str, Any] | DataFrame | None = None,
    display_name: str | None = None,
    index: list[str] | None = None,
    run_schema: RunSchema | None = None,
    metadata: dict[str, str] | None = None,
    wait_for_close: bool = True
    ) → Run

Create a new Run, report results to it, and close it.

  • Parameters:

    • project – The Project to create the Run in. To create a project see get_or_create_project.

    • column_data – A pandas DataFrame with the results for the columns.

    • scalar_data – An optional dictionary or DataFrame with the results for the scalars, if any.

    • display_name – An optional display name for the Run.

    • index – An optional list of column names to use as the unique identifier for rows in the column data.

    • run_schema – An optional RunSchema to use for the Run. Will be inferred from the data if not provided.

    • metadata – Any additional key:value pairs you want to track.

    • wait_for_close – If True, the function will block for up to 3 minutes until the Run is closed, defaults to True.

  • Raises:

    • DBNLNotLoggedInError – The client is not logged in. See login.

    • DBNLInputValidationError – Input does not conform to expected format. See the DBNL Semantic Convention.

  • Returns: The closed Run with the uploaded data.
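As a rough illustration of how the parameters above fit together, the sketch below builds a few rows, checks that the chosen index columns identify each row uniquely, and then reports them as a single Run. The request_id column, the row_count scalar, the metadata values, and the helper names are all made up for this example; only the keyword arguments documented above are passed to the SDK.

```python
from __future__ import annotations

from typing import Any


def build_rows(n: int = 5) -> list[dict[str, Any]]:
    """Create illustrative rows; 'request_id' is a made-up unique key."""
    return [
        {
            "request_id": f"req-{i}",
            "input": f"Is {i} an even or odd number?",
            "output": "even" if i % 2 == 0 else "odd",
        }
        for i in range(n)
    ]


def check_unique_index(rows: list[dict[str, Any]], index: list[str]) -> None:
    """Raise ValueError if the index columns do not identify rows uniquely."""
    keys = [tuple(row[col] for col in index) for row in rows]
    if len(set(keys)) != len(keys):
        raise ValueError(f"Index columns {index} contain duplicate values")


def report(project, rows: list[dict[str, Any]], index: list[str]):
    """Upload the rows as one Run and return the closed Run."""
    import dbnl  # lazy imports so the helpers above run without the SDK
    import pandas as pd

    check_unique_index(rows, index)
    return dbnl.report_run_with_results(
        project=project,
        column_data=pd.DataFrame(rows),
        scalar_data={"row_count": len(rows)},
        display_name="example-run",
        index=index,
        metadata={"source": "docs-example"},
        wait_for_close=True,
    )
```

With a project obtained from get_or_create_project, report(project, build_rows(), ["request_id"]) would upload five rows. The uniqueness check is a local convenience, not something the SDK is documented to require in this form.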

SDK Quick Example

Below is a basic working example that highlights the SDK workflow.

DBNL_API_URL = "http://localhost:8080/api"
DBNL_API_TOKEN = ""

import random
from datetime import UTC, datetime, timedelta

import dbnl
import pandas as pd

# Log in to dbnl.
dbnl.login(api_url=DBNL_API_URL, api_token=DBNL_API_TOKEN)
# Use current time as reference point.
now = datetime.now(tz=UTC)
# Get or create a new project.
project = dbnl.get_or_create_project(
    name=f"quickstart-{now.isoformat()}",
    schedule="daily",
)

# Backfill 9 days of data (pd.date_range includes both endpoints).
now_date = now.replace(hour=0, minute=0, second=0, microsecond=0)
start_date = now_date - timedelta(days=9)
end_date = now_date - timedelta(days=1)
for dt in pd.date_range(start_date, end_date):
    dbnl.report_run_with_results(
        project=project,
        data_start_time=dt,
        data_end_time=dt + timedelta(days=1),
        column_data=pd.DataFrame([
            {
                "timestamp": dt + timedelta(minutes=30 * i),
                "input": f"Is {i} an even or odd number?",
                "output": random.choice(["even", "odd"]),
            }
            for i in range(20)
        ]).astype({
            "timestamp": "datetime64[us, UTC]",
            "input": "string",
            "output": "category",
        }),
    )
