LogoLogo
AboutBlogLaunch app ↗
v0.23.x
v0.23.x
  • Get Started
  • Overview
  • Getting Access to Distributional
  • Install the Python SDK
  • Quickstart
  • Learning about Distributional
    • Distributional Concepts
    • Why We Test Data Distributions
    • The Flow of Data
  • Using Distributional
    • Projects
    • Runs
      • Reporting Runs
      • Setting a Baseline Run
    • Metrics
    • Tests
      • Creating Tests
        • Using Filters in Tests
        • Available Statistics and Assertions
      • Running Tests
      • Reviewing Tests
        • What Is a Similarity Index?
    • Notifications
    • Access Controls
      • Organization and Namespaces
      • Users and Permissions
      • Tokens
  • Platform
    • Sandbox
    • Self-hosted
      • Architecture
      • Deployment
        • Helm Chart
        • Terraform Module
      • Networking
      • OIDC Authentication
      • Data Security
  • Reference
    • Query Language
      • Functions
    • Python SDK
      • dbnl
      • dbnl.util
      • dbnl.experimental
      • Classes
      • Eval Module
        • Quick Start
        • dbnl.eval
        • dbnl.eval.metrics
        • Application Metric Sets
        • How-To / FAQ
        • LLM-as-judge and Embedding Metrics
        • RAG / Question Answer Example
      • Classes
  • CLI
  • Versions
    • Release Notes
Powered by GitBook

© 2025 Distributional, Inc. All Rights Reserved.

On this page

Was this helpful?

Export as PDF
  1. Reference

Python SDK

The primary mechanism for submitting data to Distributional is through our Python SDK. This section servers as a reference for the various functionalities available for interacting with DBNL via the SDK.

Example Usage

Below is a basic working example that highlights the SDK workflow. If you have not yet installed the SDK, follow these instructions.

import dbnl
import numpy as np
import pandas as pd


num_results = 500
test_data = pd.DataFrame({
    "idx": np.arange(num_results),
    "age": np.random.randint(5,95, num_results),
    "loc": np.random.choice(["NY", "CA", "FL"], num_results),
    "churn_gtruth": np.random.choice([True, False], num_results)
})

# example user ml app / model
def churn_predictor(input):
    return 1.0 / (1.0 + np.exp(-(input["age"] / 10.0 - 3.5)))

def evaluate_predicton(data):
    return (data["churn_score"] > 0.5 and data["churn_gtruth"]) or \
            (data["churn_score"] < 0.5 and not data["churn_gtruth"])

test_data["churn_score"] = test_data.apply(churn_predictor, axis=1)
test_data["pred_correct"] = test_data.apply(evaluate_predicton, axis=1)
test_data = test_data.astype({"age": "int", "loc": "category"})

# Use DBNL
dbnl.login(api_token="<COPY_PASTE_DBNL_API_TOKEN>")
proj = dbnl.get_or_create_project(name="example_churn_predictor")
run = dbnl.report_run_with_results(
    project=proj,
    column_data=test_data,
    row_id=["idx"],
)

PreviousFunctionsNextdbnl

Was this helpful?