
create_test_session

Create a TestSession


dbnl.create_test_session(
    *,
    experiment_run: Run,
    baseline: Optional[Union[Run, RunQuery]] = None,
    include_tags: Optional[List[str]] = None,
    exclude_tags: Optional[List[str]] = None,
    require_tags: Optional[List[str]] = None,
) -> TestSession

Start evaluating Tests associated with a Run. Typically, the Run you just completed will be the "Experiment" and you'll compare it to some earlier "Baseline Run".

The Run must already have results reported and be closed before a Test Session can begin.

A Run must be closed for all uploaded results to be shown in the UI.
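For orientation, here is a minimal sketch of the flow leading up to a Test Session. Only create_test_session's keywords are documented on this page; login, report_results, and close_run live elsewhere in the SDK reference, and the keyword arguments shown for them below are assumptions, so consult each function's page.

import pandas as pd

import dbnl

dbnl.login(api_token="<DBNL_API_TOKEN>")  # assumed keyword; see the login reference

# Assume `run` was created earlier with dbnl.create_run(...).
results_df = pd.DataFrame({"score": [0.91, 0.84, 0.88]})  # illustrative data
dbnl.report_results(run=run, column_data=results_df)  # assumed keywords
dbnl.close_run(run=run)  # results reported and run closed

# Only now can Tests be evaluated against the run.
test_session = dbnl.create_test_session(experiment_run=run)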

Parameters

  • experiment_run: The dbnl Run to create the TestSession for.

  • baseline: The dbnl Run or RunQuery to compare the experiment_run against. If None (or omitted), the baseline defined in the TestConfig is used.

  • include_tags: An optional list of Test Tags to include. All Tests with any of the tags in this list will be run after the run is complete.

  • exclude_tags: An optional list of Test Tags to exclude. All Tests with any of the tags in this list will be skipped after the run is complete.

  • require_tags: An optional list of Test Tags that are required. Only Tests with all of the tags in this list will be run after the run is complete.

Managing Tags

Suppose we have the following Tests with their associated Tags in our Project:

  • Test1 with tags ["A", "B"]

  • Test2 with tags ["A"]

  • Test3 with tags ["B"]

dbnl.create_test_session(..., include_tags=["A", "B"]) will trigger Tests 1, 2, 3 to be executed.

dbnl.create_test_session(..., require_tags=["A", "B"]) will only trigger Test 1.

dbnl.create_test_session(..., exclude_tags=["A"]) will trigger Test 3.

dbnl.create_test_session(..., include_tags=["A"], exclude_tags=["B"]) will trigger Test 2.

Examples

Basic example

dbnl.create_test_session(
  experiment_run=new_run,
  baseline=baseline_run,
)

Using a Run Query as a Baseline

dbnl.create_test_session(
  experiment_run=new_run,
  baseline=baseline_run_query,
)
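Here, baseline_run_query is a RunQuery created with create_run_query (see the Baseline functions). A minimal sketch, assuming the query below selects the Run immediately before the latest one; treat both the keyword and the query dict as assumptions and check the create_run_query reference:

baseline_run_query = dbnl.create_run_query(
  project=project,
  query={"offset_from_now": 1},  # assumed query shape: "one run before the latest"
)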

When a Baseline Run has already been set

dbnl.create_test_session(
  experiment_run=new_run,
)
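This form relies on a baseline configured ahead of time, for example with set_run_as_baseline from the Baseline functions, which stores the fallback baseline used when the argument is omitted. The keyword below is an assumption; see that function's reference.

dbnl.set_run_as_baseline(run=baseline_run)  # assumed keyword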
