Reporting Runs
The full process of reporting a Run ultimately breaks down into three steps:
Creating the Run, which includes defining its structure and any relevant metadata
Reporting the results of the Run, which include columnar data and scalars
Closing the Run to mark it as complete once reporting is finished
Each of these steps can be done separately via our SDK, but all three can also be done with a single SDK function call, dbnl.report_run_with_results, which is the recommended approach. See Putting it All Together below.
Creating a Run
The important parts of creating a run are providing identifying information — in the form of a name and metadata — and defining the structure of the data you'll be reporting to it. As mentioned in the previous section, this structure is called the Run Schema.
Run Schema
In older versions of DBNL, the job of the schema was done by something called the "Run Config". The Run Config has been fully deprecated, and you should check the SDK reference and update any code you have.
A Run schema defines four aspects of the Run's structure:
Columns (the data each row in your results will contain)
Scalars (any Run-level data you want to report)
Index (which column or columns uniquely identify rows in your results)
Components (functional groups to organize the reported results in the form of a graph)
Columns
Columns are the only required part of a schema and are core to reporting Runs, as they define the shape your results will take. You report your column schema as a list of objects, which contain the following fields:
name: The name of the column
type: The type of the column, e.g. int. For a list of available types, see the SDK reference
description: A descriptive blurb about what the column is
component: Which part of your application the column belongs to (see Components below)
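As a rough illustration of the fields above, a column schema can be written as a plain list of dictionaries and sanity-checked before it is handed to the SDK. The helper check_column_schema below is hypothetical and not part of the SDK, and treating name and type as mandatory (with description and component optional) is an assumption based on the example at the end of this page.

```python
# Assumption: "name" and "type" are mandatory; "description" and "component"
# are optional. check_column_schema is a hypothetical helper, not an SDK call.
REQUIRED_FIELDS = {"name", "type"}
OPTIONAL_FIELDS = {"description", "component"}


def check_column_schema(columns):
    """Return a list of problems found in a list of column definitions."""
    problems = []
    seen_names = set()
    for position, column in enumerate(columns):
        fields = set(column)
        missing = REQUIRED_FIELDS - fields
        unknown = fields - REQUIRED_FIELDS - OPTIONAL_FIELDS
        if missing:
            problems.append(f"column {position}: missing {sorted(missing)}")
        if unknown:
            problems.append(f"column {position}: unknown fields {sorted(unknown)}")
        name = column.get("name")
        if name in seen_names:
            problems.append(f"column {position}: duplicate name {name!r}")
        seen_names.add(name)
    return problems


columns = [
    {"name": "email", "type": "string", "description": "raw email text", "component": "input"},
    {"name": "spam-pred", "type": "boolean", "component": "classifier"},
    {"name": "email_id", "type": "string", "description": "unique id for each email"},
]

# An empty list means no problems were found.
print(check_column_schema(columns))
```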
Scalars
Scalars represent any data that live at the Run level; that is, they represent single data points that apply to your entire Run. For example, you may want to calculate an F1 score for the entirety of a result set for your model. The scalar schema is also a list of objects, and takes on the same fields as the column schema above.
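To make the F1 example concrete, here is a minimal sketch of computing a Run-level value from row-level predictions and packaging it as a dictionary of scalars. The f1_score helper and the sample labels are illustrative only; the model_F1 name mirrors the example at the end of this page.

```python
def f1_score(y_true, y_pred):
    """Compute the F1 score for binary labels (True is the positive class)."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for truth, pred in pairs if truth and pred)
    false_pos = sum(1 for truth, pred in pairs if not truth and pred)
    false_neg = sum(1 for truth, pred in pairs if truth and not pred)
    if true_pos == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Illustrative data: four rows of ground truth and model output.
labels = [False, True, False, True]
predictions = [False, True, True, True]

# One value for the whole Run, keyed by the scalar's name in the schema.
scalar_data = {"model_F1": f1_score(labels, predictions)}
print(scalar_data)
```

The result is a single number for the whole result set, which is what distinguishes a scalar from a column.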
Index
Using the index field within the schema, you can designate Unique Identifiers: specific columns which uniquely identify matching results between Runs. Adding this information facilitates more direct comparisons when testing your application's behavior and makes it easier to explore your data.
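The sketch below illustrates why an index helps. With a unique identifier on every row, results from two Runs can be paired up regardless of row order, and any row whose output changed can be picked out directly. This is plain Python for illustration; the helpers are hypothetical and say nothing about how DBNL performs the matching internally.

```python
from collections import Counter


def row_key(row, index):
    """Build the identifying key for a row from the index columns."""
    return tuple(row[column] for column in index)


def duplicate_keys(rows, index):
    """Return index keys that appear more than once, i.e. are not unique."""
    counts = Counter(row_key(row, index) for row in rows)
    return [key for key, count in counts.items() if count > 1]


def changed_rows(baseline, experiment, index, column):
    """Return keys of rows present in both runs whose `column` value differs."""
    baseline_by_key = {row_key(row, index): row for row in baseline}
    changed = []
    for row in experiment:
        match = baseline_by_key.get(row_key(row, index))
        if match is not None and match[column] != row[column]:
            changed.append(row_key(row, index))
    return changed


index = ["email_id"]
baseline = [
    {"email_id": "1", "spam-pred": False},
    {"email_id": "2", "spam-pred": True},
    {"email_id": "3", "spam-pred": False},
]
experiment = [  # same emails, reported in a different order
    {"email_id": "3", "spam-pred": True},
    {"email_id": "1", "spam-pred": False},
    {"email_id": "2", "spam-pred": True},
]

print(duplicate_keys(baseline, index))  # [] (every email_id is unique)
print(changed_rows(baseline, experiment, index, "spam-pred"))  # [('3',)]
```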
Components
Components are defined within the components_dag field of the schema. This defines the topological structure of your app as a Directed Acyclic Graph (DAG). Using this, you can tell DBNL which part of your application different columns correspond to, enabling a more granular understanding of your app's behavior.
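As a sketch of what a components DAG looks like in practice, the snippet below takes the dictionary form used in the example at the end of this page and checks that it really is acyclic. It assumes each key maps to the list of components downstream of it; that reading, and the component_order helper, are assumptions for illustration rather than documented SDK behavior.

```python
from graphlib import CycleError, TopologicalSorter

# Assumption: each component maps to the components that come after it.
components_dag = {
    "input": ["classifier"],
    "classifier": [],
}


def component_order(dag):
    """Return components in upstream-to-downstream order.

    Raises graphlib.CycleError if the graph contains a cycle.
    """
    # TopologicalSorter expects node -> predecessors, so invert the mapping.
    predecessors = {node: set() for node in dag}
    for node, downstream in dag.items():
        for child in downstream:
            predecessors.setdefault(child, set()).add(node)
    return list(TopologicalSorter(predecessors).static_order())


print(component_order(components_dag))  # ['input', 'classifier']

try:
    component_order({"a": ["b"], "b": ["a"]})
except CycleError:
    print("not a valid DAG: the components form a cycle")
```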
You can learn more about creating a Run schema in the SDK reference for dbnl.create_run_schema. There is also a function to create a Run, but we recommend the method shown in the section below.
Reporting Run Results
Once you've defined the structure of your run, you can upload data to DBNL to report the results of that run. As mentioned above, there are two kinds of results from your run:
The row-level column results (these each represent the data of a single "usage" of your application)
The Run-level scalar results (these represent data that apply to all usages in your Run as a whole)
DBNL expects you to upload your results data in the form of a pandas DataFrame. Note that scalars can be uploaded as a single-row DataFrame or as a dictionary of values.
There are functions to upload column results and scalar results in the SDK, but, again, we recommend the method in the section below!
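Before uploading, it can be useful to confirm that the data you are about to report lines up with the schema you defined. The sketch below does this in plain Python on a dictionary of column lists, which is the same shape that can be passed to the pandas DataFrame constructor. It assumes the reported columns should match the schema's column names exactly; that requirement and the check_column_data helper are assumptions, not documented SDK behavior.

```python
def check_column_data(schema_columns, column_data):
    """Compare a dict of column lists against a column schema.

    Returns a list of problems; an empty list means the shapes agree.
    """
    expected = {column["name"] for column in schema_columns}
    provided = set(column_data)
    problems = []
    if expected - provided:
        problems.append(f"missing columns: {sorted(expected - provided)}")
    if provided - expected:
        problems.append(f"unexpected columns: {sorted(provided - expected)}")
    row_counts = {len(values) for values in column_data.values()}
    if len(row_counts) > 1:
        problems.append("columns have different numbers of rows")
    return problems


schema_columns = [
    {"name": "email_id", "type": "string"},
    {"name": "spam-pred", "type": "boolean"},
]
column_data = {
    "email_id": ["1", "2", "3", "4"],
    "spam-pred": [False, True, False, True],
}

print(check_column_data(schema_columns, column_data))  # []
```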
Closing a Run
Once you're finished uploading results to DBNL for your Run, the Run should be closed to mark it as ready to be used in Test Sessions. Note that reporting results to a Run will overwrite any existing results, and once a Run is closed, no further results can be uploaded to it. To close a Run, you can use the SDK function for it or close an open Run from its page in the UI.
Putting it All Together
Now that you understand each step, you can easily integrate all of this into your codebase with a few simple function calls via our SDK:
import dbnl
import pandas as pd

dbnl.login()
proj = dbnl.get_or_create_project(name="My Project")

# Define the structure of the Run: columns, scalars, index, and components.
run_schema = dbnl.create_run_schema(
    columns=[
        {"name": "error_type", "type": "category", "component": "classifier"},
        {"name": "email", "type": "string", "description": "raw email text content from source", "component": "input"},
        {"name": "spam-pred", "type": "boolean", "component": "classifier"},
        {"name": "email_id", "type": "string", "description": "unique id for each email"},
    ],
    scalars=[
        {
            "name": "model_F1",
            "type": "float",
            "description": "F1 Score",
            "component": "classifier",
        },
        {
            "name": "model_recall",
            "type": "float",
            "description": "Model Recall",
        },
    ],
    index=["email_id"],
    components_dag={
        "input": ["classifier"],
        "classifier": [],
    },
)

# Creates the run, reports results, and closes the run.
run = dbnl.report_run_with_results(
    project=proj,
    display_name="Run 1 of Email Classifier",
    run_schema=run_schema,
    column_data=pd.DataFrame({
        "error_type": ["none", "none", "none", "none"],
        "email": [
            "Hello, I am interested in your product. Please send me more information.",
            "Congratulations! You've won a lottery. Click here to claim your prize.",
            "Hi, can we schedule a meeting for next week?",
            "Don't miss out on this limited time offer! Buy now and save 50%.",
        ],
        "spam-pred": [False, True, False, True],
        "email_id": ["1", "2", "3", "4"],
    }),
    scalar_data={
        "model_F1": 0.8,
        "model_recall": 0.74,
    },
)