Reporting Runs
The full process of reporting a Run ultimately breaks down into three steps:
Creating the Run, which includes defining its structure and any relevant metadata
Reporting the results of the Run, which include columnar data and scalars
Closing the Run to mark it as complete once reporting is finished
Each of these steps can be done separately, but they can also be done conveniently with a single SDK function call, dbnl.report_run_with_results, which is the recommended approach.
The important parts of creating a run are providing identifying information — in the form of a name and metadata — and defining the structure of the data you'll be reporting to it. As mentioned in the previous section, this structure is called the Run Schema.
In older versions of dbnl, the job of the schema was done by something called the "Run Config". The Run Config has been fully deprecated, so you should update any code that still uses it.
A Run schema defines four aspects of the Run's structure:
Columns (the data each row in your results will contain)
Scalars (any Run-level data you want to report)
Index (which column or columns uniquely identify rows in your results)
Components (functional groups to organize the reported results in the form of a graph)
Columns are the only required part of a schema and are core to reporting Runs, as they define the shape your results will take. You report your column schema as a list of objects, which contain the following fields:
name: The name of the column
description: A descriptive blurb about what the column is
component: Which part of your application the column belongs to (see Components below)
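As a rough illustration, a column schema with these fields could be written as a list of Python dictionaries. This is a sketch rather than the SDK's confirmed input format: the column and component names are hypothetical, the type field and its value int come from the field description elsewhere on this page, and the uniqueness check is only a sanity check added for the example.

```python
# Hypothetical column schema for a question-answering app.
# Each entry describes one column of the row-level results.
column_schema = [
    {
        "name": "question_id",
        "type": "int",
        "description": "Identifier of the question that was asked",
        "component": "retriever",
    },
    {
        "name": "retrieved_doc_count",
        "type": "int",
        "description": "Number of documents returned by the retriever",
        "component": "retriever",
    },
    {
        "name": "response_length",
        "type": "int",
        "description": "Length of the generated answer in characters",
        "component": "generator",
    },
]


def column_names(schema):
    """Return the column names in order, raising if a name is repeated."""
    names = [column["name"] for column in schema]
    duplicates = sorted({name for name in names if names.count(name) > 1})
    if duplicates:
        raise ValueError(f"duplicate column names: {duplicates}")
    return names


print(column_names(column_schema))
```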
Using the index field within the schema, you can designate Unique Identifiers: specific columns that uniquely identify matching results between Runs. Adding this information facilitates more direct comparisons when testing your application's behavior and makes it easier to explore your data.
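To make the idea of matching results between Runs concrete, here is a minimal plain-Python sketch of what an index enables. It does not show how dbnl itself performs the matching; the helper names, column names, and data are all hypothetical.

```python
def index_key(row, index):
    """Build a hashable key from the index columns of one result row."""
    return tuple(row[column] for column in index)


def check_unique(rows, index):
    """Raise ValueError if two rows share the same index key."""
    seen = set()
    for row in rows:
        key = index_key(row, index)
        if key in seen:
            raise ValueError(f"index {index} is not unique: {key} appears twice")
        seen.add(key)


def match_rows(baseline_rows, candidate_rows, index):
    """Pair up rows from two runs that share an index key."""
    check_unique(baseline_rows, index)
    check_unique(candidate_rows, index)
    baseline_by_key = {index_key(row, index): row for row in baseline_rows}
    return [
        (baseline_by_key[index_key(row, index)], row)
        for row in candidate_rows
        if index_key(row, index) in baseline_by_key
    ]


# Hypothetical results from two runs, indexed by a single column.
index = ["question_id"]
baseline_run = [
    {"question_id": 1, "response_length": 120},
    {"question_id": 2, "response_length": 87},
]
candidate_run = [
    {"question_id": 2, "response_length": 95},
    {"question_id": 3, "response_length": 240},
]

pairs = match_rows(baseline_run, candidate_run, index)
for before, after in pairs:
    print(before["question_id"], before["response_length"], after["response_length"])
```

Only question 2 appears in both runs, so it is the only row that can be compared directly; without an index there would be no reliable way to line the two result sets up.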
Once you've defined the structure of your run, you can upload data to dbnl to report the results of that run. As mentioned above, there are two kinds of results from your run:
The row-level column results (these each represent the data of a single "usage" of your application)
The Run-level scalar results (these represent data that apply to all usages in your Run as a whole)
dbnl expects you to upload your results data in the form of a pandas DataFrame. Note that scalars can be uploaded as a single-row DataFrame or as a dictionary of values.
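The two kinds of results can be pictured with plain Python containers, as in the sketch below. The column and scalar names are hypothetical, and pandas is deliberately not imported: the dictionaries are in the shape that pandas.DataFrame accepts, so passing column_data to it would give a three-row DataFrame, and passing the output of scalars_as_single_row would give a single-row one.

```python
# Row-level results: one list per column, one position per usage of the app.
column_data = {
    "question_id": [1, 2, 3],
    "response_length": [120, 87, 240],
}

# Run-level results: one value per scalar, applying to the Run as a whole.
lengths = column_data["response_length"]
scalar_data = {
    "mean_response_length": sum(lengths) / len(lengths),
    "max_response_length": max(lengths),
}


def rows_are_aligned(data):
    """Check that every column holds the same number of rows."""
    return len({len(values) for values in data.values()}) <= 1


def scalars_as_single_row(scalars):
    """Reshape a dictionary of scalars so each scalar is a one-row column."""
    return {name: [value] for name, value in scalars.items()}


print(rows_are_aligned(column_data))
print(scalars_as_single_row(scalar_data))
```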
Now that you understand each step, you can integrate all of this into your codebase with a few simple SDK function calls, most conveniently through dbnl.report_run_with_results.
Column schema objects also contain a type field: the type of the column, e.g. int.
Scalars represent any data that live at the Run level; that is, they represent single data points that apply to your entire Run. For example, you may want to calculate a summary metric for the entirety of a result set for your model. The scalar schema is also a list of objects, with the same fields as the column schema above.
Components are defined within the components_dag field of the schema. This defines the topological structure of your app as a directed acyclic graph (DAG). Using this, you can tell dbnl which part of your application different columns correspond to, enabling a more granular understanding of your app's behavior.
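One way to picture a component graph is as a mapping from each component to the components it feeds into. The sketch below assumes that shape, which this page does not confirm as the exact format of components_dag, and uses hypothetical component names. It relies on Python's standard graphlib module (Python 3.9+) to check that the graph really is acyclic.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical component graph: each key lists the components it feeds into.
components_dag = {
    "retriever": ["generator"],
    "generator": ["formatter"],
    "formatter": [],
}


def execution_order(dag):
    """Return components so that each one precedes the components it feeds.

    Raises graphlib.CycleError if the graph contains a cycle.
    """
    # TopologicalSorter expects node -> predecessors, so invert the edges.
    predecessors = {node: set() for node in dag}
    for node, downstream in dag.items():
        for child in downstream:
            predecessors.setdefault(child, set()).add(node)
    return list(TopologicalSorter(predecessors).static_order())


def columns_by_component(column_schema):
    """Group column names by the component each column belongs to."""
    grouped = {}
    for column in column_schema:
        grouped.setdefault(column["component"], []).append(column["name"])
    return grouped


print(execution_order(components_dag))
print(columns_by_component([
    {"name": "retrieved_doc_count", "component": "retriever"},
    {"name": "response_length", "component": "generator"},
]))
```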
You can learn more about creating a Run schema in the SDK reference. A Run can also be created on its own as a separate step, but we recommend using dbnl.report_run_with_results.
dbnl can also supplement your results with more useful data; see the relevant section of the documentation for details.
There are separate SDK functions to upload column results and scalar results, but, again, we recommend using dbnl.report_run_with_results.
Once you're finished uploading results to dbnl for your Run, close the Run to mark it as ready to be used in Test Sessions. Note that reporting results to a Run overwrites any existing results, and once a Run is closed, no further results can be uploaded to it. To close a Run, you can use the SDK function for it, or close an open Run from its page in the UI.
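The two rules above (reporting overwrites, and a closed Run rejects uploads) can be modeled with a small in-memory stand-in. This is not the dbnl SDK or its implementation; SketchRun and RunClosedError are hypothetical names used only to show the lifecycle.

```python
class RunClosedError(RuntimeError):
    """Raised when results are reported to a run that is already closed."""


class SketchRun:
    """In-memory stand-in for a Run, used only to illustrate the lifecycle."""

    def __init__(self, name):
        self.name = name
        self.results = None
        self.closed = False

    def report_results(self, column_data):
        if self.closed:
            raise RunClosedError(f"run {self.name!r} is closed")
        # Reporting replaces earlier results instead of appending to them.
        self.results = column_data

    def close(self):
        self.closed = True


run = SketchRun("nightly-eval")
run.report_results({"question_id": [1, 2]})
run.report_results({"question_id": [3]})  # overwrites the first upload
run.close()

try:
    run.report_results({"question_id": [4]})
except RunClosedError as error:
    print(error)

print(run.results)
```

Because a later upload replaces an earlier one, it is safest to assemble the complete results first and close the Run only once they have been reported.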