# Quick Start

To use `dbnl.eval`, install the extra `eval` package as described in [these instructions](/v0.24.x/install-sdk.md#installing-distributional).

1. Create a client to power LLM-as-judge text metrics \[optional].
2. Generate a list of metrics suitable for comparing text\_A to reference text\_B.
3. Use `dbnl.eval` to compute the list of metrics.
4. Publish the augmented dataframe and new metric quantities to DBNL.

```python
import dbnl
import os
import pandas as pd
from openai import OpenAI
from dbnl.eval.llm import OpenAILLMClient
from dbnl.eval import evaluate

# 1. create client to power LLM-as-judge metrics
base_oai_client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
oai_client = OpenAILLMClient.from_existing_client(base_oai_client, llm_model="gpt-3.5-turbo-0125")

# small example dataset: 3 prediction / ground_truth pairs, repeated 4 times
eval_df = pd.DataFrame(
    [
        {
            "prediction": "France has no capital",
            "ground_truth": "The capital of France is Paris",
        },
        {
            "prediction": "The capital of France is Toronto",
            "ground_truth": "The capital of France is Paris",
        },
        {
            "prediction": "Paris is the capital",
            "ground_truth": "The capital of France is Paris",
        },
    ]
    * 4
)

# 2. get text metrics that use target (ground_truth) and LLM-as-judge metrics
text_metrics = dbnl.eval.metrics.text_metrics(
    prediction="prediction", target="ground_truth", eval_llm_client=oai_client
)
# 3. compute the metrics for every row; returns the dataframe with one added column per metric
aug_eval_df = evaluate(eval_df, text_metrics)

# 4. publish to DBNL
dbnl.login(api_token=os.environ["DBNL_API_TOKEN"])
project = dbnl.get_or_create_project(name="DEAL_testing")
cols = dbnl.util.get_column_schemas_from_dataframe(aug_eval_df)
run_schema = dbnl.create_run_schema(columns=cols)
run = dbnl.create_run(project=project, run_schema=run_schema)
dbnl.report_results(run=run, column_data=aug_eval_df)
dbnl.close_run(run=run)
```

You can inspect a subset of the `aug_eval_df` rows together with, for example, one of the columns created by the metrics in the `text_metrics` list, `llm_text_similarity_v0`:

<table><thead><tr><th width="71">idx</th><th width="225">prediction</th><th width="251">ground_truth</th><th>llm_text_similarity_v0__prediction__ground_truth</th></tr></thead><tbody><tr><td>0</td><td>France has no capital</td><td>The capital of France is Paris</td><td>1</td></tr><tr><td>1</td><td>The capital of France is Toronto</td><td>The capital of France is Paris</td><td>1</td></tr><tr><td>2</td><td>Paris is the capital</td><td>The capital of France is Paris</td><td>5</td></tr></tbody></table>

The values of `llm_text_similarity_v0` qualitatively match our expectations of semantic similarity between the prediction and the ground\_truth: the two incorrect predictions score 1, and the correct paraphrase scores 5.
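One way to produce a view like the table above is to select the two input columns and the metric column. The following is a minimal sketch: it builds a small stand-in dataframe from the values in the table rather than calling the LLM, so the name `aug_eval_df` here is only a placeholder for the real result of `evaluate()`.

```python
import pandas as pd

# Column created by the llm_text_similarity_v0 metric (name taken from the table above).
metric_col = "llm_text_similarity_v0__prediction__ground_truth"

# Stand-in for the real aug_eval_df; values are copied from the table above.
aug_eval_df = pd.DataFrame(
    {
        "prediction": [
            "France has no capital",
            "The capital of France is Toronto",
            "Paris is the capital",
        ],
        "ground_truth": ["The capital of France is Paris"] * 3,
        metric_col: [1, 1, 5],
    }
)

# Keep only the inputs and the metric of interest, then look at the first rows.
subset = aug_eval_df[["prediction", "ground_truth", metric_col]].head(3)
print(subset)
```

With the real `aug_eval_df`, the same column selection applies; only the stand-in construction would be dropped.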

The call to [`evaluate()`](/v0.24.x/reference/python-sdk/dbnl.eval.md#evaluate) takes a dataframe and a list of metrics as input and returns a dataframe with extra columns. Each new column holds the value of one metric computed for that row.

```python
def evaluate(df: pd.DataFrame, metrics: Sequence[Metric], inplace: bool = False) -> pd.DataFrame:
    """
    Evaluates a set of metrics on a dataframe, returning an augmented dataframe.

    :param df: input dataframe
    :param metrics: metrics to compute
    :param inplace: whether to modify the input dataframe in place
    :return: input dataframe augmented with metrics
    """
```
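Because `evaluate()` returns the input dataframe plus extra columns, a quick way to see what was added is to compare column names before and after the call. The helper below is a hypothetical convenience function, not part of `dbnl`; it works on plain sequences of names so it can be run without the SDK.

```python
from typing import Iterable, List


def added_columns(before: Iterable[str], after: Iterable[str]) -> List[str]:
    """Return the names in `after` that are not in `before`, preserving order.

    Hypothetical helper for illustration; not part of the dbnl SDK.
    """
    seen = set(before)
    return [name for name in after if name not in seen]


# Example with the column names used on this page.
input_cols = ["prediction", "ground_truth"]
output_cols = input_cols + ["llm_text_similarity_v0__prediction__ground_truth"]

new_cols = added_columns(input_cols, output_cols)
print(new_cols)
```

With real dataframes, the call would look like `added_columns(eval_df.columns, aug_eval_df.columns)`, assuming `inplace` was left at its default of `False` so that `eval_df` still has only its original columns.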

The column names of the metrics in the returned dataframe include the metric name and the columns that were used in that metric's computation.

For example, the metric named `llm_text_similarity_v0` becomes `llm_text_similarity_v0__prediction__ground_truth` because it takes as input both the column named `prediction` and the column named `ground_truth`.
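The naming pattern in this example can be sketched as a join on a double underscore. The functions below are hypothetical and only mirror the pattern shown on this page; they are not the SDK's implementation, and the exact rule `dbnl` uses may differ for other metrics.

```python
from typing import List


def metric_column_name(metric_name: str, *input_columns: str) -> str:
    """Build a column name as <metric>__<col1>__<col2>... (assumed pattern)."""
    return "__".join([metric_name, *input_columns])


def split_metric_column_name(column_name: str) -> List[str]:
    """Split a column name back into [metric, col1, col2, ...].

    Assumes neither the metric name nor the input column names
    contain a double underscore themselves.
    """
    return column_name.split("__")


name = metric_column_name("llm_text_similarity_v0", "prediction", "ground_truth")
print(name)
print(split_metric_column_name(name))
```

This can be handy when looking up a metric column programmatically instead of typing the full name by hand.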

