A common strategy for evaluating unstructured text applications is to use other LLMs and text embedding models to drive metrics of interest.
Supported LLM and model services
The LLM-as-judge text metrics in dbnl.eval support OpenAI, Azure OpenAI, and any other third-party LLM / embedding model provider that is compatible with the OpenAI python client. Specifically, third-party endpoints should largely adhere to the OpenAI API schema, as in the example below.
TogetherAI (or other OpenAI-compatible services / endpoints)
import os

from openai import OpenAI

from dbnl.eval.llm import OpenAILLMClient

# Build a standard OpenAI client pointed at the TogetherAI endpoint
base_oai_client = OpenAI(
    api_key=os.environ["TOGETHERAI_API_KEY"],
    base_url="https://api.together.xyz/v1",
)

# Wrap the client for use with dbnl.eval's LLM-as-judge metrics
eval_llm_client = OpenAILLMClient.from_existing_client(
    base_oai_client, llm_model="meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo"
)
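Azure OpenAI endpoints can be wrapped in the same way. The following is a minimal sketch, assuming that OpenAILLMClient.from_existing_client also accepts an AzureOpenAI client; the resource URL and deployment name are hypothetical placeholders for your own values.

import os

from openai import AzureOpenAI

from dbnl.eval.llm import OpenAILLMClient

# Standard Azure OpenAI client; the endpoint and API version come from your Azure resource
azure_client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
    azure_endpoint="https://my-resource.openai.azure.com",  # hypothetical resource URL
)

# Assumes from_existing_client accepts the Azure client as well;
# llm_model should match your Azure deployment name
eval_llm_client = OpenAILLMClient.from_existing_client(
    azure_client, llm_model="gpt-4o"  # hypothetical deployment name
)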
Missing Metric Values
It is possible for some of the LLM-as-judge metrics to occasionally return values that cannot be parsed. These metric values will surface as None.
Distributional accepts dataframes that include None values; the platform will intelligently filter them when applicable.
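If you want to inspect or drop unparsable values yourself before uploading, a minimal pandas sketch is shown below; the metric column name is hypothetical.

import pandas as pd

# Hypothetical metric column with some unparsable (None) judge outputs
df = pd.DataFrame({"answer_quality_score": [4, None, 5, None, 3]})

# Count rows where the judge output could not be parsed
num_missing = df["answer_quality_score"].isna().sum()
print(f"{num_missing} rows with unparsable metric values")

# Optionally drop them before further local analysis
df_clean = df.dropna(subset=["answer_quality_score"])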
Throughput and Rate Limits
LLM service providers often impose request rate limits and token throughput caps. Some example errors that one might encounter are shown below:
{'code': '429', 'message': 'Requests to the Embeddings_Create Operation under
Azure OpenAI API version XXXX have exceeded call rate limit of your current
OpenAI pricing tier. Please retry after 86400 seconds.
Please go here: https://aka.ms/oai/quotaincrease if you would
like to further increase the default rate limit.'}
{'message': 'You have been rate limited. Your rate limit is YYY queries per
minute. Please navigate to https://www.together.ai/forms/rate-limit-increase
to request a rate limit increase.', 'type': 'credit_limit',
'param': None, 'code': None}
{'message': 'Rate limit reached for gpt-4 in organization XXXX on
tokens per min (TPM): Limit WWWWW, Used YYYY, Requested ZZZZ.
Please try again in 1.866s. Visit https://platform.openai.com/account/rate-limits
to learn more.', 'type': 'tokens', 'param': None, 'code': 'rate_limit_exceeded'}
If you encounter these errors, please work with your LLM service provider to adjust your limits. Additionally, feel free to reach out to Distributional support with the issue you are seeing.
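As a client-side stopgap, wrapping calls in retries with exponential backoff can smooth over transient 429 errors. The following is a minimal sketch using the third-party tenacity library together with the OpenAI python client; the wrapped call and model name are illustrative, and this does not reflect how dbnl.eval handles rate limits internally.

import os

from openai import OpenAI, RateLimitError
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Retry only on rate-limit errors, backing off exponentially for up to 5 attempts
@retry(
    retry=retry_if_exception_type(RateLimitError),
    wait=wait_exponential(multiplier=1, min=1, max=60),
    stop=stop_after_attempt(5),
)
def judged_completion(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",  # illustrative model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content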