Functions

abs

Returns the absolute value of the input.

Syntax

abs(expr)
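
Example

The sketch below assumes expressions can be nested; the column names mirror the illustrative ones used elsewhere on this page.

# Magnitude of the length gap between output and reference
abs(subtract(word_count(output), word_count(expected_output)))
# Returns: 12 whether the output is 12 words longer or 12 words shorter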

add

Adds the two inputs.

Syntax

add(expr1, expr2)
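
Example

For example, mirroring the token_count example later on this page:

# Total size of an interaction
add(token_count(input), token_count(output))
# Returns: 350 for an input of 150 tokens and an output of 200 tokens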

and

Logical and operation of two or more boolean columns.

Syntax

and(expr1, expr2)
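
Example

An illustrative combination of two boolean expressions from this page:

# Flag responses that are long and not valid JSON
and(word_count(output) > 100, not(is_valid_json(output)))
# Returns: true only when both conditions are true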

automated_readability_index

Returns the ARI (Automated Readability Index), a number that approximates the U.S. grade level needed to comprehend the text. For example, an ARI of 6.5 means the text is readable at roughly a 6th- to 7th-grade level.

Syntax

automated_readability_index(expr)

Example

# Measure reading difficulty of AI output
automated_readability_index(output)
# Returns: 10.2 (10th-11th grade level)

# Flag overly complex responses
automated_readability_index(output) > 14
# Returns: true for college-level complexity

Scale:

  • 1-5: Elementary school

  • 6-8: Middle school

  • 9-12: High school

  • 13+: College level

Use Case: Ensure content matches target audience reading level, maintain consistent simplicity across responses.

bleu

Computes the BLEU score between two columns.

Syntax

bleu(expr1, expr2)

Example

# Measure similarity between expected and actual output
bleu(expected_output, output)
# Returns: 0.78 (0.0 = no match, 1.0 = perfect match)

# Detect low-quality responses in RAG systems
bleu(retrieved_context, output) < 0.3
# Returns: true when AI deviates significantly from source material

Scale:

  • 0.0: No n-gram overlap

  • 0.3-0.5: Moderate similarity

  • 0.7+: High similarity

  • 1.0: Perfect match

Use Case: Evaluate translation quality, measure RAG faithfulness, compare against golden responses.

character_count

Returns the number of characters in a text column.

Syntax

character_count(expr)

Example

# Count characters in AI output
character_count(output)
# Returns: 1250 (for each row)

# Detect truncated responses
character_count(output) < 10
# Returns: true for very short outputs

Use Case: Monitor response length, detect truncation issues, track verbosity over time.

Aliases

  • num_chars

divide

Divides the two inputs.

Syntax

divide(expr1, expr2)
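
Example

A possible verbosity ratio, assuming the first argument is divided by the second:

# Output length relative to input length (argument order is assumed: expr1 / expr2)
divide(token_count(output), token_count(input))
# Returns: 2.0 when the output is twice as long as the input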

equal_to

Computes the element-wise equality comparison of two columns.

Syntax

equal_to(expr1, expr2)

Aliases

  • eq
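
Example

For example, using the expected_output column from the bleu example:

# Exact-match check against a golden response
equal_to(output, expected_output)
# Returns: true where the two values are identical, false otherwise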

filter

Filters a column using another column as a mask.

Syntax

filter(expr1, expr2)
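
Example

A possible sketch, assuming the first argument is the column to filter and the second is the boolean mask:

# Keep only outputs that parse as JSON (argument order is assumed)
filter(output, is_valid_json(output))
# Returns: the output column restricted to rows where the mask is true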

flesch_kincaid_grade

Returns the Flesch-Kincaid Grade Level of the given text. The score maps to a U.S. grade level: a score of 9.3 means a ninth grader would be able to read the document.

Syntax

flesch_kincaid_grade(expr)

Example

# Measure complexity of AI responses
flesch_kincaid_grade(output)
# Returns: 8.5 (8th-9th grade reading level)

# Flag complex responses
flesch_kincaid_grade(output) > 12
# Returns: true for college-level text

Scale:

  • 1-5: Elementary school

  • 6-8: Middle school

  • 9-12: High school

  • 13+: College level

Use Case: Ensure content matches target audience reading level, identify overly complex or simple responses.

greater_than

Computes the element-wise greater-than comparison of two columns: expr1 > expr2.

Syntax

greater_than(expr1, expr2)

Aliases

  • gt
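
Example

For example:

# Flag responses longer than the reference
greater_than(word_count(output), word_count(expected_output))
# Returns: true where the first value exceeds the second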

greater_than_or_equal_to

Computes the element-wise greater-than-or-equal-to comparison of two columns: expr1 >= expr2.

Syntax

greater_than_or_equal_to(expr1, expr2)

Aliases

  • gte
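
Example

For example, assuming numeric literals are accepted as arguments:

# Require at least three sentences per response
greater_than_or_equal_to(sentence_count(output), 3)
# Returns: true where the output has three or more sentences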

is_valid_json

Returns true if the input string is valid JSON.

Syntax

is_valid_json(expr)

Example

# Check if AI output is valid JSON
is_valid_json(output)
# Returns: true for '{"name": "John"}', false for 'not json'

# Track structured output compliance rate
is_valid_json(output)
# Use as a metric to monitor format adherence

Use Case: Monitor structured output quality, validate API responses, track JSON formatting compliance for tool-calling models.

less_than

Computes the element-wise less-than comparison of two columns: expr1 < expr2.

Syntax

less_than(expr1, expr2)

Aliases

  • lt
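
Example

For example, mirroring the levenshtein example on this page:

# Detect outputs that nearly copy the input
less_than(levenshtein(input, output), 5)
# Returns: true where fewer than 5 edits separate the two strings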

less_than_or_equal_to

Computes the element-wise less-than-or-equal-to comparison of two columns: expr1 <= expr2.

Syntax

less_than_or_equal_to(expr1, expr2)

Aliases

  • lte
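
Example

A possible length cap, assuming numeric literals are accepted as arguments:

# Cap responses at 1000 tokens
less_than_or_equal_to(token_count(output), 1000)
# Returns: true where the output is 1000 tokens or fewer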

levenshtein

Returns the Damerau-Levenshtein distance between two strings.

Syntax

levenshtein(expr1, expr2)

Example

# Calculate edit distance between input and output
levenshtein(input, output)
# Returns: 15 (number of character edits needed)

# Detect if AI is parroting the input
levenshtein(input, output) < 5
# Returns: true if strings are very similar

Details:

  • Range: 0 (identical strings) to max(len(string1), len(string2))

  • Lower values = more similar strings

  • Counts insertions, deletions, substitutions, and transpositions

Use Case: Detect typos, measure string similarity, identify when AI copies user input.

list_has_duplicate

Returns true if the list contains duplicate items.

Syntax

list_has_duplicate(expr)
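
Example

For example, using the illustrative tools_list column from the list_len example:

# Detect repeated tool calls within a single interaction
list_has_duplicate(tools_list)
# Returns: true for ["search", "search", "memory"], false for ["search", "memory"]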

list_len

Returns the length of lists in a list column.

Syntax

list_len(expr)

Example

# Count number of tools returned by AI
list_len(tools_list)
# Returns: 3 (if tools_list contains ["calculator", "search", "memory"])

# Detect empty tool usage
list_len(tools_list) == 0
# Returns: true for interactions with no tools used

Use Case: Monitor tool usage patterns, track multi-step reasoning complexity, detect missing tool calls.

list_most_common

Returns the most common item in each list of a list column.

Syntax

list_most_common(expr)
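
Example

For example, again using the illustrative tools_list column:

# Find the tool called most often in each interaction
list_most_common(tools_list)
# Returns: "search" for ["search", "search", "memory"]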

multiply

Multiplies the two inputs.

Syntax

multiply(expr1, expr2)
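
Example

A possible rescaling, assuming numeric literals can be passed directly:

# Express a 0-1 BLEU score as a percentage
multiply(bleu(expected_output, output), 100)
# Returns: 78.0 for a BLEU score of 0.78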

negate

Returns the negation of the input.

Syntax

negate(expr)
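
Example

A sketch assuming a hypothetical numeric column named score_delta:

# Flip the sign of a numeric column (score_delta is a hypothetical column name)
negate(score_delta)
# Returns: -3.5 for 3.5, and 3.5 for -3.5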

not

Logical not operation of a boolean column.

Syntax

not(expr)
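
Example

For instance:

# Flag outputs that fail to parse as JSON
not(is_valid_json(output))
# Returns: true where is_valid_json(output) is false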

not_equal_to

Computes the element-wise inequality comparison of two columns.

Syntax

not_equal_to(expr1, expr2)

Aliases

  • neq
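
Example

An illustrative mismatch check:

# Detect any deviation from the golden response
not_equal_to(output, expected_output)
# Returns: true where the two values differ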

or

Logical or operation of two or more boolean columns.

Syntax

or(expr1, expr2)
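
Example

For example, combining two checks used elsewhere on this page:

# Flag responses that are either near-empty or malformed
or(character_count(output) < 10, not(is_valid_json(output)))
# Returns: true when at least one condition is true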

rouge1

Returns the ROUGE-1 (unigram overlap) score between two columns.

Syntax

rouge1(expr1, expr2)
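
Example

A possible usage, mirroring the argument order of the bleu example:

# Unigram overlap between the reference and the AI output
rouge1(expected_output, output)
# Returns: a score from 0.0 (no shared words) to 1.0 (identical unigrams)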

rouge2

Returns the ROUGE-2 (bigram overlap) score between two columns.

Syntax

rouge2(expr1, expr2)
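
Example

As with rouge1, the argument order below mirrors the bleu example:

# Bigram overlap between the reference and the AI output
rouge2(expected_output, output)
# Returns: a score from 0.0 to 1.0; stricter than rouge1 because pairs of consecutive words must match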

rougeL

Returns the ROUGE-L (longest common subsequence) score between two columns.

Syntax

rougeL(expr1, expr2)
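
Example

Again mirroring the bleu example's argument order:

# Longest-common-subsequence similarity between reference and output
rougeL(expected_output, output)
# Returns: a score from 0.0 to 1.0 based on the longest shared word sequence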

rougeLsum

Returns the ROUGE-Lsum (sentence-level longest common subsequence) score between two columns.

Syntax

rougeLsum(expr1, expr2)
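
Example

As above, with the reference passed first:

# Sentence-level longest-common-subsequence similarity
rougeLsum(expected_output, output)
# Returns: a score from 0.0 to 1.0, computed over sentence splits rather than the whole text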

sentence_count

Returns the number of sentences in a text column.

Syntax

sentence_count(expr)

Example

# Count sentences in AI response
sentence_count(output)
# Returns: 5 (for a 5-sentence response)

# Detect overly brief or verbose responses
sentence_count(output) < 2
# Returns: true for single-sentence outputs

Use Case: Monitor response structure, detect overly terse or verbose outputs, track communication style consistency.

Aliases

  • num_sentences

subtract

Subtracts the two inputs.

Syntax

subtract(expr1, expr2)
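
Example

For example, with the order of the arguments assumed to be expr1 minus expr2:

# Length difference between output and reference (argument order is assumed)
subtract(word_count(output), word_count(expected_output))
# Returns: 10 for a 55-word output and a 45-word reference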

token_count

Returns the number of tokens in a text column.

Syntax

token_count(expr)

Example

# Count tokens in AI input and output
token_count(input) + token_count(output)
# Returns: 350 (total tokens for the interaction)

# Monitor token usage for cost tracking
token_count(output) > 1000
# Returns: true for responses over 1000 tokens

Use Case: Estimate API costs, monitor token consumption, detect context window issues, track input/output efficiency.

word_count

Returns the number of words in a text column.

Syntax

word_count(expr)

Aliases

  • num_words

Example

# Count words in the output column
word_count(output)
# Returns: 45 (for each row)

# Create a metric for long responses
word_count(output) > 100
# Returns: true/false for each row

Use Case: Track response length, identify verbose or terse outputs, flag unusually long prompts.
