Test That Columns Are Similarly Distributed
One general approach to test if two columns are similarly distributed is using a nonparametric statistic. DBNL offers two such statistics: scaled_ks_stat for testing ordinal distributions and scaled_chi2_stat for testing nominal distributions.
Example Test Spec
{
"name": "discrepancy_of_text_coherence_score",
"description": "Test the nonparametric discrepancy of the coherence score distributions",
"statistic_name": "scaled_ks_stat",
"statistic_params": {},
"assertion": {
"name": "less_than_or_equal_to",
"params": {
"other": 0.25,
},
},
"statistic_inputs": [
{
"select_query_template": {
"select": "{EXPERIMENT}.coherence_score"
}
},
{
"select_query_template": {
"select": "{BASELINE}.coherence_score"
}
},
],
}
Was this helpful?

