Loading...
One general approach to test if two columns are similarly distributed is using a nonparametric statistic. DBNL offers two such statistics: scaled_ks_stat for testing ordinal distributions and scaled_chi2_stat for testing nominal distributions.
scaled_ks_stat
scaled_chi2_stat
{ "name": "discrepancy_of_text_coherence_score", "description": "Test the nonparametric discrepancy of the coherence score distributions", "statistic_name": "scaled_ks_stat", "statistic_params": {}, "assertion": { "name": "less_than_or_equal_to", "params": { "other": 0.25, }, }, "statistic_inputs": [ { "select_query_template": { "select": "{EXPERIMENT}.coherence_score" } }, { "select_query_template": { "select": "{BASELINE}.coherence_score" } }, ], }