merton.backtest.metrics
Backtest metrics for PD models.
The metrics here are implemented directly, using the Wilcoxon-Mann-Whitney and
related rank-based identities where applicable, so there is no sklearn dependency.
Where sklearn.metrics provides an equivalent, outputs match it to machine
precision in the binary-classification case.
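Functions
auc(predictions, defaults) | Area under the ROC curve.
accuracy_ratio(predictions, defaults) | Accuracy ratio (Gini coefficient).
brier(predictions, defaults) | Brier score (mean squared error of probabilistic predictions).
ks_statistic(predictions, defaults) | Kolmogorov-Smirnov statistic: max gap between the two empirical CDFs.
hosmer_lemeshow(predictions, defaults, *, bins=10) | Hosmer-Lemeshow goodness-of-fit test; returns (chi², dof).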
Module Contents
- merton.backtest.metrics.auc(predictions: merton._typing.ArrayLike, defaults: merton._typing.ArrayLike) → float
Area under the ROC curve.
Uses the Wilcoxon-Mann-Whitney identity: AUC equals the probability that a randomly chosen defaulter has a higher predicted PD than a randomly chosen non-defaulter.
\[\mathrm{AUC} = \frac{R_+ - n_+(n_+ + 1)/2}{n_+ \cdot n_-}\]
where \(R_+\) is the sum of ranks of the positive class, \(n_+\) is the number of positives, and \(n_-\) is the number of negatives.
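The rank-sum formula above can be sketched in plain Python. This is an illustration only, not the module's implementation: the names `average_ranks` and `auc_sketch` are hypothetical, and the sketch assumes `defaults` holds 0/1 flags, ranks are taken in ascending order of predicted PD, and tied predictions share the average of their ranks.

```python
def average_ranks(values):
    """Return 1-based ascending ranks; tied values get the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def auc_sketch(predictions, defaults):
    """AUC = (R_+ - n_+(n_+ + 1)/2) / (n_+ * n_-)."""
    ranks = average_ranks(list(predictions))
    n_pos = sum(1 for d in defaults if d == 1)
    n_neg = len(defaults) - n_pos
    r_pos = sum(r for r, d in zip(ranks, defaults) if d == 1)
    return (r_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy data (made up for illustration): predicted PDs and default flags.
pds = [0.02, 0.10, 0.35, 0.40, 0.80]
flags = [0, 0, 1, 0, 1]
auc_value = auc_sketch(pds, flags)  # defaulters hold ranks 3 and 5, so R_+ = 8
```

Here \(R_+ = 8\), \(n_+ = 2\), and \(n_- = 3\), so the sketch gives (8 − 3) / 6 = 5/6. That agrees with counting pairs directly: 5 of the 6 defaulter/non-defaulter pairs are ordered correctly.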
- merton.backtest.metrics.accuracy_ratio(predictions: merton._typing.ArrayLike, defaults: merton._typing.ArrayLike) → float
Accuracy ratio (Gini coefficient).
AR = 2 · AUC − 1.
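One way to read this identity is that AR is the share of concordant defaulter/non-defaulter pairs minus the share of discordant pairs. The sketch below checks that on toy data using brute-force pair counting; the function names are hypothetical, `defaults` is assumed to hold 0/1 flags, and the O(n₊ · n₋) loop is for illustration rather than production use.

```python
def auc_pairs(predictions, defaults):
    """AUC by direct pair counting; ties between classes count one half."""
    pos = [p for p, d in zip(predictions, defaults) if d == 1]
    neg = [p for p, d in zip(predictions, defaults) if d == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def accuracy_ratio_pairs(predictions, defaults):
    """(concordant - discordant) / (n_+ * n_-), which equals 2 * AUC - 1."""
    pos = [p for p, d in zip(predictions, defaults) if d == 1]
    neg = [p for p, d in zip(predictions, defaults) if d == 0]
    concordant = sum(1 for p in pos for q in neg if p > q)
    discordant = sum(1 for p in pos for q in neg if p < q)
    return (concordant - discordant) / (len(pos) * len(neg))


# Toy data (made up for illustration).
pds = [0.02, 0.10, 0.35, 0.40, 0.80]
flags = [0, 0, 1, 0, 1]
ar_value = accuracy_ratio_pairs(pds, flags)     # (5 - 1) / 6
ar_from_auc = 2 * auc_pairs(pds, flags) - 1     # 2 * (5/6) - 1
```

Both routes give 2/3 on this data. A perfect ranking yields AR = 1, and a ranking no better than chance yields AR = 0.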
- merton.backtest.metrics.brier(predictions: merton._typing.ArrayLike, defaults: merton._typing.ArrayLike) → float
Brier score (mean squared error of probabilistic predictions).
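As a minimal sketch of the definition, assuming `predictions` are probabilities in [0, 1] and `defaults` are 0/1 outcomes (the name `brier_sketch` is hypothetical and not part of the module):

```python
def brier_sketch(predictions, defaults):
    """Mean of (predicted PD - observed outcome)^2 over all observations."""
    n = len(predictions)
    return sum((p - d) ** 2 for p, d in zip(predictions, defaults)) / n


# Toy data (made up for illustration).
pds = [0.02, 0.10, 0.35, 0.40, 0.80]
flags = [0, 0, 1, 0, 1]
brier_value = brier_sketch(pds, flags)
# Squared errors: 0.0004, 0.01, 0.4225, 0.16, 0.04 (sum 0.6329)
```

The score is 0.6329 / 5 = 0.12658 here. Lower is better: 0 means every prediction matched its outcome exactly.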
- merton.backtest.metrics.ks_statistic(predictions: merton._typing.ArrayLike, defaults: merton._typing.ArrayLike) → float
Kolmogorov-Smirnov statistic — max gap between the two empirical CDFs.
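The sketch below illustrates the idea under one assumption that the docstring does not spell out: that the two empirical CDFs are those of the predicted PDs among defaulters and among non-defaulters. The name `ks_sketch` is hypothetical.

```python
from bisect import bisect_right


def ks_sketch(predictions, defaults):
    """Largest absolute gap between the two classes' empirical score CDFs."""
    pos = sorted(p for p, d in zip(predictions, defaults) if d == 1)
    neg = sorted(p for p, d in zip(predictions, defaults) if d == 0)
    gap = 0.0
    # The empirical CDFs only change at observed scores, so checking
    # each distinct score is enough to find the maximum gap.
    for t in sorted(set(predictions)):
        cdf_pos = bisect_right(pos, t) / len(pos)  # share of defaulters <= t
        cdf_neg = bisect_right(neg, t) / len(neg)  # share of non-defaulters <= t
        gap = max(gap, abs(cdf_pos - cdf_neg))
    return gap


# Toy data (made up for illustration).
pds = [0.02, 0.10, 0.35, 0.40, 0.80]
flags = [0, 0, 1, 0, 1]
ks_value = ks_sketch(pds, flags)
```

On this data the largest gap occurs at a score of 0.10, where two of the three non-defaulters but no defaulters lie at or below the threshold, giving 2/3.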
- merton.backtest.metrics.hosmer_lemeshow(predictions: merton._typing.ArrayLike, defaults: merton._typing.ArrayLike, *, bins: int = 10) → tuple[float, float]
Hosmer-Lemeshow goodness-of-fit test: returns the χ² statistic and its degrees of freedom.
Returns
(chi_squared, degrees_of_freedom). P-values come from scipy.stats.chi2.sf(chi2, dof).
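For orientation, here is a sketch of the textbook Hosmer-Lemeshow statistic. It is not taken from the module's source, and several details are assumptions: observations are sorted by predicted PD and split into `bins` roughly equal-count groups, the degrees of freedom are `bins - 2`, and `hosmer_lemeshow_sketch` is a hypothetical name. The sketch does not guard against groups whose mean predicted PD is exactly 0 or 1.

```python
def hosmer_lemeshow_sketch(predictions, defaults, bins=10):
    """Return (chi2, dof) for the textbook Hosmer-Lemeshow test."""
    pairs = sorted(zip(predictions, defaults))  # ascending predicted PD
    n = len(pairs)
    chi2 = 0.0
    for g in range(bins):
        # Roughly equal-count groups by position in the sorted order.
        group = pairs[g * n // bins:(g + 1) * n // bins]
        if not group:
            continue
        n_g = len(group)
        observed = sum(d for _, d in group)       # observed defaults
        mean_pd = sum(p for p, _ in group) / n_g  # average predicted PD
        expected = n_g * mean_pd                  # expected defaults
        chi2 += (observed - expected) ** 2 / (expected * (1 - mean_pd))
    return chi2, float(bins - 2)  # dof = bins - 2 is an assumption


# Toy data (made up for illustration), three groups of two observations.
pds = [0.1, 0.1, 0.5, 0.5, 0.9, 0.9]
flags = [0, 0, 1, 0, 1, 1]
chi2_value, dof_value = hosmer_lemeshow_sketch(pds, flags, bins=3)
```

The lowest and highest groups each contribute 0.04 / 0.18 and the middle group contributes 0, so the statistic is 4/9 with one degree of freedom. As the Returns note says, a p-value would then come from `scipy.stats.chi2.sf(chi2_value, dof_value)`.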