Conversation

@gbeane commented Oct 17, 2025

Overview: Probability Calibration via CalibratedClassifierCV

What it is
CalibratedClassifierCV wraps any classifier (RF/GBT/XGBoost, etc.) and learns a mapping from raw model scores to well-calibrated probabilities. It does this with an internal cross-validation loop: in each fold, it fits the base model on the train-fold data and learns a calibration function on the fold’s held-out data (a minimal sketch follows the list below). Two calibration methods are supported:

  • isotonic (non-parametric, flexible; best with enough data)
  • sigmoid (Platt scaling; smoother, works better on smaller data)
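
A minimal sketch of the wrapping pattern with scikit-learn (the classifier, data, and parameter values here are placeholders, not JABS code):

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy data standing in for JABS features/labels.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Each of the 3 internal folds fits the forest on its train split and
# fits an isotonic calibration map on its held-out split.
base = RandomForestClassifier(n_estimators=100, random_state=0)
calibrated = CalibratedClassifierCV(base, method="isotonic", cv=3)
calibrated.fit(X, y)

# predict_proba averages the calibrated probabilities across folds.
probs = calibrated.predict_proba(X)[:, 1]
```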

Why calibration matters
Tree models often output overconfident probabilities (lots of ~0.0 or ~1.0). Calibration fixes this so that predicted probabilities reflect reality (e.g., among samples with p≈0.7, ~70% are positive; the sketch after this list shows one way to check). Better calibration improves:

  • Thresholding: decision thresholds correspond to true event rates.
  • Loss-based metrics: log-loss / Brier score reflect actual probability quality.
  • User trust: fewer “certain-but-wrong” predictions in JABS.
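
One way to verify this empirically is a reliability curve plus a proper scoring rule; a short sketch, reusing y and probs from the example above:

```python
from sklearn.calibration import calibration_curve
from sklearn.metrics import brier_score_loss, log_loss

# For a well-calibrated model, prob_true ≈ prob_pred in each bin:
# among samples with p ≈ 0.7, roughly 70% should be positive.
prob_true, prob_pred = calibration_curve(y, probs, n_bins=10)

print("Brier score:", brier_score_loss(y, probs))  # lower is better
print("log loss:", log_loss(y, probs))             # lower is better
```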

How we use it in JABS

  • Added optional settings, which are saved in the project.json file: calibrate_probabilities: bool, calibration_method: "isotonic"|"sigmoid", calibration_cv: int.
  • During training (including LOGO cross-validation), probability calibration is fit separately inside each fold. The calibrator is trained only on that fold’s training data and never sees the validation data, which prevents data leakage and keeps validation metrics honest.
  • For feature importance, when calibration is enabled we aggregate importances across the calibrated folds’ base estimators (see the sketch after this list).
  • UI: a JABS Settings dialog lets users toggle calibration and choose the method/CV. The settings are persisted in the JABS project.json file.
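
A hedged sketch of the fold-wise pattern described above; this is illustrative rather than the actual JABS implementation, groups stands in for JABS identities/videos, and the calibrated_classifiers_ access assumes scikit-learn >= 1.2 (where each entry exposes .estimator):

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut

# Stand-in group labels (e.g., one group per video), reusing X and y
# from the first sketch.
groups = np.repeat(np.arange(5), 200)

logo = LeaveOneGroupOut()
for train_idx, val_idx in logo.split(X, y, groups):
    # Calibration is fit entirely inside this fold's training split;
    # the held-out group is never seen by the calibrator.
    # calibration_method and calibration_cv from project.json would
    # map onto the method and cv arguments here.
    clf = CalibratedClassifierCV(
        RandomForestClassifier(n_estimators=100, random_state=0),
        method="isotonic",
        cv=3,
    )
    clf.fit(X[train_idx], y[train_idx])

    # Honest validation metrics on the held-out group.
    val_probs = clf.predict_proba(X[val_idx])[:, 1]

    # Feature importances: average over the base estimators fitted in
    # the internal calibration folds.
    importances = np.mean(
        [c.estimator.feature_importances_ for c in clf.calibrated_classifiers_],
        axis=0,
    )
```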

Practical guidance

  • Reasonable defaults: calibrate_probabilities=True, method="isotonic", calibration_cv=3. (However, this PR does not change current behavior, so calibrate_probabilities defaults to False.)
  • Use "sigmoid" if folds are small; isotonic needs more data.
  • Avoid very high calibration_cv; 3–5 folds is typically enough.
  • Always calibrate during validation if you’ll deploy a calibrated final model, so reported metrics reflect the deployed classifier.

Trade-offs

  • Extra compute: the base model is fit once per internal calibration fold.
  • Slight variance increase; mitigated by sensible CV (3–5).
  • For extremely imbalanced or tiny folds, prefer "sigmoid" or reduce CV.

Net impact for JABS

  • More honest probabilities -> cleaner threshold selection for behaviors and improved search for low-confidence predictions.
  • Better UX and trust in “confidence” displays.
  • Fewer brittle 0/1 outputs; improved stability across datasets and sessions.

See Also

User Guide

I'm going to hold off on updating the user guide until I'm sure we're going to merge these changes.

Settings Dialog Screenshots

(two screenshots of the JABS Settings dialog)

@gbeane marked this pull request as draft October 17, 2025 21:42