metrics_calibration module

This section is the Python API reference for the uncertainty_toolbox.metrics_calibration module, which provides calibration-based uncertainty metrics.

uncertainty_toolbox.metrics_calibration Module

Metrics for assessing the quality of predictive uncertainty quantification.

uncertainty_toolbox.metrics_calibration.adversarial_group_calibration(y_pred, y_std, y_true, cali_type, prop_type='interval', num_bins=100, num_group_bins=10, draw_with_replacement=False, num_trials=10, num_group_draws=10, verbose=False)

Adversarial group calibration.

Parameters

y_pred (ndarray) – 1D array of the predicted means for the held out dataset.
y_std (ndarray) – 1D array of the predicted standard deviations for the held out dataset.
y_true (ndarray) – 1D array of the true labels in the held out dataset.
cali_type (str) – type of calibration error to measure; one of [“mean_abs”, “root_mean_sq”].
prop_type (str) – “interval” to measure observed proportions for centered prediction intervals, and “quantile” for observed proportions below a predicted quantile.
num_bins (int) – number of discretizations for the probability space [0, 1].
num_group_bins (int) – number of discretizations for group size proportions between 0 and 1.
draw_with_replacement (bool) – True to sample subgroups from the dataset with replacement.
num_trials (int) – number of trials to estimate the worst calibration error per group size.
num_group_draws (int) – number of subgroups to draw per given group size to measure calibration error on.
verbose (bool) – True to print progress statements.

Return type

Namespace

Returns

A Namespace with an array of the group sizes, the mean of the worst calibration errors for each group size, and the standard error of the worst calibration errors for each group size.
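The procedure described by the parameters above can be sketched roughly as follows. This is an illustrative reimplementation using only the Python standard library, not the library's code: it assumes Gaussian predictive distributions, uses mean absolute calibration error on centered intervals, always draws without replacement, and the helper names (interval_mace, adversarial_group_calibration_sketch) and the grid of group sizes are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def interval_mace(y_pred, y_std, y_true, num_bins):
    """Mean absolute gap between nominal and observed interval coverage."""
    std_normal = NormalDist()
    gaps = []
    for i in range(num_bins):
        q = i / (num_bins - 1)  # nominal coverage level in [0, 1]
        z = math.inf if q >= 1.0 else std_normal.inv_cdf(0.5 + q / 2)
        inside = sum(abs(t - m) <= z * s for m, s, t in zip(y_pred, y_std, y_true))
        gaps.append(abs(q - inside / len(y_true)))
    return sum(gaps) / num_bins


def adversarial_group_calibration_sketch(
    y_pred, y_std, y_true, num_bins=20, num_group_bins=5,
    num_trials=4, num_group_draws=5, seed=0,
):
    rng = random.Random(seed)
    n = len(y_true)
    # Hypothetical grid of group-size proportions; the library's grid may differ.
    group_sizes = [(i + 1) / num_group_bins for i in range(num_group_bins)]
    score_mean, score_stderr = [], []
    for frac in group_sizes:
        k = max(1, round(frac * n))
        worst_per_trial = []
        for _ in range(num_trials):
            worst = 0.0
            for _ in range(num_group_draws):
                idx = rng.sample(range(n), k)  # subgroup drawn without replacement
                err = interval_mace(
                    [y_pred[j] for j in idx],
                    [y_std[j] for j in idx],
                    [y_true[j] for j in idx],
                    num_bins,
                )
                worst = max(worst, err)  # keep the worst error among the draws
            worst_per_trial.append(worst)
        score_mean.append(mean(worst_per_trial))
        score_stderr.append(stdev(worst_per_trial) / math.sqrt(num_trials))
    return group_sizes, score_mean, score_stderr


# Synthetic, well-specified data: standard-normal labels, N(0, 1) predictions.
data_rng = random.Random(1)
y_true = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
y_pred = [0.0] * 200
y_std = [1.0] * 200
sizes, worst_mean, worst_stderr = adversarial_group_calibration_sketch(
    y_pred, y_std, y_true
)
```

In this sketch the largest group is the whole dataset, so every draw gives the same error and the standard error for that group size is zero.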

uncertainty_toolbox.metrics_calibration.get_prediction_interval(y_pred, y_std, quantile, recal_model=None)

Return the centered prediction interval corresponding to a quantile.

For a specified quantile level q (must be a float, or a singleton), return the centered prediction interval bounded by the pair of quantiles at levels (0.5-q/2) and (0.5+q/2), i.e. the interval whose nominal coverage equals q.

Parameters

y_pred (ndarray) – 1D array of the predicted means for the held out dataset.
y_std (ndarray) – 1D array of the predicted standard deviations for the held out dataset.
quantile (ndarray) – The quantile level q, i.e. the nominal coverage of the centered interval (a float or a singleton).
recal_model (Optional[IsotonicRegression]) – A recalibration model to apply before computing the interval.

Return type

Namespace

Returns

Namespace containing the lower and upper bound corresponding to the centered interval.
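As a rough illustration of the quantile pair described above, the following sketch computes the bounds under the assumption that each prediction is Gaussian with mean y_pred and standard deviation y_std. It uses only the standard library, ignores recal_model, and the function name centered_interval_sketch is hypothetical rather than part of the module.

```python
from statistics import NormalDist


def centered_interval_sketch(y_pred, y_std, q):
    """Bounds of the centered interval with nominal coverage q (0 <= q < 1)."""
    std_normal = NormalDist()
    z_lo = std_normal.inv_cdf(0.5 - q / 2)  # standard-normal quantile at 0.5 - q/2
    z_hi = std_normal.inv_cdf(0.5 + q / 2)  # standard-normal quantile at 0.5 + q/2
    lower = [m + z_lo * s for m, s in zip(y_pred, y_std)]
    upper = [m + z_hi * s for m, s in zip(y_pred, y_std)]
    return lower, upper


lower, upper = centered_interval_sketch([0.0, 10.0], [1.0, 2.0], 0.95)
```

For q = 0.95 the bounds sit roughly 1.96 standard deviations on either side of each predicted mean.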

uncertainty_toolbox.metrics_calibration.get_proportion_in_interval(y_pred, y_std, y_true, quantile)

For a specified quantile, return the proportion of points falling into an interval corresponding to that quantile.

Parameters

y_pred (ndarray) – 1D array of the predicted means for the held out dataset.
y_std (ndarray) – 1D array of the predicted standard deviations for the held out dataset.
y_true (ndarray) – 1D array of the true labels in the held out dataset.
quantile (float) – a specified quantile level

Return type

float

Returns

A single scalar which is the proportion of the true labels falling into the prediction interval for the specified quantile.
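A minimal sketch of this proportion, assuming Gaussian predictions and a centered interval; the name proportion_in_interval_sketch is hypothetical and the code is standard-library Python rather than the module's own implementation.

```python
from statistics import NormalDist


def proportion_in_interval_sketch(y_pred, y_std, y_true, q):
    """Fraction of labels inside the centered interval with nominal coverage q."""
    z = NormalDist().inv_cdf(0.5 + q / 2)  # half-width in standard deviations
    inside = [abs(t - m) <= z * s for m, s, t in zip(y_pred, y_std, y_true)]
    return sum(inside) / len(inside)


y_pred = [0.0, 0.0, 0.0, 0.0]
y_std = [1.0, 1.0, 1.0, 1.0]
y_true = [0.1, -0.5, 1.5, 3.0]
prop_68 = proportion_in_interval_sketch(y_pred, y_std, y_true, 0.68)  # 0.5
prop_95 = proportion_in_interval_sketch(y_pred, y_std, y_true, 0.95)  # 0.75
```

A well-calibrated model would give observed proportions close to the nominal level q on a large held out dataset.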

uncertainty_toolbox.metrics_calibration.get_proportion_lists(y_pred, y_std, y_true, num_bins=100, recal_model=None, prop_type='interval')

Arrays of expected and observed proportions

Return arrays of expected and observed proportions of points falling into intervals corresponding to a range of quantiles. Computations here are not vectorized, which makes this function the safer choice when memory is constrained.

Parameters

y_pred (ndarray) – 1D array of the predicted means for the held out dataset.
y_std (ndarray) – 1D array of the predicted standard deviations for the held out dataset.
y_true (ndarray) – 1D array of the true labels in the held out dataset.
num_bins (int) – number of discretizations for the probability space [0, 1].
recal_model (Optional[IsotonicRegression]) – an sklearn isotonic regression model which recalibrates the predictions.
prop_type (str) – “interval” to measure observed proportions for centered prediction intervals, and “quantile” for observed proportions below a predicted quantile.

Return type

Tuple[ndarray, ndarray]

Returns

A tuple of two NumPy arrays: the expected proportions and the observed proportions.
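The non-vectorized computation can be pictured as a loop over evenly spaced levels in [0, 1]. The sketch below assumes Gaussian predictions, handles only the "interval" case, ignores recal_model, and returns plain lists instead of arrays; proportion_lists_sketch is a hypothetical name.

```python
import math
from statistics import NormalDist


def proportion_lists_sketch(y_pred, y_std, y_true, num_bins=100):
    """Expected and observed coverage over num_bins levels, one level at a time."""
    std_normal = NormalDist()
    expected, observed = [], []
    for i in range(num_bins):
        q = i / (num_bins - 1)  # evenly spaced levels from 0 to 1
        # At q = 1 the interval is unbounded, so every label counts as inside.
        z = math.inf if q >= 1.0 else std_normal.inv_cdf(0.5 + q / 2)
        inside = sum(abs(t - m) <= z * s for m, s, t in zip(y_pred, y_std, y_true))
        expected.append(q)
        observed.append(inside / len(y_true))
    return expected, observed


expected, observed = proportion_lists_sketch(
    [0.0] * 4, [1.0] * 4, [0.1, -0.5, 1.5, 3.0], num_bins=5
)
# expected == [0.0, 0.25, 0.5, 0.75, 1.0]
# observed == [0.0, 0.25, 0.5, 0.5, 1.0]
```

Only one level is processed at a time, so peak memory stays proportional to the dataset size rather than to the dataset size times num_bins.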

uncertainty_toolbox.metrics_calibration.get_proportion_lists_vectorized(y_pred, y_std, y_true, num_bins=100, recal_model=None, prop_type='interval')

Arrays of expected and observed proportions

Return the expected and observed proportions of points falling into intervals corresponding to a range of quantiles. Computations here are vectorized for faster execution, so this function is not suitable when memory is constrained.

Parameters

y_pred (ndarray) – 1D array of the predicted means for the held out dataset.
y_std (ndarray) – 1D array of the predicted standard deviations for the held out dataset.
y_true (ndarray) – 1D array of the true labels in the held out dataset.
num_bins (int) – number of discretizations for the probability space [0, 1].
recal_model (Optional[Any]) – an sklearn isotonic regression model which recalibrates the predictions.
prop_type (str) – “interval” to measure observed proportions for centered prediction intervals, and “quantile” for observed proportions below a predicted quantile.

Return type

Tuple[ndarray, ndarray]

Returns

A tuple of two NumPy arrays: the expected proportions and the observed proportions.

uncertainty_toolbox.metrics_calibration.get_proportion_under_quantile(y_pred, y_std, y_true, quantile)

Get the proportion of data that are below the predicted quantile.

Parameters

y_pred (ndarray) – 1D array of the predicted means for the held out dataset.
y_std (ndarray) – 1D array of the predicted standard deviations for the held out dataset.
y_true (ndarray) – 1D array of the true labels in the held out dataset.
quantile (float) – The quantile level to check.

Return type

float

Returns

The proportion of data below the quantile level.
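As a hedged sketch of this quantity, assuming Gaussian predictions: compute each point's predicted quantile and count the labels at or below it. The function name is hypothetical and the code uses only the standard library.

```python
from statistics import NormalDist


def proportion_under_quantile_sketch(y_pred, y_std, y_true, q):
    """Fraction of labels at or below the predicted q-quantile (0 < q < 1)."""
    z = NormalDist().inv_cdf(q)  # standard-normal quantile at level q
    bounds = [m + z * s for m, s in zip(y_pred, y_std)]
    below = sum(t <= b for t, b in zip(y_true, bounds))
    return below / len(y_true)


y_pred = [0.0, 0.0, 0.0, 0.0]
y_std = [1.0, 1.0, 1.0, 1.0]
y_true = [-2.0, -0.5, 0.5, 2.0]
under_50 = proportion_under_quantile_sketch(y_pred, y_std, y_true, 0.5)  # 0.5
under_90 = proportion_under_quantile_sketch(y_pred, y_std, y_true, 0.9)  # 0.75
```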

uncertainty_toolbox.metrics_calibration.get_quantile(y_pred, y_std, quantile, recal_model=None)

Return the value corresponding to a quantile.

For a specified quantile level q (must be a float, or a singleton), return the quantile prediction, i.e. the bound below which the nominal coverage equals q.

Parameters

y_pred (ndarray) – 1D array of the predicted means for the held out dataset.
y_std (ndarray) – 1D array of the predicted standard deviations for the held out dataset.
quantile (ndarray) – The quantile level to check.
recal_model (Optional[IsotonicRegression]) – A recalibration model to apply before computing the quantile.

Return type

float

Returns

The value at which the quantile is achieved.

uncertainty_toolbox.metrics_calibration.mean_absolute_calibration_error(y_pred, y_std, y_true, num_bins=100, vectorized=False, recal_model=None, prop_type='interval')

Mean absolute calibration error; identical to ECE.

Parameters

y_pred (ndarray) – 1D array of the predicted means for the held out dataset.
y_std (ndarray) – 1D array of the predicted standard deviations for the held out dataset.
y_true (ndarray) – 1D array of the true labels in the held out dataset.
num_bins (int) – number of discretizations for the probability space [0, 1].
vectorized (bool) – whether to vectorize the computation of observed proportions. Setting this to True is faster, but it requires much more memory and may fail to run on larger datasets.
recal_model (Optional[IsotonicRegression]) – an sklearn isotonic regression model which recalibrates the predictions.
prop_type (str) – “interval” to measure observed proportions for centered prediction intervals, and “quantile” for observed proportions below a predicted quantile.

Return type

float

Returns

A single scalar: the mean absolute calibration error.
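Given lists of expected and observed proportions, the metric reduces to an average of absolute gaps. A minimal sketch with a hypothetical helper name and made-up proportions:

```python
def mean_abs_calibration_error_sketch(expected, observed):
    """Average absolute gap between expected and observed proportions."""
    gaps = [abs(e - o) for e, o in zip(expected, observed)]
    return sum(gaps) / len(gaps)


expected = [0.0, 0.25, 0.5, 0.75, 1.0]
observed = [0.0, 0.25, 0.5, 0.5, 1.0]  # illustrative values only
mace = mean_abs_calibration_error_sketch(expected, observed)  # 0.25 / 5 = 0.05
```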

uncertainty_toolbox.metrics_calibration.miscalibration_area(y_pred, y_std, y_true, num_bins=100, vectorized=False, recal_model=None, prop_type='interval')

Miscalibration area.

This is identical to mean absolute calibration error and ECE, except that the integration here is taken by tracing the area between curves. As num_bins grows, miscalibration area and mean absolute calibration error converge to the same value.

Parameters

y_pred (ndarray) – 1D array of the predicted means for the held out dataset.
y_std (ndarray) – 1D array of the predicted standard deviations for the held out dataset.
y_true (ndarray) – 1D array of the true labels in the held out dataset.
num_bins (int) – number of discretizations for the probability space [0, 1].
vectorized (bool) – whether to vectorize the computation of observed proportions. Setting this to True is faster, but it requires much more memory and may fail to run on larger datasets.
recal_model (Optional[Any]) – an sklearn isotonic regression model which recalibrates the predictions.
prop_type (str) – “interval” to measure observed proportions for centered prediction intervals, and “quantile” for observed proportions below a predicted quantile.

Return type

float

Returns

A single scalar: the miscalibration area.
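The convergence claim can be checked numerically on a synthetic calibration curve. The sketch below approximates the area with a trapezoidal rule, which is a stand-in technique and not necessarily how the library traces the area; the curve observed = expected squared is invented for illustration, and both helper names are hypothetical.

```python
def mean_abs_gap(expected, observed):
    """Mean absolute gap, as in mean absolute calibration error."""
    return sum(abs(e - o) for e, o in zip(expected, observed)) / len(expected)


def trapezoid_area_between(expected, observed):
    """Trapezoidal estimate of the area between the curve and the diagonal."""
    gaps = [abs(e - o) for e, o in zip(expected, observed)]
    area = 0.0
    for i in range(1, len(expected)):
        width = expected[i] - expected[i - 1]
        area += width * (gaps[i] + gaps[i - 1]) / 2
    return area


num_bins = 1001
expected = [i / (num_bins - 1) for i in range(num_bins)]
observed = [e ** 2 for e in expected]  # synthetic miscalibrated curve
area = trapezoid_area_between(expected, observed)
mace = mean_abs_gap(expected, observed)
```

For this curve the exact area is 1/6, and with 1001 bins both estimates land close to that value.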

uncertainty_toolbox.metrics_calibration.root_mean_squared_calibration_error(y_pred, y_std, y_true, num_bins=100, vectorized=False, recal_model=None, prop_type='interval')

Root mean squared calibration error.

Parameters

y_pred (ndarray) – 1D array of the predicted means for the held out dataset.
y_std (ndarray) – 1D array of the predicted standard deviations for the held out dataset.
y_true (ndarray) – 1D array of the true labels in the held out dataset.
num_bins (int) – number of discretizations for the probability space [0, 1].
vectorized (bool) – whether to vectorize the computation of observed proportions. Setting this to True is faster, but it requires much more memory and may fail to run on larger datasets.
recal_model (Optional[IsotonicRegression]) – an sklearn isotonic regression model which recalibrates the predictions.
prop_type (str) – “interval” to measure observed proportions for centered prediction intervals, and “quantile” for observed proportions below a predicted quantile.

Return type

float

Returns

A single scalar: the root mean squared calibration error.
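Given lists of expected and observed proportions, this metric squares the gaps before averaging, so large deviations weigh more heavily. A minimal sketch with a hypothetical helper name and made-up proportions:

```python
import math


def root_mean_squared_calibration_error_sketch(expected, observed):
    """Square root of the mean squared gap between the two proportion lists."""
    squared = [(e - o) ** 2 for e, o in zip(expected, observed)]
    return math.sqrt(sum(squared) / len(squared))


expected = [0.0, 0.25, 0.5, 0.75, 1.0]
observed = [0.0, 0.25, 0.5, 0.5, 1.0]  # illustrative values only
rmsce = root_mean_squared_calibration_error_sketch(expected, observed)
```

Here the single gap of 0.25 gives sqrt(0.0625 / 5), roughly 0.112.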

uncertainty_toolbox.metrics_calibration.sharpness(y_std)

Return sharpness (a single measure of the overall confidence).

Parameters

y_std (ndarray) – 1D array of the predicted standard deviations for the held out dataset.

Return type

float

Returns

A single scalar which quantifies the average of the standard deviations.