arviz.loo
- arviz.loo(data, pointwise=None, var_name=None, reff=None, scale=None)
Compute Pareto-smoothed importance sampling leave-one-out cross-validation (PSIS-LOO-CV).
Estimates the expected log pointwise predictive density (elpd) using PSIS-LOO-CV. Also calculates the standard error of the LOO estimate and the effective number of parameters. For the theory, see https://arxiv.org/abs/1507.04544 and https://arxiv.org/abs/1507.02646
- Parameters:
- data: obj
Any object that can be converted to an arviz.InferenceData object. Refer to the documentation of arviz.convert_to_dataset() for details.
- pointwise: bool, optional
If True, the pointwise predictive accuracy will be returned. Defaults to the stats.ic_pointwise rcParam.
- var_name: str, optional
The name of the variable in log_likelihood groups storing the pointwise log likelihood data to use for loo computation.
- reff: float, optional
Relative MCMC efficiency, ess / n, i.e. the number of effective samples divided by the number of actual samples. Computed from trace by default.
- scale: str
Output scale for loo. Available options are:
log: (default) log-score
negative_log: -1 * log-score
deviance: -2 * log-score
A higher log-score (or a lower deviance or negative log-score) indicates a model with better predictive accuracy.
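The three output scales differ only by a constant factor, so a value on one scale can be converted to another by hand. The sketch below illustrates this with a hypothetical helper, rescale_log_score (an assumption for illustration, not part of ArviZ), applied to the elpd_loo estimate of -30.78 shown in the Examples on this page.

```python
# Hypothetical helper (not part of ArviZ): express a log-score on another scale.
SCALE_FACTORS = {"log": 1.0, "negative_log": -1.0, "deviance": -2.0}


def rescale_log_score(log_score, scale="log"):
    """Return log_score multiplied by the factor of the requested scale."""
    if scale not in SCALE_FACTORS:
        raise ValueError(f"scale must be one of {sorted(SCALE_FACTORS)}")
    return SCALE_FACTORS[scale] * log_score


# elpd_loo on the log scale, taken from the example output on this page.
elpd_log = -30.78
for name in SCALE_FACTORS:
    print(name, rescale_log_score(elpd_log, name))
```

Because the factors for negative_log and deviance are negative, the ordering of models flips: the best model has the highest value on the log scale and the lowest on the other two.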
- Returns:
ELPDData object (inherits from pandas.Series) with the following row/attributes:
- elpd_loo: approximated expected log pointwise predictive density (elpd)
- se: standard error of the elpd
- p_loo: effective number of parameters
- n_samples: number of samples
- n_data_points: number of data points
- warning: bool
True if the estimated shape parameter of the Pareto distribution is greater than good_k.
- loo_i: DataArray with the pointwise predictive accuracy, only if pointwise=True
- pareto_k: array of Pareto shape values, only if pointwise=True
- scale: scale of the elpd
- good_k: For a sample size S, the threshold is computed as min(1 - 1/log10(S), 0.7)
The returned object has a custom print method that overrides the pd.Series method.
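The good_k rule, min(1 - 1/log10(S), 0.7), can be evaluated directly to see how the threshold depends on the number of posterior samples. The function name good_k_threshold below is an assumption for illustration and is not an ArviZ API; the sketch only restates the documented formula.

```python
import math


def good_k_threshold(n_samples):
    """Sketch of the documented rule: min(1 - 1/log10(S), 0.7)."""
    if n_samples <= 1:
        # log10(1) is 0, so the formula is undefined at or below one sample.
        raise ValueError("n_samples must be greater than 1")
    return min(1 - 1 / math.log10(n_samples), 0.7)


for s in (100, 1000, 2000, 10000):
    print(s, round(good_k_threshold(s), 3))
```

With 2000 posterior samples the rule gives a value just under 0.7, while for S = 10000 the cap of 0.7 applies.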
See also
compare
Compare models based on PSIS-LOO (loo) or WAIC (waic) cross-validation.
waic
Compute the widely applicable information criterion.
plot_compare
Summary plot for model comparison.
plot_elpd
Plot pointwise elpd differences between two or more models.
plot_khat
Plot Pareto tail indices for diagnosing convergence.
Examples
Calculate LOO of a model:
In [1]: import arviz as az
   ...: data = az.load_arviz_data("centered_eight")
   ...: az.loo(data)
   ...: 
Out[1]: 
Computed from 2000 posterior samples and 8 observations log-likelihood matrix.

         Estimate       SE
elpd_loo   -30.78     1.35
p_loo        0.95        -
------

Pareto k diagnostic values:
                         Count   Pct.
(-Inf, 0.70]   (good)        8  100.0%
   (0.70, 1]   (bad)         0    0.0%
   (1, Inf)   (very bad)    0    0.0%
Calculate LOO of a model and return the pointwise values:
In [2]: data_loo = az.loo(data, pointwise=True)
   ...: data_loo.loo_i
   ...: 
Out[2]: 
<xarray.DataArray 'loo_i' (school: 8)> Size: 64B
array([-4.8918424 , -3.41965169, -3.86732498, -3.46497133, -3.47794644,
       -3.49926442, -4.20043549, -3.959389  ])
Coordinates:
  * school   (school) <U16 512B 'Choate' 'Deerfield' ... 'Mt. Hermon'