arviz.ess
- arviz.ess(data, *, var_names=None, method='bulk', relative=False, prob=None, dask_kwargs=None)
Calculate an estimate of the effective sample size (ess).
- Parameters
- data : obj
Any object that can be converted to an arviz.InferenceData object. Refer to the documentation of arviz.convert_to_dataset() for details. For ndarray: shape = (chain, draw). For an n-dimensional ndarray, first transform it to a dataset with arviz.convert_to_dataset().
- var_names : str or list of str
Names of variables to include in the return value Dataset.
- method : str, optional, default “bulk”
Select ess method. Valid methods are:
“bulk”
“tail” # prob, optional
“quantile” # prob
“mean” (old ess)
“sd”
“median”
“mad” (mean absolute deviance)
“z_scale”
“folded”
“identity”
“local”
- relative : bool
Return relative ess
ress = ess / n
- prob : float, or tuple of two floats, optional
Probability value for the “tail”, “quantile” or “local” ess functions.
- dask_kwargs : dict, optional
Dask related kwargs passed to wrap_xarray_ufunc().
- Returns
- xarray.Dataset
Return the effective sample size, \(\hat{N}_{\mathit{eff}}\)
See also
arviz.rhat
Compute an estimate of the rank normalized split R-hat for a set of traces.
arviz.mcse
Calculate Markov Chain Standard Error statistic.
plot_ess
Plot quantile, local or evolution of effective sample sizes (ESS).
arviz.summary
Create a data frame with summary statistics.
Notes
The basic ess (\(N_{\mathit{eff}}\)) diagnostic is computed by:
\[\hat{N}_{\mathit{eff}} = \frac{MN}{\hat{\tau}}\]
\[\hat{\tau} = -1 + 2 \sum_{t'=0}^K \hat{P}_{t'}\]
where \(M\) is the number of chains, \(N\) the number of draws, \(\hat{\rho}_t\) is the estimated autocorrelation at lag \(t\), \(\hat{P}_{t'} = \hat{\rho}_{2t'} + \hat{\rho}_{2t'+1}\) is the sum of an adjacent pair of autocorrelations, and \(K\) is the last integer for which \(\hat{P}_{K} = \hat{\rho}_{2K} + \hat{\rho}_{2K+1}\) is still positive.
The current implementation is similar to Stan, which uses Geyer’s initial monotone sequence criterion (Geyer, 1992; Geyer, 2011).
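The formula above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the ArviZ implementation: it omits any rank normalization, chain splitting, or monotone-sequence adjustment that a production implementation may apply, so its numbers will generally differ from az.ess. The way autocovariances are combined across chains (\(\hat{\rho}_t = 1 - (W - \bar{c}_t)/\widehat{\mathrm{var}}^{+}\), with \(\bar{c}_t\) the chain-averaged autocovariance at lag \(t\)) is an assumption taken from the Stan-style estimator, and the names `autocovariance`, `basic_ess` and `ar1_chains` are hypothetical helpers introduced only for this example.

```python
import random


def autocovariance(chain, lag):
    """Biased (divide-by-N) autocovariance of a single chain at one lag."""
    n = len(chain)
    mean = sum(chain) / n
    return sum(
        (chain[i] - mean) * (chain[i + lag] - mean) for i in range(n - lag)
    ) / n


def basic_ess(chains):
    """Sketch of N_eff = M * N / tau_hat for M equal-length chains of N draws."""
    m, n = len(chains), len(chains[0])
    # Within-chain variance W and the pooled variance estimate var_plus.
    within = sum(autocovariance(c, 0) for c in chains) / m * n / (n - 1)
    var_plus = within * (n - 1) / n
    if m > 1:
        means = [sum(c) / n for c in chains]
        grand = sum(means) / m
        var_plus += sum((x - grand) ** 2 for x in means) / (m - 1)

    def rho_hat(t):
        # Assumed multi-chain autocorrelation estimate; lag 0 is 1 by definition.
        if t == 0:
            return 1.0
        mean_acov = sum(autocovariance(c, t) for c in chains) / m
        return 1.0 - (within - mean_acov) / var_plus

    tau_hat = -1.0
    k = 0
    while 2 * k + 1 < n:
        p_hat = rho_hat(2 * k) + rho_hat(2 * k + 1)
        if p_hat <= 0:  # stop after the last positive pair sum (index K)
            break
        tau_hat += 2.0 * p_hat
        k += 1
    return m * n / tau_hat


def ar1_chains(phi, m, n, seed):
    """Hypothetical helper: M AR(1) chains x_t = phi * x_{t-1} + noise."""
    rng = random.Random(seed)
    chains = []
    for _ in range(m):
        x, chain = 0.0, []
        for _ in range(n):
            x = phi * x + rng.gauss(0.0, 1.0)
            chain.append(x)
        chains.append(chain)
    return chains


ess_iid = basic_ess(ar1_chains(0.0, m=4, n=500, seed=0))
ess_ar1 = basic_ess(ar1_chains(0.9, m=4, n=500, seed=0))
print(f"independent draws: {ess_iid:.0f}, AR(1) with phi=0.9: {ess_ar1:.0f}")
```

With independent draws the estimate should sit near the total of \(MN = 2000\) draws, while the strongly autocorrelated chains should give a much smaller value. For a stationary AR(1) process the theoretical \(\tau\) is \((1 + \phi)/(1 - \phi)\), which is 19 for \(\phi = 0.9\).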
References
Vehtari et al. (2019) see https://arxiv.org/abs/1903.08008
https://mc-stan.org/docs/2_18/reference-manual/effective-sample-size-section.html Section 15.4.2
Gelman et al. BDA (2014) Formula 11.8
Examples
Calculate the effective_sample_size using the default arguments:
In [1]: import arviz as az
   ...: data = az.load_arviz_data('non_centered_eight')
   ...: az.ess(data)
   ...:
Out[1]:
<xarray.Dataset>
Dimensions:  (school: 8)
Coordinates:
  * school   (school) object 'Choate' 'Deerfield' ... "St. Paul's" 'Mt. Hermon'
Data variables:
    mu       float64 2.354e+03
    theta_t  (school) float64 2.215e+03 3.159e+03 ... 2.678e+03 2.522e+03
    tau      float64 1.268e+03
    theta    (school) float64 2.298e+03 2.434e+03 ... 2.174e+03 2.278e+03
Calculate the relative ess (ress) of some of the variables:
In [2]: az.ess(data, relative=True, var_names=["mu", "theta_t"])
Out[2]:
<xarray.Dataset>
Dimensions:  (school: 8)
Coordinates:
  * school   (school) object 'Choate' 'Deerfield' ... "St. Paul's" 'Mt. Hermon'
Data variables:
    mu       float64 1.177
    theta_t  (school) float64 1.107 1.579 1.463 1.258 1.156 1.277 1.339 1.261
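As a rough check of the relation ress = ess / n, the two outputs for mu can be compared. The default call reports an ess of about 2354 and the relative call reports 1.177, which implies

\[n = \frac{\mathit{ess}}{\mathit{ress}} \approx \frac{2354}{1.177} \approx 2000\]

This is consistent with \(n\) being the total number of draws across all chains (\(MN\) in the notation of the Notes). That reading is inferred from the printed, rounded values rather than stated explicitly on this page. A ress above 1 simply means the estimated ess exceeds the number of draws.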
Calculate the ess using the “tail” method, leaving the prob argument at its default value.
In [3]: az.ess(data, method="tail")
Out[3]:
<xarray.Dataset>
Dimensions:  (school: 8)
Coordinates:
  * school   (school) object 'Choate' 'Deerfield' ... "St. Paul's" 'Mt. Hermon'
Data variables:
    mu       float64 1.401e+03
    theta_t  (school) float64 1.45e+03 1.514e+03 ... 1.207e+03 1.589e+03
    tau      float64 900.0
    theta    (school) float64 1.445e+03 1.506e+03 ... 1.433e+03 1.418e+03