arviz.rhat
- arviz.rhat(data, *, var_names=None, method='rank', dask_kwargs=None)
Compute an estimate of the rank-normalized split R-hat for a set of traces.
The rank-normalized R-hat diagnostic tests for lack of convergence by comparing the variance between multiple chains to the variance within each chain. If convergence has been achieved, the between-chain and within-chain variances should be approximately equal. To be most effective in detecting evidence of nonconvergence, each chain should be initialized at starting values that are dispersed relative to the target distribution.
- Parameters:
- data : obj
  Any object that can be converted to an arviz.InferenceData object. Refer to the documentation of arviz.convert_to_dataset() for details. At least 2 posterior chains are needed to compute this diagnostic for one or more stochastic parameters. For an ndarray: shape = (chain, draw). For an n-dimensional ndarray, first transform it to a dataset with az.convert_to_dataset.
- var_names : list
  Names of variables to include in the rhat report.
- method : str
  Select R-hat method. Valid methods are:
  - "rank" (recommended by Vehtari et al. (2021))
  - "split"
  - "folded"
  - "z_scale"
  - "identity"
- dask_kwargs : dict, optional
  Dask related kwargs passed to wrap_xarray_ufunc().
- Returns:
xarray.Dataset
Dataset of the potential scale reduction factors, \(\hat{R}\).
See also
ess
Calculate estimate of the effective sample size (ess).
mcse
Calculate Markov Chain Standard Error statistic.
plot_forest
Forest plot to compare HDI intervals from a number of distributions.
Notes
The diagnostic is computed by:
\[\hat{R} = \sqrt{\frac{\hat{V}}{W}}\]
where \(W\) is the within-chain variance and \(\hat{V}\) is the posterior variance estimate for the pooled rank-traces. This is the potential scale reduction factor, which converges to unity when each of the traces is a sample from the target posterior. Values greater than one indicate that one or more chains have not yet converged.
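As a rough illustration of the formula above, the following sketch computes \(\hat{R}\) for plain (not split, not rank-normalized) chains using only the Python standard library. The function name `basic_rhat` is hypothetical, and the decomposition of \(\hat{V}\) into within-chain and between-chain terms is the textbook Gelman-Rubin form, assumed here for illustration. It is not a copy of ArviZ's implementation.

```python
import math
import random
import statistics


def basic_rhat(chains):
    """Potential scale reduction factor for equal-length chains.

    `chains` is a list of lists laid out as (chain, draw).
    Hypothetical helper for illustration only.
    """
    n_draws = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # B: between-chain variance, scaled by the number of draws per chain.
    between = n_draws * statistics.variance(chain_means)
    # W: average of the per-chain sample variances.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # V-hat: pooled variance estimate (assumed textbook form).
    v_hat = (n_draws - 1) / n_draws * within + between / n_draws
    return math.sqrt(v_hat / within)


rng = random.Random(0)
# Four chains drawn from the same distribution: R-hat should be close to 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(4)]
# Shift one chain away from the others: R-hat should rise well above 1.
stuck = [list(c) for c in mixed]
stuck[0] = [x + 3.0 for x in stuck[0]]

print(round(basic_rhat(mixed), 3))
print(round(basic_rhat(stuck), 3))
```

The shifted chain inflates the between-chain term while leaving the within-chain variance unchanged, which is what pushes the ratio above one.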
Rank values are calculated over all the chains with scipy.stats.rankdata. Each chain is split in two and normalized with the z-transform following Vehtari et al. (2021).
References
Vehtari et al. (2021). Rank-normalization, folding, and localization: An improved Rhat for assessing convergence of MCMC. Bayesian Analysis, 16(2):667-718.
Gelman et al. BDA3 (2013)
Brooks and Gelman (1998)
Gelman and Rubin (1992)
Examples
Calculate the R-hat using the default arguments:
In [1]: import arviz as az
   ...: data = az.load_arviz_data("non_centered_eight")
   ...: az.rhat(data)
   ...:
Out[1]:
<xarray.Dataset> Size: 656B
Dimensions:  (school: 8)
Coordinates:
  * school   (school) <U16 512B 'Choate' 'Deerfield' ... 'Mt. Hermon'
Data variables:
    mu       float64 8B 1.003
    theta_t  (school) float64 64B 1.0 1.001 0.9997 1.001 ... 1.004 0.9992 1.002
    tau      float64 8B 1.003
    theta    (school) float64 64B 1.003 0.9992 1.003 1.001 ... 1.002 1.001 1.003
Calculate the R-hat of some variables using the folded method:
In [2]: az.rhat(data, var_names=["mu", "theta_t"], method="folded")
Out[2]:
<xarray.Dataset> Size: 584B
Dimensions:  (school: 8)
Coordinates:
  * school   (school) <U16 512B 'Choate' 'Deerfield' ... 'Mt. Hermon'
Data variables:
    mu       float64 8B 0.9997
    theta_t  (school) float64 64B 1.0 1.001 0.9997 1.001 ... 1.004 0.9992 1.002
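The split and rank-normalization steps described in the Notes can also be sketched by hand. The block below is a minimal standard-library illustration, not ArviZ's code: all function names are hypothetical, the fractional offsets 3/8 and 1/4 in the z-transform are taken from Vehtari et al. (2021), and the folded variant that the paper combines with this quantity is omitted. It assumes equal-length chains with at least four draws each.

```python
import math
import random
import statistics
from statistics import NormalDist


def average_ranks(values):
    """Ranks starting at 1, with ties sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def split_chains(chains):
    """Split every chain into its first and last half."""
    half = len(chains[0]) // 2
    halves = []
    for chain in chains:
        halves.append(chain[:half])
        halves.append(chain[-half:])
    return halves


def rank_normalize(chains):
    """Rank all draws jointly, then map the ranks to normal scores."""
    flat = [x for chain in chains for x in chain]
    total = len(flat)
    normal = NormalDist()
    # Fractional offsets as in Vehtari et al. (2021); assumed here.
    z = [normal.inv_cdf((r - 0.375) / (total + 0.25)) for r in average_ranks(flat)]
    n = len(chains[0])
    return [z[i * n:(i + 1) * n] for i in range(len(chains))]


def basic_rhat(chains):
    """sqrt(V-hat / W) using the textbook variance decomposition."""
    n = len(chains[0])
    between = n * statistics.variance([statistics.fmean(c) for c in chains])
    within = statistics.fmean([statistics.variance(c) for c in chains])
    return math.sqrt(((n - 1) / n * within + between / n) / within)


def rank_split_rhat(chains):
    """Split, rank-normalize, then apply the basic R-hat formula."""
    return basic_rhat(rank_normalize(split_chains(chains)))


rng = random.Random(1)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(4)]
stuck = [list(c) for c in mixed]
stuck[0] = [x + 3.0 for x in stuck[0]]  # one chain sits in a different region

print(round(rank_split_rhat(mixed), 3))
print(round(rank_split_rhat(stuck), 3))
```

Because the ranks are computed over all chains jointly, a chain that sits in a different region receives systematically higher or lower normal scores, so the between-chain term still registers the disagreement even when the raw draws are heavy-tailed.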