arviz.hdi

arviz.hdi(ary, hdi_prob=None, circular=False, multimodal=False, skipna=False, group='posterior', var_names=None, filter_vars=None, coords=None, max_modes=10, dask_kwargs=None, **kwargs)

Calculate highest density interval (HDI) of array for given probability.

The HDI is the minimum width Bayesian credible interval (BCI).
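As an illustration of the "minimum width" idea, the following is a minimal NumPy sketch for a unimodal, non-circular sample. The helper name narrowest_interval is hypothetical and is not part of ArviZ; the sketch assumes a simple sorted-window search and ignores NaN handling, circular variables, and multimodality.

```python
import numpy as np


def narrowest_interval(samples, prob=0.94):
    """Hypothetical helper: narrowest sorted-sample window covering `prob`."""
    x = np.sort(np.asarray(samples, dtype=float).ravel())
    n = x.size
    # Each candidate window runs from x[i] to x[i + k].
    k = int(np.floor(prob * n))
    if k >= n:
        raise ValueError("too few samples for the requested probability")
    # Width of every candidate window, computed in one vectorized step.
    widths = x[k:] - x[: n - k]
    start = int(np.argmin(widths))
    return np.array([x[start], x[start + k]])


rng = np.random.default_rng(0)
interval = narrowest_interval(rng.normal(size=2000), prob=0.68)
```

For a standard Normal sample the result should land near the one-standard-deviation band around zero, although the exact values depend on the random draw.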

Parameters:

ary: obj: object containing posterior samples. Any object that can be converted to an arviz.InferenceData object. Refer to documentation of arviz.convert_to_dataset() for details.
hdi_prob: float, optional: Probability for which the highest density interval will be computed. Defaults to stats.hdi_prob rcParam.
circular: bool, optional: Whether to compute the HDI treating ary as a circular variable (in the range [-np.pi, np.pi]). Defaults to False (i.e. non-circular variables). Only works if multimodal is False.
multimodal: bool, optional: If True, more than one HDI may be computed when the distribution is multimodal and the modes are well separated. Defaults to False.
skipna: bool, optional: If True, ignores NaN values when computing the HDI. Defaults to False.
group: str, optional: Specifies which InferenceData group should be used to calculate the HDI. Defaults to ‘posterior’.
var_names: list, optional: Names of variables to include in the hdi report. Prefix the variables by ~ when you want to exclude them from the report: ["~beta"] instead of ["beta"] (see arviz.summary() for more details).
filter_vars: {None, “like”, “regex”}, optional, default=None: If None (default), interpret var_names as the real variables names. If “like”, interpret var_names as substrings of the real variables names. If “regex”, interpret var_names as regular expressions on the real variables names. A la pandas.filter.
coords: mapping, optional: Specifies the subset over which to calculate the HDI.
max_modes: int, optional: Specifies the maximum number of modes for multimodal case.
dask_kwargs: dict, optional: Dask related kwargs passed to wrap_xarray_ufunc().
kwargs: dict, optional: Additional keywords passed to wrap_xarray_ufunc().
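The multimodal and skipna parameters described above can be tried on plain NumPy arrays. The snippet below is a sketch that assumes the signature documented on this page; the synthetic data, seed, and variable names are made up for illustration.

```python
import arviz as az
import numpy as np

rng = np.random.default_rng(1)

# Synthetic bimodal sample: two well separated Normal components.
bimodal = np.concatenate(
    [rng.normal(-5.0, 1.0, size=2000), rng.normal(5.0, 1.0, size=2000)]
)
# With multimodal=True the result has one (lower, upper) row per detected mode.
modes = az.hdi(bimodal, hdi_prob=0.9, multimodal=True)

# Synthetic sample with a few missing values.
with_nans = rng.normal(size=1000)
with_nans[::100] = np.nan
# skipna=True drops the NaN entries before the interval is computed.
clean_interval = az.hdi(with_nans, hdi_prob=0.9, skipna=True)
```

Here modes is a two-dimensional array whose second axis holds the lower and upper bounds, while clean_interval is a single pair of bounds.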

Returns:

np.ndarray or xarray.Dataset, depending upon input: lower(s) and upper(s) values of the interval(s).

See also

plot_hdi: Plot highest density intervals for regression data.
xarray.Dataset.quantile: Calculate quantiles of array for given probabilities.
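To see how a quantile-based (equal-tailed) interval differs from a minimum-width interval, the NumPy sketch below compares the two on a skewed sample. It does not call ArviZ; the narrowest-window search is a simplified stand-in for an HDI computation, and the data and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)
# A right-skewed sample, where the two kinds of interval differ visibly.
samples = rng.exponential(scale=1.0, size=5000)
prob = 0.94

# Equal-tailed interval: cut the same probability mass from each tail.
tail = (1 - prob) / 2
eti = np.quantile(samples, [tail, 1 - tail])

# Narrowest window of sorted samples covering `prob` (simplified HDI stand-in).
x = np.sort(samples)
k = int(np.floor(prob * x.size))
widths = x[k:] - x[: x.size - k]
start = int(np.argmin(widths))
hdi_like = np.array([x[start], x[start + k]])

eti_width = eti[1] - eti[0]
hdi_width = hdi_like[1] - hdi_like[0]
```

For a skewed distribution such as this one, the narrowest window shifts towards the high-density region near zero and comes out shorter than the equal-tailed interval.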

Examples

Calculate the HDI of a Normal random variable:

In [1]: import arviz as az
   ...: import numpy as np
   ...: data = np.random.normal(size=2000)
   ...: az.hdi(data, hdi_prob=.68)
   ...: 
Out[1]: array([-0.93827494,  0.9796453 ])

Calculate the HDI of a dataset:

In [2]: import arviz as az
   ...: data = az.load_arviz_data('centered_eight')
   ...: az.hdi(data)
   ...: 
Out[2]: 
<xarray.Dataset> Size: 720B
Dimensions:  (hdi: 2, school: 8)
Coordinates:
  * school   (school) <U16 512B 'Choate' 'Deerfield' ... 'Mt. Hermon'
  * hdi      (hdi) <U6 48B 'lower' 'higher'
Data variables:
    mu       (hdi) float64 16B -1.623 10.69
    theta    (school, hdi) float64 128B -4.564 17.13 -4.311 ... -5.858 16.01
    tau      (hdi) float64 16B 0.8965 9.668

We can also calculate the HDI for a subset of the variables in the dataset:

In [3]: az.hdi(data, var_names=["mu", "theta"])
Out[3]: 
<xarray.Dataset> Size: 704B
Dimensions:  (hdi: 2, school: 8)
Coordinates:
  * school   (school) <U16 512B 'Choate' 'Deerfield' ... 'Mt. Hermon'
  * hdi      (hdi) <U6 48B 'lower' 'higher'
Data variables:
    mu       (hdi) float64 16B -1.623 10.69
    theta    (school, hdi) float64 128B -4.564 17.13 -4.311 ... -5.858 16.01

By default, hdi is calculated over the chain and draw dimensions. We can use the input_core_dims argument of wrap_xarray_ufunc() to change this. In this example we calculate the HDI also over the school dimension:

In [4]: az.hdi(data, var_names="theta", input_core_dims = [["chain","draw", "school"]])
Out[4]: 
<xarray.Dataset> Size: 64B
Dimensions:  (hdi: 2)
Coordinates:
  * hdi      (hdi) <U6 48B 'lower' 'higher'
Data variables:
    theta    (hdi) float64 16B -5.719 14.86

We can also calculate the hdi over a particular selection:

In [5]: az.hdi(data, coords={"chain":[0, 1, 3]}, input_core_dims = [["draw"]])
Out[5]: 
<xarray.Dataset> Size: 1kB
Dimensions:  (chain: 3, hdi: 2, school: 8)
Coordinates:
  * chain    (chain) int64 24B 0 1 3
  * school   (school) <U16 512B 'Choate' 'Deerfield' ... 'Mt. Hermon'
  * hdi      (hdi) <U6 48B 'lower' 'higher'
Data variables:
    mu       (chain, hdi) float64 48B -1.604 10.63 -1.537 10.19 -1.55 10.44
    theta    (chain, school, hdi) float64 384B -5.334 14.37 ... -3.435 14.64
    tau      (chain, hdi) float64 48B 0.9215 8.453 0.8965 9.788 0.9217 9.291
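The Dataset outputs above label the interval bounds with an hdi coordinate whose values are 'lower' and 'higher', so individual bounds can be picked out with standard xarray selection. The sketch below assumes the signature documented on this page and uses a made-up single-variable posterior passed as a dictionary of arrays shaped (chain, draw).

```python
import arviz as az
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical posterior: 4 chains x 500 draws of a scalar named "mu".
posterior = {"mu": rng.normal(loc=1.0, scale=2.0, size=(4, 500))}

intervals = az.hdi(posterior, hdi_prob=0.9)

# Select each bound by its label on the "hdi" coordinate.
lower = intervals["mu"].sel(hdi="lower").item()
upper = intervals["mu"].sel(hdi="higher").item()
width = upper - lower
```

Selecting by label rather than by position keeps downstream code readable and independent of the order of the hdi coordinate.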