
arviz.hdi(ary, hdi_prob=None, circular=False, multimodal=False, skipna=False, group='posterior', var_names=None, filter_vars=None, coords=None, max_modes=10, dask_kwargs=None, **kwargs)

Calculate highest density interval (HDI) of array for given probability.

The HDI is the minimum-width Bayesian credible interval (BCI).
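To make the definition concrete, here is a minimal NumPy sketch of a minimum-width interval for a unimodal sample: sort the draws, slide a window that spans the requested fraction of them, and keep the narrowest window. The helper name `narrowest_interval` is hypothetical, and the sketch illustrates the idea only; it is not presented as the implementation used by arviz.hdi.

```python
import numpy as np


def narrowest_interval(samples, prob):
    """Return (lower, upper) of the shortest interval spanning a fraction `prob` of the samples.

    Hypothetical helper for illustration; assumes a unimodal, non-circular sample.
    """
    x = np.sort(np.asarray(samples, dtype=float).ravel())
    n = x.size
    k = int(np.floor(prob * n))  # index offset between the two ends of each candidate interval
    if not 0 < k < n:
        raise ValueError("prob must leave at least one candidate interval")
    widths = x[k:] - x[: n - k]  # width of every candidate interval
    start = int(np.argmin(widths))  # the narrowest candidate wins
    return x[start], x[start + k]


# Right-skewed sample: the narrowest interval sits close to the mode near zero,
# whereas an equal-tailed interval built from quantiles is shifted to the right.
rng = np.random.default_rng(0)
draws = rng.exponential(size=5000)
lower, upper = narrowest_interval(draws, 0.9)
q05, q95 = np.quantile(draws, [0.05, 0.95])
print(f"narrowest:    [{lower:.3f}, {upper:.3f}]")
print(f"equal-tailed: [{q05:.3f}, {q95:.3f}]")
```

For a skewed distribution like this one, the minimum-width interval is shorter than the equal-tailed interval with the same probability.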

Parameters

ary: obj

Object containing posterior samples. Any object that can be converted to an arviz.InferenceData object. Refer to the documentation of arviz.convert_to_dataset() for details.

hdi_prob: float, optional

Probability for which the highest density interval will be computed. Defaults to the stats.ci_prob rcParam.

circular: bool, optional

Whether to compute the HDI treating the input as a circular variable (in the range [-np.pi, np.pi]). Defaults to False (i.e., non-circular variables). Only works if multimodal is False.

multimodal: bool, optional

If True, more than one HDI may be computed when the distribution is multimodal and the modes are well separated.

skipna: bool, optional

If True, NaN values are ignored when computing the HDI. Defaults to False.

group: str, optional

Specifies which InferenceData group should be used to calculate the HDI. Defaults to ‘posterior’.

var_names: list, optional

Names of variables to include in the hdi report. Prefix the variables by ~ when you want to exclude them from the report: ["~beta"] instead of ["beta"] (see arviz.summary() for more details).

filter_vars: {None, “like”, “regex”}, optional, default=None

If None (default), interpret var_names as the real variable names. If “like”, interpret var_names as substrings of the real variable names. If “regex”, interpret var_names as regular expressions on the real variable names. A la pandas.filter.

coords: mapping, optional

Specifies the subset over which to calculate the HDI.

max_modes: int, optional

Specifies the maximum number of modes for the multimodal case.

dask_kwargs: dict, optional

Dask related kwargs passed to wrap_xarray_ufunc().

kwargs: dict, optional

Additional keywords passed to wrap_xarray_ufunc().

Returns

np.ndarray or xarray.Dataset, depending upon input

Lower and upper value(s) of the interval(s).
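The effect of the circular option is easiest to see on angles whose mass straddles the ±π boundary. The sketch below uses synthetic data (an assumption, not taken from the examples on this page) and compares the interval computed with and without circular=True. For circular data the lower bound can be numerically larger than the upper bound, because the interval wraps around the boundary.

```python
import arviz as az
import numpy as np

rng = np.random.default_rng(0)
# Synthetic angles concentrated around pi and wrapped into [-pi, pi),
# so the samples pile up at both ends of the range.
angles = (rng.normal(loc=np.pi, scale=0.3, size=4000) + np.pi) % (2 * np.pi) - np.pi

linear = az.hdi(angles, hdi_prob=0.9)  # values treated as points on a line
circular = az.hdi(angles, hdi_prob=0.9, circular=True)  # values treated as angles

# Arc length from the lower to the upper bound, measured around the circle.
circular_width = (circular[1] - circular[0]) % (2 * np.pi)
linear_width = linear[1] - linear[0]
print("linear:  ", linear, "width", linear_width)
print("circular:", circular, "width", circular_width)
```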

See also


Plot highest density intervals for regression data.


Calculate quantiles of array for given probabilities.


Examples

Calculate the HDI of a Normal random variable:

In [1]: import arviz as az
   ...: import numpy as np
   ...: data = np.random.normal(size=2000)
   ...: az.hdi(data, hdi_prob=.68)
Out[1]: array([-1.04741511,  0.89983109])
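With array input, the multimodal option can return several intervals instead of one. The following sketch uses a synthetic two-component mixture (an assumption chosen for illustration). The number of intervals actually found depends on the density estimate, so no particular output is shown.

```python
import arviz as az
import numpy as np

rng = np.random.default_rng(0)
# Equal mixture of two well-separated normal components.
bimodal = np.concatenate([rng.normal(-4.0, 0.5, 2000), rng.normal(4.0, 0.5, 2000)])

single = az.hdi(bimodal, hdi_prob=0.8)  # one interval: [lower, upper]
modes = az.hdi(bimodal, hdi_prob=0.8, multimodal=True)  # one [lower, upper] row per interval
print(single)
print(modes)
```

The single interval has to bridge the empty region between the two components, while the multimodal result can report each high-density region separately.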

Calculate the HDI of a dataset:

In [2]: import arviz as az
   ...: data = az.load_arviz_data('centered_eight')
   ...: az.hdi(data)
<xarray.Dataset> Size: 720B
Dimensions:  (hdi: 2, school: 8)
  * school   (school) <U16 512B 'Choate' 'Deerfield' ... 'Mt. Hermon'
  * hdi      (hdi) <U6 48B 'lower' 'higher'
Data variables:
    mu       (hdi) float64 16B -1.623 10.69
    theta    (school, hdi) float64 128B -4.564 17.13 -4.311 ... -5.858 16.01
    tau      (hdi) float64 16B 0.8965 9.668
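As the output above shows, the returned Dataset has an hdi dimension with the labels 'lower' and 'higher'. A short sketch of selecting the bounds by label follows. It builds a small synthetic posterior with az.from_dict so that it is self-contained; the variable names and shapes are stand-ins, not the centered_eight data.

```python
import arviz as az
import numpy as np

rng = np.random.default_rng(0)
# Synthetic posterior with 4 chains and 500 draws (stand-in data).
idata = az.from_dict(
    posterior={
        "mu": rng.normal(size=(4, 500)),
        "theta": rng.normal(size=(4, 500, 8)),
    },
    dims={"theta": ["school"]},
)

intervals = az.hdi(idata, hdi_prob=0.9)

# Select each bound by its label on the "hdi" dimension.
mu_lower = float(intervals["mu"].sel(hdi="lower"))
mu_upper = float(intervals["mu"].sel(hdi="higher"))
theta_width = intervals["theta"].sel(hdi="higher") - intervals["theta"].sel(hdi="lower")
print(mu_lower, mu_upper)
print(theta_width.values)  # one interval width per school
```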

We can also calculate the HDI for only some of the variables in the dataset:

In [3]: az.hdi(data, var_names=["mu", "theta"])
<xarray.Dataset> Size: 704B
Dimensions:  (hdi: 2, school: 8)
  * school   (school) <U16 512B 'Choate' 'Deerfield' ... 'Mt. Hermon'
  * hdi      (hdi) <U6 48B 'lower' 'higher'
Data variables:
    mu       (hdi) float64 16B -1.623 10.69
    theta    (school, hdi) float64 128B -4.564 17.13 -4.311 ... -5.858 16.01
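The filter_vars argument changes how var_names is matched. The sketch below contrasts exact names, substring matching, regular expressions, and the ~ exclusion prefix on a small synthetic posterior (the data are stand-ins built with az.from_dict).

```python
import arviz as az
import numpy as np

rng = np.random.default_rng(0)
# Synthetic posterior with 4 chains and 500 draws (stand-in data).
idata = az.from_dict(
    posterior={
        "mu": rng.normal(size=(4, 500)),
        "theta": rng.normal(size=(4, 500, 8)),
        "tau": np.abs(rng.normal(size=(4, 500))),
    }
)

exact = az.hdi(idata, var_names=["mu", "theta"])  # exact names
like = az.hdi(idata, var_names=["th"], filter_vars="like")  # names containing "th"
regex = az.hdi(idata, var_names=["^t"], filter_vars="regex")  # names starting with "t"
excluded = az.hdi(idata, var_names=["~mu"])  # everything except "mu"

for label, result in [("exact", exact), ("like", like), ("regex", regex), ("excluded", excluded)]:
    print(label, sorted(result.data_vars))
```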

By default, hdi is calculated over the chain and draw dimensions. We can use the input_core_dims argument of wrap_xarray_ufunc() to change this. In this example we calculate the HDI also over the school dimension:

In [4]: az.hdi(data, var_names="theta", input_core_dims=[["chain", "draw", "school"]])
<xarray.Dataset> Size: 64B
Dimensions:  (hdi: 2)
  * hdi      (hdi) <U6 48B 'lower' 'higher'
Data variables:
    theta    (hdi) float64 16B -5.719 14.86

We can also calculate the hdi over a particular selection:

In [5]: az.hdi(data, coords={"chain": [0, 1, 3]}, input_core_dims=[["draw"]])
<xarray.Dataset> Size: 1kB
Dimensions:  (chain: 3, hdi: 2, school: 8)
  * chain    (chain) int64 24B 0 1 3
  * school   (school) <U16 512B 'Choate' 'Deerfield' ... 'Mt. Hermon'
  * hdi      (hdi) <U6 48B 'lower' 'higher'
Data variables:
    mu       (chain, hdi) float64 48B -1.604 10.63 -1.537 10.19 -1.55 10.44
    theta    (chain, school, hdi) float64 384B -5.334 14.37 ... -3.435 14.64
    tau      (chain, hdi) float64 48B 0.9215 8.453 0.8965 9.788 0.9217 9.291
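When the samples contain missing values, skipna=True drops them before the interval is computed. The sketch below uses synthetic draws with a few injected NaNs and checks that the result matches the interval computed on the manually cleaned array.

```python
import arviz as az
import numpy as np

rng = np.random.default_rng(0)
draws = rng.normal(size=2000)
draws[::100] = np.nan  # inject a few missing values

# NaNs are ignored when skipna=True.
kept = az.hdi(draws, hdi_prob=0.9, skipna=True)
# Reference: remove the NaNs by hand, then compute the interval.
reference = az.hdi(draws[~np.isnan(draws)], hdi_prob=0.9)
print(kept, reference)
```

With the default skipna=False, NaN values are not removed first, so the returned bounds should not be relied on for such input.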