Working with InferenceData

Here we present a collection of common manipulations you can use while working with InferenceData.

import arviz as az
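The snippets on this page operate on an InferenceData object called idata, which is not created above. A minimal setup sketch follows, assuming the eight-schools example that ships with ArviZ under the name centered_eight. That choice is an assumption based on the school names shown later on this page, not something the page states.

```python
import arviz as az

# Assumption: the page works with ArviZ's bundled eight-schools example.
idata = az.load_arviz_data("centered_eight")

# List the groups stored in the object (posterior, observed_data, ...).
print(idata.groups())
```

Any other InferenceData with a school coordinate would work the same way in the sections that follow.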

Obtain a NumPy array for a given parameter

Let’s say we want to get the values for mu as a NumPy array.

stacked.mu.values

array([-3.47698606, -2.45587061, -2.82625433, ...,  4.59705819,
        5.89850592,  0.16138927])
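The stacked object used above is not defined in this section. One way to obtain such an object is sketched here, under the assumption that it comes from az.extract, which by default takes the posterior group and combines chain and draw into a single sample dimension. The dataset name is likewise an assumption.

```python
import arviz as az

idata = az.load_arviz_data("centered_eight")  # assumed example dataset

# Combine chain and draw into one "sample" dimension (posterior group by default).
stacked = az.extract(idata)

mu_values = stacked.mu.values  # plain 1-D NumPy array
n_chains = idata.posterior.sizes["chain"]
n_draws = idata.posterior.sizes["draw"]

# The stacked array holds one value per (chain, draw) pair.
print(mu_values.shape, n_chains * n_draws)
```

Stacking first is convenient when downstream code expects a flat array of samples rather than a (chain, draw) grid.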

Get the number of variables

Let’s check how many groups (here, schools) are in our hierarchical model.

len(idata.observed_data.school)

Get the variables’ names

What are the names of the groups (schools) in our hierarchical model?

idata.observed_data.school.values

array(['Choate', 'Deerfield', 'Phillips Andover', 'Phillips Exeter',
       'Hotchkiss', 'Lawrenceville', "St. Paul's", 'Mt. Hermon'],
      dtype=object)

Remove the first n draws (burn-in)

Let’s say we want to remove the first 100 draws from every chain, in every InferenceData group that has a draw dimension.

burnin = idata.sel(draw=slice(100, None))

If you inspect the burnin object, you will see that the posterior, posterior_predictive, prior and sample_stats groups now have 400 draws, compared with 500 in idata. The observed_data group is unaffected because it has no draw dimension. Alternatively, you can restrict the selection to one or more groups with the groups argument.

burnin_posterior = idata.sel(draw=slice(100, None), groups="posterior")
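To confirm which groups were trimmed, one can compare the draw counts before and after the selection. A small sketch, assuming the bundled centered_eight dataset stands in for idata:

```python
import arviz as az

idata = az.load_arviz_data("centered_eight")  # assumed example dataset

burnin = idata.sel(draw=slice(100, None))
burnin_posterior = idata.sel(draw=slice(100, None), groups="posterior")

# Report draws per group; groups without a draw dimension are skipped.
for group in idata.groups():
    before = getattr(idata, group).sizes.get("draw")
    if before is None:
        continue
    after_all = getattr(burnin, group).sizes["draw"]
    after_post = getattr(burnin_posterior, group).sizes["draw"]
    print(f"{group}: {before} -> {after_all} (all groups), {after_post} (posterior only)")
```

In the second selection only the posterior group should lose draws; every other group keeps its original length.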

Compute posterior mean values along draw and chain dimensions

If you want to compute the mean value of the posterior samples, you can simply do the following:

idata.posterior.mean()

With no arguments, the mean is computed over all dimensions. That is probably what you want for mu and tau, which have only two dimensions (chain and draw), but perhaps not for theta, which has an additional school dimension. You can specify the dimension or dimensions along which to compute the mean (or any other reduction).
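For instance, averaging over only the sampling dimensions keeps the school dimension of theta. The sketch below uses the dim argument of xarray's mean and assumes the bundled centered_eight dataset stands in for idata.

```python
import arviz as az

idata = az.load_arviz_data("centered_eight")  # assumed example dataset

# Average over chain and draw only; any other dimension is preserved.
post_mean = idata.posterior.mean(dim=["chain", "draw"])

print(post_mean["mu"].dims)     # no dimensions left: a scalar
print(post_mean["theta"].dims)  # only the school dimension remains
```

The same dim argument is accepted by other xarray reductions such as median or std.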
