arviz.plot_ecdf
- arviz.plot_ecdf(values, values2=None, eval_points=None, cdf=None, difference=False, confidence_bands=False, ci_prob=None, num_trials=500, rvs=None, random_state=None, figsize=None, fill_band=True, plot_kwargs=None, fill_kwargs=None, plot_outline_kwargs=None, ax=None, show=None, backend=None, backend_kwargs=None, npoints=100, pointwise=False, fpr=None, pit=False, **kwargs)
Plot ECDF or ECDF-Difference Plot with Confidence bands.
Plots the empirical cumulative distribution function (ECDF) of an array. Optionally, a cdf argument representing a reference CDF may be provided for comparison using a difference ECDF plot and/or confidence bands.
Alternatively, the probability integral transform (PIT) for a single dataset may be visualized.
- Parameters:
- values : array_like
Values to plot from an unknown continuous or discrete distribution.
- values2 : array_like, optional
Values to compare to the original sample.
Deprecated since version 0.18.0: Instead use cdf=scipy.stats.ecdf(values2).cdf.evaluate.
- cdf : callable, optional
Cumulative distribution function of the distribution to compare the original sample against. The function must take as input a numpy array of draws from the distribution.
- difference : bool, default False
If True, plot an ECDF-difference plot; otherwise plot an ECDF plot.
- confidence_bands : str or bool
False: No confidence bands are plotted (default).
True: Plot bands computed with the default algorithm (subject to change).
“pointwise”: Compute the pointwise (i.e. marginal) confidence band.
“optimized”: Use optimization to estimate a simultaneous confidence band.
“simulated”: Use Monte Carlo simulation to estimate a simultaneous confidence band.
For simultaneous confidence bands to be correctly calibrated, provide eval_points that are not dependent on the values.
- ci_prob : float, default 0.94
The probability that the true ECDF lies within the confidence band. If confidence_bands is “pointwise”, this is the marginal probability instead of the joint probability.
- eval_points : array_like, optional
The points at which to evaluate the ECDF. If None, npoints uniformly spaced points between the data bounds will be used.
- rvs : callable, optional
A function that takes an integer ndraws and optionally the object passed to random_state and returns an array of ndraws samples from the same distribution as the original dataset. Required if method is “simulated” and the variable is discrete.
- random_state : int, numpy.random.Generator or numpy.random.RandomState, optional
- num_trials : int, default 500
The number of random ECDFs to generate for constructing simultaneous confidence bands (if confidence_bands is “simulated”).
- figsize : (float, float), optional
Figure size. If None it will be defined automatically.
- fill_band : bool, default True
If True, fill the area between the band limits to mark the inside of the confidence interval. Otherwise, plot only the border lines.
- plot_kwargs : dict, optional
Additional kwargs passed to matplotlib.pyplot.step() or bokeh.plotting.figure.step()
- fill_kwargs : dict, optional
Additional kwargs passed to matplotlib.pyplot.fill_between() or bokeh.plotting.Figure.varea()
- plot_outline_kwargs : dict, optional
Additional kwargs passed to matplotlib.axes.Axes.plot() or bokeh.plotting.Figure.line()
- ax : axes, optional
Matplotlib axes or bokeh figures.
- show : bool, optional
Call backend show function.
- backend : {“matplotlib”, “bokeh”}, default “matplotlib”
Select plotting backend.
- backend_kwargs : dict, optional
These are kwargs specific to the backend being used, passed to matplotlib.pyplot.subplots() or bokeh.plotting.figure. For additional documentation check the plotting method of the backend.
- npoints : int, default 100
The number of evaluation points for the ECDF or ECDF-difference plots, if eval_points is not provided or pit is True.
Deprecated since version 0.18.0: Instead specify eval_points=np.linspace(np.min(values), np.max(values), npoints) unless pit is True.
- pointwise : bool, default False
Deprecated since version 0.18.0: Instead use confidence_bands="pointwise".
- fpr : float, optional
Deprecated since version 0.18.0: Instead use ci_prob=1-fpr.
- pit : bool, default False
If True, plot the ECDF or ECDF-difference of the PIT of the sample.
Deprecated since version 0.18.0: See the PIT example below instead.
- Returns:
- axes : matplotlib Axes or Bokeh Figure
Notes
This plot computes the confidence bands with the simulation-based algorithm presented in [1].
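To illustrate the general idea of calibrating a simultaneous band by simulating ECDFs, here is a minimal NumPy sketch. It is not ArviZ's implementation and not the algorithm of [1]: it swaps in a simpler constant-width band based on the simulated maximum deviation of a uniform ECDF from the identity line. The helper name simulated_sup_band and its arguments are hypothetical and chosen for this illustration only.

```python
import numpy as np


def simulated_sup_band(ndraws, eval_probs, ci_prob=0.94, num_trials=500, seed=0):
    """Hypothetical helper (not part of ArviZ).

    Returns the half-width of a constant-width simultaneous band for the
    ECDF of `ndraws` Uniform(0, 1) draws, evaluated at `eval_probs`.
    """
    rng = np.random.default_rng(seed)
    max_devs = np.empty(num_trials)
    for i in range(num_trials):
        draws = np.sort(rng.uniform(size=ndraws))
        # ECDF of the simulated sample at the fixed evaluation points
        ecdf = np.searchsorted(draws, eval_probs, side="right") / ndraws
        # Largest deviation from the uniform CDF, F(p) = p
        max_devs[i] = np.abs(ecdf - eval_probs).max()
    # A band of this half-width contains about ci_prob of the simulated ECDFs
    return np.quantile(max_devs, ci_prob)


eval_probs = np.linspace(0.0, 1.0, 101)
half_width = simulated_sup_band(ndraws=1000, eval_probs=eval_probs)
lower = np.clip(eval_probs - half_width, 0.0, 1.0)
upper = np.clip(eval_probs + half_width, 0.0, 1.0)
```

Because the band is calibrated on the maximum deviation over all evaluation points at once, its coverage is joint rather than marginal, which is the distinction between the simultaneous options and "pointwise" described for confidence_bands.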
References
[1] Säilynoja, T., Bürkner, P.C. and Vehtari, A. (2022). Graphical Test for Discrete Uniformity and its Applications in Goodness of Fit Evaluation and Multiple Sample Comparison. Statistics and Computing, 32(32).
Examples
In a future release, the default behaviour of plot_ecdf will change. To maintain the original behaviour you should do:
>>> import arviz as az
>>> import numpy as np
>>> from scipy.stats import uniform, norm
>>>
>>> sample = norm(0,1).rvs(1000)
>>> npoints = 100
>>> az.plot_ecdf(sample, eval_points=np.linspace(sample.min(), sample.max(), npoints))
However, seeing this warning does not indicate that anything is wrong. If you are happy to get different behaviour as ArviZ improves and adds new algorithms, you can ignore it like so:
>>> import warnings
>>> warnings.filterwarnings("ignore", category=az.utils.BehaviourChangeWarning)
Plot an ECDF plot for a given sample evaluated at the sample points. This will become the new behaviour when eval_points is not provided:
>>> az.plot_ecdf(sample, eval_points=np.unique(sample))
Plot an ECDF plot with confidence bands for comparing a given sample to a given distribution. We manually specify evaluation points independent of the values so that the confidence bands are correctly calibrated.
>>> distribution = norm(0,1)
>>> eval_points = np.linspace(*distribution.ppf([0.001, 0.999]), 100)
>>> az.plot_ecdf(
...     sample, eval_points=eval_points,
...     cdf=distribution.cdf, confidence_bands=True
... )
Plot an ECDF-difference plot with confidence bands for comparing a given sample to a given distribution.
>>> az.plot_ecdf(
...     sample, cdf=distribution.cdf,
...     confidence_bands=True, difference=True
... )
Plot an ECDF plot with confidence bands for the probability integral transform (PIT) of a continuous sample. If drawn from the reference distribution, the PIT values should be uniformly distributed.
>>> pit_vals = distribution.cdf(sample)
>>> uniform_dist = uniform(0, 1)
>>> az.plot_ecdf(
...     pit_vals, cdf=uniform_dist.cdf, confidence_bands=True,
... )
Plot an ECDF-difference plot of PIT values.
>>> az.plot_ecdf(
...     pit_vals, cdf=uniform_dist.cdf, confidence_bands=True,
...     difference=True
... )
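The quantities behind the PIT examples above can be computed by hand, which may help when checking what the ECDF-difference curve shows. The following is a minimal sketch using only NumPy and the standard library; std_normal_cdf is a hypothetical stand-in for distribution.cdf, and the seed, sample size and grid are arbitrary choices made for this illustration.

```python
import math

import numpy as np

rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=1000)


def std_normal_cdf(x):
    """Hypothetical stand-in for norm(0, 1).cdf, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# PIT: push each draw through the reference CDF
pit_vals = np.array([std_normal_cdf(x) for x in sample])

# Evaluation points chosen independently of the data
eval_points = np.linspace(0.0, 1.0, 101)
ecdf = np.searchsorted(np.sort(pit_vals), eval_points, side="right") / pit_vals.size

# For Uniform(0, 1) the reference CDF at z is z itself,
# so the ECDF difference is simply ecdf minus the grid
ecdf_difference = ecdf - eval_points
max_abs_difference = np.abs(ecdf_difference).max()
```

When the sample really comes from the reference distribution, ecdf_difference should hover around zero, which is what the ECDF-difference plot displays together with its confidence band.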