arviz.ColumnDataSource
arviz.ColumnDataSource(*args, **kwargs)
Wrap bokeh.models.ColumnDataSource.
Maps names of columns to sequences or arrays.
The ColumnDataSource is a fundamental data structure of Bokeh. Most plots, data tables, etc. will be driven by a ColumnDataSource.
If the ColumnDataSource initializer is called with a single argument, that argument can be any of the following:
A Python dict that maps string names to sequences of values, e.g. lists, arrays, etc.
    data = {'x': [1,2,3,4], 'y': np.array([10.0, 20.0, 30.0, 40.0])}
    source = ColumnDataSource(data)
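As a rough illustration of why sharing column objects matters for the dict form, the sketch below uses a plain Python dict as a stand-in for a data source. This is an assumption made for illustration; the code does not call Bokeh or ArviZ. It contrasts a shallow copy, which shares the underlying lists, with copy.deepcopy, which does not.

```python
import copy

data = {"x": [1, 2, 3, 4], "y": [10.0, 20.0, 30.0, 40.0]}

# A shallow copy duplicates the mapping but shares the column objects.
shallow = dict(data)
# A deep copy duplicates the column objects as well.
deep = copy.deepcopy(data)

# Mutate a column in place through the original dict.
data["x"].append(5)

shallow_sees_change = shallow["x"] == [1, 2, 3, 4, 5]  # shared list
deep_sees_change = deep["x"] == [1, 2, 3, 4, 5]        # independent list
print(shallow_sees_change, deep_sees_change)
```

The in-place append is visible through the shallow copy because both dicts point at the same list, which is the situation a deep copy avoids.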
Note
ColumnDataSource only creates a shallow copy of data. Use e.g. ColumnDataSource(copy.deepcopy(data)) if initializing from another ColumnDataSource.data object that you want to keep independent.
A Pandas DataFrame object:
    source = ColumnDataSource(df)
In this case the CDS will have columns corresponding to the columns of the DataFrame. If the DataFrame columns have multiple levels, they will be flattened using an underscore (e.g. level_0_col_level_1_col). The index of the DataFrame will be flattened to an Index of tuples if it’s a MultiIndex, and then reset using reset_index. The result will be a column with the same name if the index was named, or level_0_name_level_1_name if it was a named MultiIndex. If the Index did not have a name or the MultiIndex name could not be flattened/determined, the reset_index function will name the index column index, or level_0 if the name index is not available.
A Pandas GroupBy object:
    group = df.groupby(['colA', 'ColB'])
In this case the CDS will have columns corresponding to the result of calling group.describe(). The describe method generates columns for statistical measures such as mean and count for all the non-grouped original columns. The CDS columns are formed by joining original column names with the computed measure. For example, if a DataFrame has columns 'year' and 'mpg', then passing df.groupby('year') to a CDS will result in columns such as 'mpg_mean'.
If the GroupBy.describe result has a named index column, then the CDS will also have a column with this name. However, if the index name (or any subname of a MultiIndex) is None, then the CDS will have a column generically named index for the index.
Note that this capability to adapt GroupBy objects may only work with Pandas >=0.20.0.
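The column naming described for GroupBy input can be approximated directly in pandas. This is a minimal sketch, not Bokeh's actual implementation: the toy DataFrame and the names `desc` and `flat` are assumptions made for illustration, and the underscore join only mimics the flattening that the text describes.

```python
import pandas as pd

# Toy data (hypothetical values) with the 'year' and 'mpg' columns from the text.
df = pd.DataFrame({
    "year": [70, 70, 71, 71],
    "mpg": [18.0, 15.0, 25.0, 24.0],
})

# describe() on a GroupBy returns two-level columns: (original column, measure).
desc = df.groupby("year").describe()

# Join the two levels with an underscore, e.g. ('mpg', 'mean') -> 'mpg_mean'.
flat = desc.copy()
flat.columns = ["_".join(col) for col in desc.columns]

# The named index ('year') becomes an ordinary column after reset_index().
flat = flat.reset_index()

print(list(flat.columns))
```

Because the index here is named 'year', the reset produces a 'year' column; an unnamed index would instead yield the generic column name index.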
Note
There is an implicit assumption that all the columns in a given ColumnDataSource have the same length at all times. For this reason, it is usually preferable to update the .data property of a data source “all at once”.
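One way to respect the equal-length assumption is to build the complete replacement dict first, check it, and only then assign it in a single step. The sketch below is an illustration under stated assumptions: `check_equal_lengths` is a hypothetical helper, not part of Bokeh or ArviZ, and the final assignment to `source.data` appears only as a comment because it presumes an existing data source.

```python
def check_equal_lengths(columns):
    """Raise ValueError unless every column has the same length (hypothetical helper)."""
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"columns have unequal lengths: {lengths}")
    return columns


old_data = {"x": [1, 2, 3], "y": [10.0, 20.0, 30.0]}

# Build every column of the new state before touching the data source,
# rather than appending to one column at a time.
new_data = {
    "x": old_data["x"] + [4],
    "y": old_data["y"] + [40.0],
}
check_equal_lengths(new_data)

# With an existing ColumnDataSource named `source` (assumed), the update
# would then be a single assignment:
# source.data = new_data
```

Assigning the whole dict at once means the columns never pass through an intermediate state in which their lengths differ.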