arviz.ColumnDataSource

arviz.ColumnDataSource(*args, **kwargs)

Wrap bokeh.models.ColumnDataSource.

Maps names of columns to sequences or arrays.

The ColumnDataSource is a fundamental data structure of Bokeh. Most plots, data tables, etc. will be driven by a ColumnDataSource.

The ColumnDataSource initializer can be called with a single argument, which may be any of the following:

  • A Python dict that maps string names to sequences of values, e.g. lists, arrays, etc.

    data = {'x': [1,2,3,4], 'y': np.array([10.0, 20.0, 30.0, 40.0])}
    
    source = ColumnDataSource(data)
    

Note

ColumnDataSource only creates a shallow copy of data. Use e.g. ColumnDataSource(copy.deepcopy(data)) if initializing from another ColumnDataSource.data object that you want to keep independent.
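The effect described in this note can be seen with plain Python containers. The following is a minimal sketch that does not use Bokeh itself: `dict(original)` stands in, as an assumption, for the shallow copy made by the initializer, and all variable names are illustrative.

```python
import copy

# Column data as it might be taken from another source's .data
original = {"x": [1, 2, 3, 4], "y": [10.0, 20.0, 30.0, 40.0]}

# A shallow copy creates a new dict but reuses the same list objects.
shallow = dict(original)
# A deep copy also duplicates the lists themselves.
deep = copy.deepcopy(original)

# Mutating a column in place is visible through the shallow copy only.
original["x"].append(5)

print(shallow["x"])  # [1, 2, 3, 4, 5]
print(deep["x"])     # [1, 2, 3, 4]
```

With the deep copy, later in-place changes to the original columns do not leak into the second data set.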

  • A Pandas DataFrame object

    source = ColumnDataSource(df)
    

    In this case the CDS will have columns corresponding to the columns of the DataFrame. If the DataFrame columns have multiple levels, they will be flattened using an underscore (e.g. level_0_col_level_1_col). The index of the DataFrame will be flattened to an Index of tuples if it’s a MultiIndex, and then reset using reset_index. The result will be a column with the same name if the index was named, or level_0_name_level_1_name if it was a named MultiIndex. If the Index did not have a name or the MultiIndex name could not be flattened/determined, the reset_index function will name the index column index, or level_0 if the name index is not available.

  • A Pandas GroupBy object

    group = df.groupby(['colA', 'ColB'])  # pass several keys as a list
    source = ColumnDataSource(group)
    

    In this case the CDS will have columns corresponding to the result of calling group.describe(). The describe method generates columns for statistical measures such as mean and count for all the non-grouped original columns. The CDS columns are formed by joining original column names with the computed measure. For example, if a DataFrame has columns 'year' and 'mpg', then passing df.groupby('year') to a CDS will result in columns such as 'mpg_mean'.

    If the GroupBy.describe result has a named index column, then CDS will also have a column with this name. However, if the index name (or any subname of a MultiIndex) is None, then the CDS will have a column generically named index for the index.

    Note this capability to adapt GroupBy objects may only work with Pandas >=0.20.0.
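The column-naming rules described for DataFrame and GroupBy inputs can be sketched in plain Python. This is only an illustration of the naming convention, not the code Bokeh or pandas actually runs: `flatten_name` is a hypothetical helper, the sample rows are made up, and only two of the measures that describe() produces (count and mean) are computed.

```python
from statistics import mean


def flatten_name(parts, sep="_"):
    """Join the levels of a multi-level column name into one string."""
    if isinstance(parts, str):
        return parts
    return sep.join(str(p) for p in parts)


# Multi-level DataFrame column -> single CDS column name
print(flatten_name(("level_0_col", "level_1_col")))  # level_0_col_level_1_col

# GroupBy case: combine each non-grouped column with each computed measure.
rows = [
    {"year": 1970, "mpg": 18.0},
    {"year": 1970, "mpg": 22.0},
    {"year": 1971, "mpg": 25.0},
]
groups = {}
for row in rows:
    groups.setdefault(row["year"], []).append(row["mpg"])

columns = {
    "year": list(groups),  # the named index becomes a column of its own
    flatten_name(("mpg", "count")): [len(v) for v in groups.values()],
    flatten_name(("mpg", "mean")): [mean(v) for v in groups.values()],
}
print(sorted(columns))  # ['mpg_count', 'mpg_mean', 'year']
```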

Note

There is an implicit assumption that all the columns in a given ColumnDataSource have the same length at all times. For this reason, it is usually preferable to update the .data property of a data source “all at once”, replacing every column in a single assignment rather than changing columns one by one.
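One way to respect this assumption is to build the complete replacement dict first, check it, and only then assign it. The sketch below uses plain dicts; `replace_columns` is a hypothetical helper that is not part of arviz or Bokeh, and the final assignment to a real data source appears only as a comment.

```python
def replace_columns(current, updates):
    """Return a new column dict with `updates` applied, checking lengths."""
    new_data = dict(current)
    new_data.update(updates)
    lengths = {name: len(values) for name, values in new_data.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"columns differ in length: {lengths}")
    return new_data


data = {"x": [1, 2, 3], "y": [10.0, 20.0, 30.0]}

# Build the complete replacement first, then assign it in one step,
# e.g. `source.data = new_data` for a ColumnDataSource named `source`.
new_data = replace_columns(
    data, {"x": [1, 2, 3, 4], "y": [10.0, 20.0, 30.0, 40.0]}
)

# Updating only one column would leave the columns with unequal lengths.
try:
    replace_columns(data, {"x": [1, 2, 3, 4]})
except ValueError as exc:
    print(exc)
```

Checking lengths before assignment surfaces the mismatch early, instead of leaving the data source in an inconsistent state between two separate column updates.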