arviz_stats.thin

arviz_stats.thin(data, sample_dims='draw', group='posterior', var_names=None, filter_vars=None, coords=None, factor='auto', chain_axis=0, draw_axis=1)

Perform thinning.

Thinning refers to retaining only every nth sample from a Markov Chain Monte Carlo (MCMC) simulation. This is usually done to reduce autocorrelation in the stored samples or simply to reduce the size of the stored samples.
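As a minimal illustration of the idea (plain NumPy, not the arviz_stats implementation), keeping every nth draw amounts to a strided slice along the draw axis. The array shape, the seed, and the factor n = 5 below are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical MCMC output: 4 chains with 500 draws each.
rng = np.random.default_rng(0)
samples = rng.normal(size=(4, 500))

n = 5  # thinning factor, picked arbitrarily for this sketch

# Keep draws 0, n, 2n, ... from every chain.
thinned = samples[:, ::n]

print(thinned.shape)  # (4, 100)
```

The chain dimension is left untouched; only the number of stored draws per chain shrinks.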

Parameters:

data : array_like, xarray.DataArray, xarray.Dataset, xarray.DataTree, DataArrayGroupBy, DatasetGroupBy, or idata-like

Input data. It will have different pre-processing applied to it depending on its type:

array-like: call the array layer within arviz-stats.
xarray object: apply the dimension-aware function to all relevant subsets.
others: passed to arviz_base.convert_to_dataset.

sample_dims : iterable of hashable, optional

Dimensions to be treated as sample dimensions and reduced. Default rcParams["data.sample_dims"].

group : hashable, default “posterior”

Group on which to perform the thinning.

var_names : str or list of str, optional

Names of the variables to thin.

filter_vars : {None, “like”, “regex”}, default None

coords : dict, optional

Dictionary of dimension/index names to coordinate values defining a subset of the data for which to perform the computation.

factor : str or int, default “auto”

The thinning factor. If “auto”, the thinning factor is computed based on bulk and tail effective sample size as suggested by Säilynoja et al. (2022) [1]. If an integer, the thinning factor is set to that value.

chain_axis, draw_axis : int, optional

Integers indicating the axes that correspond to the chain and draw dimensions. chain_axis can be None.
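For array-like input, the roles of factor and draw_axis can be sketched in plain NumPy. The helper names thin_along_axis and heuristic_factor are hypothetical, and the rule in heuristic_factor (total draws divided by the smaller of the two ESS values, rounded down) is only an assumed reading of "based on bulk and tail effective sample size"; the exact rule arviz_stats applies is not stated here. The ESS values are made up for the example.

```python
import numpy as np


def heuristic_factor(n_samples, ess_bulk, ess_tail):
    """Assumed heuristic, not taken from arviz_stats: draws per effective draw."""
    ess = min(ess_bulk, ess_tail)
    return max(1, int(n_samples // ess))


def thin_along_axis(ary, factor, draw_axis=1):
    """Keep every `factor`-th entry along `draw_axis` (hypothetical helper)."""
    ary = np.asarray(ary)
    index = [slice(None)] * ary.ndim
    index[draw_axis] = slice(None, None, factor)
    return ary[tuple(index)]


# Hypothetical values: 4 chains x 500 draws, with made-up ESS estimates.
ary = np.arange(4 * 500).reshape(4, 500)
factor = heuristic_factor(n_samples=ary.size, ess_bulk=900.0, ess_tail=650.0)
thinned = thin_along_axis(ary, factor, draw_axis=1)

print(factor, thinned.shape)  # 3 (4, 167)
```

Passing an integer factor skips the heuristic entirely; draw_axis only tells the helper which axis of the array holds the draws.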

Returns:

ndarray, xarray.DataArray, xarray.Dataset, xarray.DataTree: Thinned samples

References

[1] Säilynoja, T., Bürkner, P.-C. & Vehtari, A. “Graphical test for discrete uniformity and its applications in goodness-of-fit evaluation and multiple sample comparison.” Statistics and Computing 32(2), 32 (2022). https://doi.org/10.1007/s11222-022-10090-6

Examples

Thin the posterior samples using the default arguments:

In [1]: from arviz_base import load_arviz_data
   ...: import arviz_stats as azs
   ...: data = load_arviz_data('non_centered_eight')
   ...: azs.thin(data)
   ...: 
Out[1]: 
<xarray.DataTree 'posterior'>
Group: /posterior
    Dimensions:  (chain: 4, draw: 250, school: 8)
    Coordinates:
      * chain    (chain) int64 32B 0 1 2 3
      * draw     (draw) int64 2kB 0 2 4 6 8 10 12 14 ... 486 488 490 492 494 496 498
      * school   (school) <U16 512B 'Choate' 'Deerfield' ... 'Mt. Hermon'
    Data variables:
        mu       (chain, draw) float64 8kB 6.339 5.168 2.881 ... 7.11 0.4987 1.325
        theta_t  (chain, draw, school) float64 64kB -0.9553 -1.162 ... -0.4102
        tau      (chain, draw) float64 8kB 2.574 1.832 2.353 ... 0.9896 8.401 2.333
        theta    (chain, draw, school) float64 64kB 3.88 3.348 6.233 ... 0.138 0.368

Thin a subset of the variables with a thinning factor of 10:

In [2]: azs.thin(data, factor=10, var_names=["mu"])
Out[2]: 
<xarray.DataTree 'posterior'>
Group: /posterior
    Dimensions:  (chain: 4, draw: 50)
    Coordinates:
      * chain    (chain) int64 32B 0 1 2 3
      * draw     (draw) int64 400B 0 10 20 30 40 50 60 ... 440 450 460 470 480 490
    Data variables:
        mu       (chain, draw) float64 2kB 6.339 4.818 0.002546 ... -2.524 2.197
