arviz_stats.thin
- arviz_stats.thin(data, sample_dims='draw', group='posterior', var_names=None, filter_vars=None, coords=None, factor='auto', chain_axis=0, draw_axis=1)
Perform thinning.
Thinning refers to retaining only every nth sample from a Markov Chain Monte Carlo (MCMC) simulation. This is usually done to reduce autocorrelation in the stored samples or simply to reduce the size of the stored samples.
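As a rough illustration of the autocorrelation point, the sketch below thins a simulated AR(1) series by plain slicing and compares lag-1 autocorrelation before and after. It uses only NumPy and does not call arviz_stats.thin. The AR(1) process, its coefficient `phi`, and the helper `lag1_autocorr` are illustrative assumptions standing in for real MCMC draws.

```python
import numpy as np


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a 1D series (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))


# Simulate a strongly autocorrelated AR(1) series as a stand-in for MCMC draws.
rng = np.random.default_rng(0)
phi = 0.9
n_draws = 20_000
noise = rng.normal(size=n_draws)
chain = np.empty(n_draws)
chain[0] = noise[0]
for t in range(1, n_draws):
    chain[t] = phi * chain[t - 1] + noise[t]

factor = 10  # retain every 10th draw
thinned = chain[::factor]

rho_full = lag1_autocorr(chain)
rho_thinned = lag1_autocorr(thinned)
print(f"draws kept: {thinned.size} of {chain.size}")
print(f"lag-1 autocorrelation: full={rho_full:.2f}, thinned={rho_thinned:.2f}")
```

The thinned series is ten times smaller, and its neighbouring draws are less correlated than neighbouring draws in the full series.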
- Parameters:
- data
array_like, xarray.DataArray, xarray.Dataset, xarray.DataTree, DataArrayGroupBy, DatasetGroupBy, or idata-like. Input data. It will have different pre-processing applied to it depending on its type:
array-like: call the array layer within arviz-stats.
xarray object: apply the dimension-aware function to all relevant subsets.
others: passed to arviz_base.convert_to_dataset.
- sample_dims
iterable of hashable, optional. Dimensions to be considered sample dimensions, which are to be reduced. Default rcParams["data.sample_dims"].
- group
hashable, default “posterior”. Group on which to perform thinning.
- var_names
str or list of str, optional. Names of the variables to thin.
- filter_vars
{None, “like”, “regex”}, default None
- coords
dict, optional. Dictionary of dimension/index names to coordinate values defining a subset of the data for which to perform the computation.
- factor
str or int, default “auto”. The thinning factor. If “auto”, the thinning factor is computed based on bulk and tail effective sample size as suggested by Säilynoja et al. (2022) [1]. If an integer, the thinning factor is set to that value.
- chain_axis, draw_axis
int, optional. Integers indicating the axes that correspond to the chain and draw dimensions. chain_axis can be None.
- Returns:
ndarray, xarray.DataArray, xarray.Dataset, or xarray.DataTree
Thinned samples.
References
[1] Säilynoja, T., Bürkner, P.-C. & Vehtari, A. “Graphical test for discrete uniformity and its applications in goodness-of-fit evaluation and multiple sample comparison.” Statistics and Computing 32(2), 32 (2022). https://doi.org/10.1007/s11222-022-10090-6
Examples
Thin the posterior samples using the default arguments:
In [1]: from arviz_base import load_arviz_data
   ...: import arviz_stats as azs
   ...: data = load_arviz_data('non_centered_eight')
   ...: azs.thin(data)
   ...:
Out[1]:
<xarray.DataTree 'posterior'>
Group: /posterior
    Dimensions:  (chain: 4, draw: 250, school: 8)
    Coordinates:
      * chain    (chain) int64 32B 0 1 2 3
      * draw     (draw) int64 2kB 0 2 4 6 8 10 12 14 ... 486 488 490 492 494 496 498
      * school   (school) <U16 512B 'Choate' 'Deerfield' ... 'Mt. Hermon'
    Data variables:
        mu       (chain, draw) float64 8kB 6.339 5.168 2.881 ... 7.11 0.4987 1.325
        theta_t  (chain, draw, school) float64 64kB -0.9553 -1.162 ... -0.4102
        tau      (chain, draw) float64 8kB 2.574 1.832 2.353 ... 0.9896 8.401 2.333
        theta    (chain, draw, school) float64 64kB 3.88 3.348 6.233 ... 0.138 0.368
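In the output above, the default factor="auto" kept every second draw. This page states only that the automatic factor is based on bulk and tail effective sample size; it does not give the exact rule. The sketch below shows one plausible heuristic, which is an assumption and not the arviz-stats implementation: divide the total number of draws by the smaller of the two ESS estimates and round down. The helper name `suggest_thin_factor` is hypothetical, and the ESS values are made up for illustration.

```python
import math


def suggest_thin_factor(n_samples, ess_bulk, ess_tail):
    """Hypothetical helper: choose a thinning factor from ESS estimates.

    Assumption: keep roughly as many draws as the smaller ESS value.
    This rule is illustrative and is not taken from arviz-stats.
    """
    ess = min(ess_bulk, ess_tail)
    if ess <= 0:
        raise ValueError("ESS estimates must be positive")
    # Never return less than 1, which would mean no thinning at all.
    return max(1, math.floor(n_samples / ess))


# 4 chains x 500 draws; the ESS values below are invented for this example.
n_samples = 4 * 500
factor = suggest_thin_factor(n_samples, ess_bulk=950.0, ess_tail=800.0)
print(factor)  # 2
```

Under this heuristic, a sample whose ESS is at least as large as the number of draws would get a factor of 1 and be left unthinned.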
Thin a subset of the variables with a thinning factor of 10:
In [2]: azs.thin(data, factor=10, var_names=["mu"])
Out[2]:
<xarray.DataTree 'posterior'>
Group: /posterior
    Dimensions:  (chain: 4, draw: 50)
    Coordinates:
      * chain    (chain) int64 32B 0 1 2 3
      * draw     (draw) int64 400B 0 10 20 30 40 50 60 ... 440 450 460 470 480 490
    Data variables:
        mu       (chain, draw) float64 2kB 6.339 4.818 0.002546 ... -2.524 2.197
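The shape and draw coordinates in this output are consistent with keeping every 10th draw along the draw axis. A minimal NumPy sketch of that operation on a plain (chain, draw) array follows. It is an illustration of the idea, not the arviz-stats code path, and the random array is a stand-in for the real posterior.

```python
import numpy as np

n_chains, n_draws = 4, 500
rng = np.random.default_rng(1)
samples = rng.normal(size=(n_chains, n_draws))  # stand-in for a (chain, draw) variable
draw_coord = np.arange(n_draws)

factor = 10
draw_axis = 1

# Build an index that steps along the draw axis and leaves other axes intact.
index = [slice(None)] * samples.ndim
index[draw_axis] = slice(None, None, factor)
thinned = samples[tuple(index)]
kept_draws = draw_coord[::factor]

print(thinned.shape)                   # (4, 50)
print(kept_draws[:3], kept_draws[-1])  # [ 0 10 20] 490
```

All chains are thinned at the same draw positions, so the chain dimension keeps its length while the draw dimension shrinks from 500 to 50.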