
Conversation

@hsimonfroy
Contributor

PR discussed in #738

  • Add SamplingAlgorithm and kernel transformations that thin them. They take a thinning integer and a SamplingAlgorithm/kernel, and return the same SamplingAlgorithm/kernel but iterated thinning times per step.

  • This is useful for reducing the computation and memory cost of high-throughput samplers, especially in high dimension. While the thin_algorithm function operates on a top_level_api SamplingAlgorithm, the thin_kernel version is relevant for adaptation algorithms. For instance, estimating the autocorrelation length (used to tune the momentum decoherence length in mclmc_adaptation) from the states of every step is computationally prohibitive in high dimension, see Subsampling for MCLMC tuning #738.

  • Both transformations have an additional info_transform Callable parameter that defines how to aggregate the sampler info across the thinning steps. For instance, we might want to average the logdensities and root-mean-square the energy_changes, which can easily be done with tree.map or tree.map_with_path; a minimal sketch is given right after this list.
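
For instance, an info_transform that root-mean-squares the energy changes while averaging the other fields could look like the following sketch (the field name energy_change follows the MCLMC info mentioned above; the function name and the use of jax.tree_util.tree_map_with_path are illustrative, not part of this PR):

import jax.numpy as jnp
from jax import tree_util

def per_field_info_transform(infos):
    # `infos` stacks the per-step info pytree along a leading axis of size `thinning`.
    def aggregate(path, x):
        if "energy_change" in tree_util.keystr(path):
            return jnp.sqrt(jnp.mean(x**2))  # root-mean-square the energy changes
        return jnp.mean(x, axis=0)  # average everything else, e.g. the logdensities
    return tree_util.tree_map_with_path(aggregate, infos)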

  • We should be able to understand what the PR does from its title only;
  • There is a high-level description of the changes;
  • There are links to all the relevant issues, discussions and PRs;
  • The branch is rebased on the latest main commit;
  • Commit messages follow these guidelines;
  • The code respects the current naming conventions;
  • Docstrings follow the numpy style guide
  • pre-commit is installed and configured on your machine, and you ran it before opening the PR;
  • There are tests covering the changes;
  • The doc is up-to-date;

@hsimonfroy
Contributor Author

hsimonfroy commented May 18, 2025

  • Here is an example of how thin_algorithm and thin_kernel could be used for MCLMC (a sketch of what thin_kernel does internally is given after the note at the end of this comment):
import jax.numpy as jnp
import jax.random as jr
from jax import tree

import blackjax
from blackjax.mcmc.integrators import isokinetic_mclachlan
from blackjax.util import run_inference_algorithm
# thin_kernel and thin_algorithm are the transformations introduced by this PR.

logdf = lambda x: -(x**2).sum()
init_pos = jnp.ones(2)
init_key, tune_key, run_key = jr.split(jr.key(42), 3)

# Initialize the MCLMC state.
state = blackjax.mcmc.mclmc.init(
    position=init_pos,
    logdensity_fn=logdf,
    rng_key=init_key,
)

# Thinned kernel for the adaptation: each adaptation step advances the chain 16 times.
kernel = lambda inverse_mass_matrix: thin_kernel(
    blackjax.mcmc.mclmc.build_kernel(
        logdensity_fn=logdf,
        integrator=isokinetic_mclachlan,
        inverse_mass_matrix=inverse_mass_matrix,
    ),
    thinning=16,
    # Adequately aggregate info.energy_change (root-mean-square across the 16 steps).
    info_transform=lambda info: tree.map(lambda x: (x**2).mean() ** 0.5, info),
)

# Tune L and step_size with the thinned kernel.
state, params, n_steps = blackjax.mclmc_find_L_and_step_size(
    mclmc_kernel=kernel,
    num_steps=100,
    state=state,
    rng_key=tune_key,
)

# Build the sampler with the tuned parameters.
sampler = blackjax.mclmc(
    logdensity_fn=logdf,
    L=params.L,
    step_size=params.step_size,
    inverse_mass_matrix=params.inverse_mass_matrix,
)

# Thinned sampler: each recorded sample is obtained from 16 kernel steps.
sampler = thin_algorithm(
    sampler,
    thinning=16,
    info_transform=lambda info: tree.map(jnp.mean, info),
)

state, history = run_inference_algorithm(
    rng_key=run_key,
    initial_state=state,
    inference_algorithm=sampler,
    num_steps=100,
)
    
  • NB: I exposed the Lfactor=0.4 parameter in mclmc_find_L_and_step_size, because if the thinning is too high, the ESS computed on the thinned samples would be larger than on the non-thinned samples, which leads to underestimating the autocorrelation length and therefore L. This can simply be compensated by increasing Lfactor. In practice, during my tests, I only found minor changes in the estimated L (with vs. without thinning) for reasonable thinning values, so I am not sure this option is necessary. In any case, one shouldn't thin to the point that it deteriorates the ESS, since one could then just take fewer sampling steps.
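
For intuition about what the thin_kernel call in the example above does, here is a minimal sketch of the thinning mechanics. It is an illustration under assumed conventions, not the PR's actual implementation; in particular, the function name and the default info_transform (which keeps only the last step's info) are assumptions:

import jax
from jax import tree

def thin_kernel_sketch(kernel, thinning, info_transform=lambda infos: tree.map(lambda x: x[-1], infos)):
    # Wrap a blackjax-style kernel (rng_key, state, **params) -> (state, info)
    # so that each call advances the chain `thinning` times.
    def thinned_kernel(rng_key, state, **kernel_kwargs):
        def one_step(state, key):
            return kernel(key, state, **kernel_kwargs)

        keys = jax.random.split(rng_key, thinning)
        state, infos = jax.lax.scan(one_step, state, keys)
        # `infos` stacks the per-step info along a leading axis of size `thinning`.
        return state, info_transform(infos)

    return thinned_kernel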

@junpenglao
Member

Overall LGTM, could you add some tests?

@hsimonfroy
Contributor Author

OK, the Python 3.12 test was passing a few weeks ago; something seems to have broken between pytest and pytest-xdist.

INTERNALERROR> pytest_benchmark.logger.PytestBenchmarkWarning: Benchmarks are automatically disabled because xdist plugin is active.Benchmarks cannot be performed reliably in a parallelized environment.

@hsimonfroy
Contributor Author

Hello @junpenglao and @reubenharry,
I had to remove the -n auto option in the pytest workflow to work around pytest-xdist breaking on the main branch. I don't know whether you have found another workaround.

Then I added tests in ThinInferenceAlgorithmTest, following the same chex format as RunInferenceAlgorithmTest.
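
For reference, here is a rough sketch of the kind of chex-style check such a test could contain. The asserted behaviour (thin_kernel applies the wrapped kernel exactly thinning times) follows the PR description, while the toy kernel, class name, and call signature are illustrative assumptions:

import chex
import jax
import jax.numpy as jnp

class ThinKernelToyTest(chex.TestCase):
    @chex.variants(with_jit=True, without_jit=True)
    def test_thinning_iterates_kernel(self):
        # Toy stand-in for a blackjax kernel: (rng_key, state) -> (state, info).
        def kernel(rng_key, state):
            del rng_key
            return state + 1.0, {"logdensity": -state}

        thinned = thin_kernel(  # from this PR
            kernel,
            thinning=4,
            info_transform=lambda info: jax.tree.map(jnp.mean, info),
        )
        state, info = self.variant(thinned)(jax.random.key(0), jnp.array(0.0))
        # Four inner steps per outer step, so the counter advances by 4.
        chex.assert_trees_all_close(state, jnp.array(4.0))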
