Archive for completely random measures

JSM 2024, Portland, Day 3

Posted in pictures, Running, Statistics, Travel, University life on August 9, 2024 by xi'an

Bayesian contributed session as the first round of the third day (with a choice of five parallel sessions featuring Bayesian topics!!, actually easier to pick from than among the following eight parallel sessions of the 10:30 schedule!!!), with a talk by Tahir Ekin on adversarial outlier detection that could connect with our Oceaner(c) privacy concerns. Then one involving spike & slab (a theme that figured prominently on this special day!!) in mixed response models by Sameer Deshpande, seeking an (unBayesian!) MAP for a latent variable model by Monte Carlo EM. Followed by a talk by Yunyi Shen on completely random measures for estimating the (distribution of the) number of species in heterogeneous populations. Next, Valentin Zulj on (frequentist rather than) Bayesian stacking, estimating optimal weights for model averaging (which should be posterior probabilities in a pure Bayesian mindframe), including a score function that could lead to generalised Bayesian inference on said weights. Finishing with a talk by Chaegeun Song on correcting Bayesian credible sets towards (frequentist, again!!!) exact coverage for classification (which reminded me of my very first paper with George on correcting frequentist confidence for Binomial observations). With which I could not really engage, as seeking a specific coverage level did not seem relevant, imho, but I appreciated the wheel plot representation.

My second morn session was about modern (what else?!) sampling algorithms, although I spent the first dozen minutes wondering whether or not I had entered the wrong room, until Tianhao Wang focussed on Thompson sampling for bandits. It proved far enough from my interests for my (sleep-deprived) attention to drift too quickly. Only the talk by Yuchen Wu on a spike & slab (as suits the day!) challenge captured enough of this wandering attention, crossing further into my realm of primary topics by considering a target distribution that is a product of distributions.
But I did not get from her presentation how a product measure decomposition was inducing higher efficiency (and did not find answers within the arXived preprint). Unless it exploited specific features of the target, like conditional independence between the components. The last talk was by Brice Huang on sampling low temperature Gibbs measures using stochastic localisation.

After coming upon a row of food trucks across the conference centre and being unfairly attracted by an Ethiopian injera picture into a terrible wrap, I returned for the Skeptical about AI session, just a few minutes late, only to find accessing the session was impossible! Quite sad to miss the presentations and the arguments (even though I had heard a previous talk by Genevera Allen when visiting Rutgers two years ago). As a second best, I then joined the recent (of course!) Advances in Bayesian Computation (aka ABC?!) session with a medley of topics, including a data subset versus data sketching model reduction by Sudipto Saha. Which could have consequences on our privacy strategies. And marginal evidence estimation for the Bayesian Lasso by Christopher Hans, while avoiding data completion. And another latent variable model with a sequential variational Bayes approach by Bao Anh Vu, using at one point the Cappé et al. (2005) EM-based approximation to the log-likelihood gradient. Finishing with a back-to-the-future talk by Luke Duttweiler on MCMC convergence diagnostics, comparing several chains via proximity maps that themselves require some preliminary knowledge about the MCMC kernel. (Nice title though, “the traceplot thickens”!)

The crux of the day was however the 2024 COPSS Award ceremony, with several friends featuring among the recipients: Daniele Durante for the Emerging Leaders Award, Regina Liu for the Elizabeth L. Scott Award, and Veronika Rockova for the Presidents’ Award. Congrats!!!



Approximation Methods in Bayesian Analysis [#3]

Posted in Mountains, pictures, Running, Statistics, Travel, University life on June 23, 2023 by xi'an

My last day (#4) at the workshop, as I had to return to Paris earlier. A rather theoretical morning again, with Morgane Austern on (probabilistic) concentration inequalities on transport distances, far from my comfort zone if lively, Jason Xu on replacing non-convex penalisation factors with distances to the corresponding manifold, which I found most interesting if not directly helpful for simulating over submanifolds, and Hugo Lavenant on studying the impact of prior choice as merging of opinions, in the (Milanese) setting of completely random measures, with the surprise occurrence of a double bent for some choices. The afternoon session saw Andrew Gelman reflecting on multiscale modelling (sans slide et sans tableau) and Chris Holmes introducing the fundamentals of Bayesian conformal prediction, towards reaching well-calibrated (in a frequentist sense) Bayesian procedures by resorting to exchangeability and rank tests. I alas missed the other talks of the day.

In recap, this was a wonderful conference, with a perfect audience size, a diverse if intense program, and a lot of interactions. In addition, the short talk sessions worked very nicely and attracted a very strong audience, even at 22:10 after a long day! Indeed, they were uniformly well-calibrated, time-wise, and delivered high-clarity messages. To be repeated. As there were many newcomers to CIRM, they discovered the idiosyncrasies of the place and of its surroundings, mostly positively.

On the outdoor front, the week brought overall moderately hot weather but a constant wind that prevented me from sleeping (well), yet helped with waking up before dawn to cycle or run to my open water pool! The sea remained reasonably calm, so waves did not prevent my swimming.

latent nested nonparametric priors

Posted in Books, Statistics on September 23, 2019 by xi'an

A paper on an extended type of non-parametric priors by Camerlenghi et al. [all good friends!] is about to appear in Bayesian Analysis, with a discussion open for contributions (until October 15). While a fairly theoretical piece of work, it validates a Bayesian approach to non-parametric clustering of separate populations with, broadly speaking, common clusters. More formally, it constructs a new family of models that allows for partial or complete equality between two probability measures, without forcing full identity when the associated samples share some common observations. Indeed, the more traditional structures prohibit one or the other: the Dirichlet process (DP) prohibits two probability measure realisations from being equal, even partly; the hierarchical DP (HDP) already allows for common atoms across measure realisations, but prohibits complete identity between two realised distributions; and the nested DP offers one extra level of randomness, but with an infinity of DP realisations that prohibits common atomic support besides completely identical support (and hence distribution).

The current paper imagines two realisations of random measures written as the sum of a common random measure and of one of two separate, almost independent random measures: (14) is the core formula of the paper that allows for partial or total equality. An extension to a setting with more than two samples seems complicated, if only because of the number of common measures one has to introduce, from the totally common measure to measures shared by only a subset of the samples. Except in the simplified framework where a single, universally common measure is adopted (with enough justification). The randomness of the model is handled via different completely random measures that involve something like four degrees of hierarchy in the Bayesian model.
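As a rough illustration of this common-plus-idiosyncratic decomposition (my own toy sketch, not the paper's actual construction, which works with genuine completely random measures), one can simulate two discrete random probability measures as normalised sums of a shared finite gamma-type component and two idiosyncratic ones, so that the two realisations share exactly the common atoms:

```python
import numpy as np

rng = np.random.default_rng(0)

def gamma_crm(n_atoms, rate=1.0):
    """Crude finite-dimensional stand-in for a gamma completely random
    measure: random atom locations with positive, unnormalised jumps."""
    locs = rng.normal(size=n_atoms)
    jumps = rng.gamma(shape=1.0 / n_atoms, scale=1.0 / rate, size=n_atoms)
    return locs, jumps

# One component common to both populations, plus one idiosyncratic
# component per population.
common = gamma_crm(50)
idio = [gamma_crm(50), gamma_crm(50)]

measures = []
for locs_i, jumps_i in idio:
    locs = np.concatenate([common[0], locs_i])
    jumps = np.concatenate([common[1], jumps_i])
    # normalising the jumps turns the CRM into a random probability measure
    measures.append((locs, jumps / jumps.sum()))

# The two random probability measures share the 50 common atoms,
# while their idiosyncratic atoms differ almost surely.
shared_atoms = np.intersect1d(measures[0][0], measures[1][0])
print(len(shared_atoms))  # 50
```

Pushing the total mass of both idiosyncratic components to zero recovers identical distributions, while killing the common component makes the two measures (almost surely) share no atom, mimicking the partial-versus-total equality the paper formalises.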

Since the example is somewhat central to the paper, the case of two two-component Normal mixtures with a common component (but with different mixture weights) is handled by the approach, although it seems that it was already covered by the HDP. Having exactly the same term (i.e., with the very same weight) is not, but this may be less interesting in real-life applications. Note that alternative & easily constructed & parametric constructs are already available in this specific case, involving limited prior input and a lighter computational burden, although the Gibbs sampler behind the model proves extremely simple on paper. (One may wonder at the robustness of the sampler once the case of identical distributions is visited.)
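For concreteness, here is a minimal simulation of that setting (a sketch of my own, with arbitrary parameter values, not the authors' example): two two-component Normal mixtures sharing the N(0,1) component, but with different weights on it and different second components.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_mixture(n, weights, means, sds):
    """Draw n points from a Normal mixture with the given weights."""
    comp = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(loc=np.take(means, comp), scale=np.take(sds, comp))

# Both populations share the N(0,1) component, with weights 0.7 vs 0.4,
# while the second components differ entirely.
x1 = sample_mixture(1000, [0.7, 0.3], [0.0, 4.0], [1.0, 0.5])
x2 = sample_mixture(1000, [0.4, 0.6], [0.0, -3.0], [1.0, 0.8])
```

The inferential question the paper addresses is then whether a nonparametric prior on the two mixing distributions can detect that the N(0,1) atom is shared, without forcing the two distributions to coincide.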

Due to the combinatorial explosion associated with a higher number of observed samples, despite obvious practical situations, one may wonder at any feasible (and possibly sequential) extension that would further keep coherence under marginalisation (in the number of samples). And also whether or not multiple testing could be coherently envisioned in this setting, for instance when handling all hospitals in the UK. Another consistency question covers the Bayes factor used to assess whether or not the two distributions behind the samples are identical. (One may wonder at the importance of the question, hopefully applied to more relevant datasets than the Iris data!)