Archive for Bayesian learning

interpretable Bayesian learning for physical and engineering sciences [06-10 July 2026]

Posted in Kids, Mountains, Statistics, Travel, University life on April 22, 2026 by xi'an

SEINE AI

Posted in pictures, Statistics, University life on March 23, 2026 by xi'an

Ten days ago I took part in the SEINE AI 2026 workshop in Jouy-en-Josas, near Paris (home of HEC), organised by the Huawei Paris Research Center, in which I was invited to speak. I felt sort of an outlier there, given the deeply machine-learning, entrepreneurial orientation of the meeting, with its theme being Building the Agentic Future of ICT, and given that I chose to present our most recent Bayesian adversarial privacy paper. Hence, I stood within a game-theoretic, Bayesian, formal landscape, presumably losing most of the audience and keeping them away from their lunch!

Other speakers included Simon Lucas from Queen Mary London on simulation-based AI, which I had trouble distinguishing from building a statistical model by goodness of fit (using bandits for updates), while focussing on competing on some computer game challenges. And Volker Tresp from LMU München on a tensor brain model that he opposes to a Bayesian brain (with a related paper entitled Bayes or Heisenberg: Who(se) rules?), which we discussed in general terms over lunch, namely Bayesian learning vs. quantum updating. And Michal Valko from INRIA Paris (and other companies), who went full blast against the Bradley-Terry model!, with a title of Nash and Nemirovski walk into a bar! and a half-time technique for approximating Nash equilibria that reminded me of leapfrog. A most entertaining talk that further provided a game-theoretic transition to mine.

As an aside, I played yesterday with ChatGPT composing my talk slides out of our arXiv document and it proved a disaster, with hallucinations of results and concepts not in the paper and a complete mess in handling graphs, first creating generic, fake, unrelated pictures, then inserting actual graphs haphazardly throughout the slides. I obviously did not use the sorry result, as the workshop did not seem the ideal place for this sort of prank! The actual version only recycles a few of its summarising slides. (With ye Norse farce proper colour choice!)

 

Scalable Monte Carlo for Bayesian Learning [not yet a book review]

Posted in Books, Statistics, University life on May 11, 2025 by xi'an

[strong] foundations of synthetic B’earning

Posted in Books, Statistics, University life on July 15, 2024 by xi'an

I only recently read the (foundational!) paper on Foundations of Bayesian learning from synthetic data by Harrison Wilde, Jack Jewson, Sebastian Vollmer [all associated with Warwick at some point] and [my long time friend] Chris Holmes, which merges Bayesian inference with differential privacy constraints through generalised/Gibbs inference, recouping with the M-open perspective in order to accommodate the misspecified nature of synthetic data. I like the approach very much in that it intersects a lot with my own views, except for following the differential privacy formalism. I however think that further progress could be made by adopting an even more Bayesian position.

The key messages of the paper are that

  1. learning from synthetic data may prove damaging to your (data) health
  2. robustness unsurprisingly reduces the odds or magnitude of the damage
  3. real data can still be used to some extent

Since the (synthetic) generating model can be a GAN, the privacy requirement is such that noise is “injected” into the input data and into the learning mechanism. This is not discussed in the paper, but highly conservative constraints surely make the DGP lose several learning points.
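As a reminder of what “injecting noise” means under the differential privacy formalism the paper adopts, here is a minimal sketch of the classical Laplace mechanism applied to a released summary. The bounds, ε value, and dataset below are purely illustrative assumptions of mine, unrelated to the paper's actual synthetic generator:

```python
import numpy as np

def laplace_mechanism(value, sensitivity, epsilon, rng):
    """Release value with Laplace noise calibrated for epsilon-DP."""
    return value + rng.laplace(scale=sensitivity / epsilon)

rng = np.random.default_rng(42)
data = rng.uniform(0, 1, size=100)   # private records, assumed bounded in [0, 1]
true_mean = data.mean()
# sensitivity of the mean of n records bounded in [0, 1] is 1/n
private_mean = laplace_mechanism(true_mean, sensitivity=1 / len(data),
                                 epsilon=1.0, rng=rng)
```

The tighter (smaller) ε is made, the larger the noise scale, which is precisely why conservative privacy constraints cost learning points.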

On the side, I also like the alternative of opposing data keeper and learner rather than data owner and adversary. Learning here means taking an optimal B decision about the actual data, averaged over the true DGP, with a prior on the distribution of the actual data, while being unable to avoid misspecification in representing the (marginal) synthetic generation model.

Unsurprisingly, the alternative approach relies on proper scoring rules, as in Bissiri et al. (2016): rather than finding the distribution KL-closest to the synthetic generative model, the update is robustified by generalised B inference, either via downweighting or via β-divergence, with a preference for the latter. Interestingly, the authors consider the optimal learning size for the synthetic data, since bringing in more synthetic data does not mean better performance.
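To make the robustification concrete, here is a toy sketch (my own, not the authors' code) of a generalised/Gibbs posterior on a grid for a Gaussian location model, comparing the standard log score with the β-divergence loss; the β value, contamination, and prior are illustrative assumptions:

```python
import numpy as np

def gibbs_posterior_mean(x, beta=None, sigma=1.0, grid=None):
    """Gibbs-posterior mean for a Gaussian location model, on a grid.

    beta=None uses the log score (standard Bayes, KL-closest fit);
    beta>0 uses the beta-divergence loss of generalised Bayesian updating.
    """
    if grid is None:
        grid = np.linspace(-5, 5, 2001)
    th = grid[:, None]                               # candidate location values
    dens = np.exp(-(x - th) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)
    if beta is None:                                 # cumulated negative log-likelihood
        loss = -np.log(dens).sum(axis=1)
    else:                                            # beta-divergence loss, Gaussian closed form
        integral = (2 * np.pi * sigma ** 2) ** (-beta / 2) / np.sqrt(beta + 1)
        loss = (-(dens ** beta) / beta + integral / (beta + 1)).sum(axis=1)
    logpost = -loss - grid ** 2 / 200                # vague N(0, 100) prior on the location
    logpost -= logpost.max()
    w = np.exp(logpost)
    return (grid * w).sum() / w.sum()

rng = np.random.default_rng(0)
x = np.append(rng.normal(0.0, 1.0, 50), 20.0)        # N(0,1) sample plus one gross outlier
kl_mean = gibbs_posterior_mean(x)                    # standard Bayes, dragged by the outlier
robust_mean = gibbs_posterior_mean(x, beta=0.5)      # beta-divergence, essentially ignores it
```

The β-divergence loss bounds the influence of any single observation (its density term vanishes far from the bulk), which is the robustness message of point 2 above.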

step-dads with Bayesian design [One World ABC’minar, 21 March]

Posted in Books, Statistics, University life on March 18, 2024 by xi'an

The next One World ABC seminar is taking place (on-line, requiring pre-registration) on Thursday 21 March, 9:00am UK time, with Desi Ivanova (University of Oxford), speaking about Step-DAD: Semi-Amortized Policy-Based Bayesian Experimental Design:

We develop a semi-amortized, policy-based, approach to Bayesian experimental design (BED) called Step-wise Deep Adaptive Design (Step-DAD). Like existing, fully amortized, policy-based BED approaches, Step-DAD trains a design policy upfront before the experiment. However, rather than keeping this policy fixed, Step-DAD periodically updates it as data is gathered, refining it to the particular experimental instance. This allows it to improve both the adaptability and the robustness of the design strategy compared with existing approaches.
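As I read the abstract, the semi-amortized loop could be caricatured as follows; the policy, simulator, and information-gain proxy below are all placeholder toys of mine, and only the structure (amortized pretraining, then step-wise refinement on gathered data) mirrors Step-DAD:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(theta, design):
    """Toy experiment: noisy outcome whose informativeness depends on the design."""
    return theta * design + rng.normal(scale=0.1)

def policy(history, w):
    """Drastically simplified design policy: a one-parameter map of the last outcome."""
    last_y = history[-1][1] if history else 0.0
    return np.tanh(w * (1.0 + last_y)) + 1.0         # designs constrained to (0, 2)

def expected_gain(w, history, n_sim=200):
    """Crude Monte Carlo proxy for the expected information gain of the policy."""
    design = policy(history, w)
    thetas = rng.normal(size=n_sim)                  # draws from the (toy) prior
    return np.var(thetas * design)                   # larger spread = better separation

# amortized phase: tune the policy upfront, before any data is collected
w = max(np.linspace(-2, 2, 41), key=lambda w: expected_gain(w, []))

history, theta_true = [], 1.5
for step in range(4):
    design = policy(history, w)
    y = simulate(theta_true, design)
    history.append((design, y))
    # step-wise (semi-amortized) phase: re-tune the policy on the data so far
    w = max(np.linspace(-2, 2, 41), key=lambda w: expected_gain(w, history))
```

The contrast with fully amortized DAD is the last line inside the loop: the policy is no longer frozen after pretraining.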

(Which reminded me of George’s book on design in 2008.)