Archive for marginal likelihood

“Approximating evidence via bounded harmonic means” is out! [in Statistics and Computing]

Posted in Books, R, Statistics, University life on April 24, 2026 by xi'an

ECMLE on CRAN

Posted in R, Statistics, University life on March 27, 2026 by xi'an


When Is Generalized Bayes Bayesian?

Posted in Statistics, University life, Books on February 13, 2026 by xi'an

I spotted this title among Monday's new arXiv postings. When Is Generalized Bayes Bayesian? A Decision-Theoretic Characterization of Loss-Based Updating by Kenichiro McAlinn & Kōsaku Takanashi discusses the decision-theoretic consequences of generalized Bayes approaches based on losses and shows that decisions based on a loss-based posterior coincide with those of ordinary Bayes if and only if the loss is essentially a negative log-likelihood (leading to a belief posterior). This is not very surprising in that, otherwise, there is no Bayesian update delivering the generalised Bayes pseudo-posteriors (a point that can be traced back to a 2007 result of Catoni). The authors also demonstrate that generalized marginal likelihoods do not deliver evidence for decision posteriors, hence that Bayes factors are not well-defined in this context, which reminds me of our warning for ABC model choice. However, the reason here is much more mundane, as it is due to the decision posterior failing to identify the normalising constant Z(x) outside belief posteriors. The paper concludes with a coherence table, reproduced above.
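To see the "if" direction of the characterization at work, here is a small numerical sketch of my own (a toy, not the paper's setting): a Gibbs pseudo-posterior π_ℓ(θ|x) ∝ exp{−ℓ(θ,x)} π(θ) on a grid, which coincides with the ordinary Bayes posterior exactly when the loss is the negative log-likelihood, and differs for any other loss.

```python
import numpy as np

# Toy illustration (mine, not the paper's): Gibbs posterior
#   pi_ell(theta | x) ∝ exp(-ell(theta, x)) pi(theta)
# on a grid, for a N(theta, 1) sample with a N(0, 4) prior.
rng = np.random.default_rng(0)
x = rng.normal(1.0, 1.0, size=20)            # observed sample
theta = np.linspace(-4, 6, 2001)             # grid over the parameter
prior = np.exp(-theta**2 / 8)                # N(0, 4) prior, unnormalised

# loss = negative log-likelihood => Gibbs posterior IS the Bayes posterior
nll = 0.5 * ((x[:, None] - theta[None, :])**2).sum(axis=0)
gibbs = np.exp(-(nll - nll.min())) * prior
gibbs /= gibbs.sum()

loglik = -nll                                 # up to an additive constant
bayes = np.exp(loglik - loglik.max()) * prior
bayes /= bayes.sum()
assert np.allclose(gibbs, bayes)              # identical on the grid

# any other loss (absolute error here) yields a genuinely different
# pseudo-posterior, with no Bayesian update behind it
abs_loss = np.abs(x[:, None] - theta[None, :]).sum(axis=0)
pseudo = np.exp(-(abs_loss - abs_loss.min())) * prior
pseudo /= pseudo.sum()
assert not np.allclose(pseudo, bayes)
```

Note that each pseudo-posterior above is only defined after grid normalisation, which is precisely the point about the normalising constant Z(x) being unidentified outside belief posteriors.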

estimating evidence redux

Posted in Books, Statistics, University life on November 21, 2025 by xi'an

Following our arXival of the new version of our HPD-based Gelfand & Dey estimator of the evidence, I got pointed at Wang et al. (2018), which I had forgotten I had read at the time (as testified by an ‘Og entry). Reading my own comments, I concur (with myself!) that the method is not massively compelling since it requires a partition set that is strongly related with the targeted integral. The above illustration for a mixture, that is, for a pseudo-posterior that is a mixture of two Gaussian components with known variance, also shows (in reverse) the curse of dimension and the need for finely tuned partitions, said partition corresponding to the myriad of sets on the rhs. With such a degree of partitioning, Riemann integration should also produce a perfect estimate, as shown by the zero error in the resulting estimator (Table 4).
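For readers unfamiliar with the underlying identity, here is a minimal sketch (my own toy, not the arXived implementation) of the Gelfand & Dey estimator 1/Z = E_post[ g(θ) / {L(θ)π(θ)} ], with g supported on a high-posterior-density interval so that the ratio stays bounded, in a conjugate normal model where the evidence is available in closed form.

```python
import numpy as np
from scipy import stats

# Toy conjugate model (mine): x ~ N(theta, 1), theta ~ N(0, 1),
# single observation, so the evidence is m(x) = N(x; 0, 2) exactly
# and the posterior is N(x/2, 1/2).
rng = np.random.default_rng(1)
x = 0.7
mu, sd = x / 2, np.sqrt(0.5)
post = stats.norm(mu, sd)
draws = post.rvs(size=100_000, random_state=rng)

# g: the posterior truncated to its central 95% (here also HPD) interval,
# so the Gelfand & Dey ratio is bounded on the support of g
lo, hi = post.ppf(0.025), post.ppf(0.975)
g = stats.truncnorm((lo - mu) / sd, (hi - mu) / sd, loc=mu, scale=sd)

num = g.pdf(draws)                                   # zero outside [lo, hi]
den = stats.norm(draws, 1).pdf(x) * stats.norm(0, 1).pdf(draws)
z_hat = 1 / np.mean(num / den)                       # Gelfand & Dey estimate

z_true = stats.norm(0, np.sqrt(2)).pdf(x)            # exact evidence
assert abs(z_hat - z_true) / z_true < 0.01
```

Because g is proportional to the posterior on its support, the ratio is piecewise constant and the estimator's only Monte Carlo noise comes from the fraction of draws landing in the interval, which is what bounding buys over the raw harmonic mean.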

model uncertainty and missing data: an objective BAyesian perspective

Posted in Books, Statistics, Travel, University life on September 16, 2025 by xi'an

My Spanish and objective Bayesian friends Gonzalo García-Donato, María Eugenia Castellanos, Stefano Cabras, Alicia Quirós, and Anabel Forte wrote a fairly exciting paper in BA that is open to discussion (for a few more days), to be discussed on 05 November (4:00 PM UTC | 11:00 AM EST | 5:00 PM CET).

The interplay between missing data and model uncertainty—two classic statistical problems—leads to primary questions that we formally address from an objective Bayesian perspective. For the general regression problem, we discuss the probabilistic justification of Rubin’s rules applied to the usual components of Bayesian variable selection, arguing that prior predictive marginals should be central to the pursued methodology. In the regression settings, we explore the conditions of prior distributions that make the missing data mechanism ignorable, provided that it is missing at random or completely at random. Moreover, when comparing multiple linear models, we provide a complete methodology for dealing with special cases, such as variable selection or uncertainty regarding model errors. In numerous simulation experiments, we demonstrate that our method outperforms or equals others, in consistently producing results close to those obtained using the full dataset. In general, the difference increases with the percentage of missing data and the correlation between the variables used for imputation.

The so-called Rubin’s identity is simply the representation of the posterior probability of a model γ given the observed data x⁰, p(γ|x⁰), as the integrated posterior probability of a model given both observed and latent data,  p(γ|x⁰, x¹), against the marginal of latent x¹ given observed x⁰. Since this marginal involves the probabilities p(γ|x⁰), this representation is not directly useful for a numerical implementation.
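In symbols (my notation), the representation reads

```latex
p(\gamma \mid x^{0})
  = \int p(\gamma \mid x^{0}, x^{1})\, p(x^{1} \mid x^{0})\,\mathrm{d}x^{1},
\qquad
p(x^{1} \mid x^{0})
  = \sum_{\gamma'} p(x^{1} \mid x^{0}, \gamma')\, p(\gamma' \mid x^{0}),
```

and the second display makes the circularity explicit: the mixing density of the latent x¹ already involves the target probabilities p(γ|x⁰).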

In this paper, missingness relates to some entries of either the covariates or the response variate, which is less common but more realistic, especially if some covariates do not contribute to the response. (The missingness mechanism does not matter if the data is missing at random, à la Rubin.) The computational solution (p9) is rather standard, simulating the missing variables given the observed variables. In my opinion, the elephant in the room is the super-delicate selection of a prior distribution on the missing covariates, as methinks this impacts the actual value of the Bayes factor in a considerable manner, hence the selection of the surviving model. (As a side remark, we are credited in Celeux et al. (2006) with having “extended DIC for missing data models or when missing data were present”, but our point was rather to stress the arbitrariness of the very definition of DIC in such contexts.)
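The flavour of that standard solution can be checked on a toy of my own devising (not the paper's code), small enough that the missing covariate can be integrated out exactly on a grid: averaging the conditional model probabilities against the distribution of the missing entry given the observed data recovers the exact model posterior.

```python
import numpy as np
from scipy import stats

# Hypothetical toy: two models for y given a covariate z whose last entry
# is missing; slope fixed to 1 so no parameter integration is needed.
#   Model 1: y_i ~ N(0, 1)        Model 2: y_i ~ N(z_i, 1)
# Prior on the missing z5 is N(0, 1); equal prior model probabilities.
rng = np.random.default_rng(2)
z = rng.normal(size=5)
y = z + rng.normal(size=5)                    # data generated from model 2
z_obs = z[:4]
grid = np.linspace(-5, 5, 4001)               # grid for the missing z5
dz = grid[1] - grid[0]

fixed1 = stats.norm(0, 1).pdf(y).prod()       # model 1 likelihood (no z)
fixed2 = stats.norm(z_obs, 1).pdf(y[:4]).prod()   # model 2, observed part
lik1 = np.full(grid.size, fixed1)             # constant in z5
lik2 = fixed2 * stats.norm(grid, 1).pdf(y[4]) # varies with z5
prior_z5 = stats.norm(0, 1).pdf(grid)

# exact model posterior: integrate z5 out under each model
m1 = (lik1 * prior_z5).sum() * dz
m2 = (lik2 * prior_z5).sum() * dz
exact = m2 / (m1 + m2)                        # p(model 2 | observed data)

# Rubin-style representation: average p(gamma | obs, z5) over p(z5 | obs),
# a mixture over models -- hence the circularity noted above
p_z5 = (lik1 + lik2) * prior_z5
p_z5 /= p_z5.sum() * dz
cond = lik2 / (lik1 + lik2)                   # p(model 2 | obs, z5)
rubin = (cond * p_z5).sum() * dz
assert abs(rubin - exact) < 1e-8
```

The agreement is exact by construction; what a realistic implementation must add is the simulation of z5 from that mixture within a Gibbs-type scheme, which is where the prior on the missing covariates enters with full force.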

“The standard Bayesian method for addressing the absence of prior information uses improper distributions. In estimation problems (the model is fixed), the impropriety of priors does not imply any additional difficulty as long as the posterior is proper” (p9)

The authors point out the well-known difficulty with improper priors but still resort to improper priors on the parameters shared by all models—which I dispute as being adequate, despite the arguments put forward on p15, right Haar measure or not—while sticking to proper priors on the model-dependent parameters. Which unsurprisingly become Zellner’s g-priors. Or rather g′-priors, although the discussion seems to resolve into the (model-free) factor g′ being equal to 1, as for the g-priors. Again a strong input in the derivation of the Bayes factor.
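For reference, the standard Zellner g-prior on the coefficients β_γ of model γ (textbook form, not necessarily the paper's exact g′ variant) is

```latex
\beta_{\gamma} \mid \sigma^{2}, \gamma
  \;\sim\; \mathcal{N}\!\big(0,\; g\,\sigma^{2}\,(X_{\gamma}^{\mathsf T} X_{\gamma})^{-1}\big),
\qquad
\pi(\alpha, \sigma^{2}) \propto \sigma^{-2},
```

with the common intercept α and the variance σ² receiving the improper prior shared across all models, which is precisely where my reservation lies.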