Archive for noninformative priors
on(-line) integral priors for model selection
Posted in Books, Statistics, University life with tags Bayes factor, Bayesian model selection, collaboration, ergodicity, improper priors, integral priors, International Statistical Review, ISI, Juan Antonio Cano, Markov chains, MCMC, noninformative priors, open access, paper, reference priors on February 27, 2026 by xi'an
integral priors for model comparison [2.0]
Posted in Books, Statistics, University life with tags arXiv, Bayesian inference, Bayesian model choice, Bayesian statistics, ergodicity, imaginary training sample, integral priors, Juan Antonio Cano, Markov chain, MCMC, Murcia, nested models, noninformative priors, reference priors, Spain, Universidad de Murcia on May 2, 2025 by xi'an
integral priors for multiple comparison
Posted in Books, Statistics, University life with tags ergodicity, imaginary training sample, improper posteriors, improper priors, intrinsic Bayes factor, Juan Antonio Cano, Markov chain, noninformative priors, null recurrence, pseudo-Bayes factors, reference priors, transience on June 24, 2024 by xi'an
Diego Salmerón and I just arXived a paper on integral priors for multiple model comparison, about deriving reference priors for multiple hypothesis testing. As (so-called) noninformative priors constructed for estimation purposes are usually not appropriate for model selection and testing, due to their impropriety, Jeffreys-Lindley paradoxes, and the like, the methodology of integral priors was developed to produce prior distributions for Bayesian model selection when comparing two models, by modifying initial improper reference priors. This paper proposes a generalisation of this methodology when more than two models are to be compared. In order to avoid the above paradoxes and the associated possibility of producing a null recurrent or transient Markov chain, our approach adds an artificial copy of each model under comparison, by compactifying the corresponding parameter space, and creates an ergodic Markov chain exploring all models, whose stationary joint distribution returns the integral priors as its marginals. Besides the guaranteed existence of these integral priors and the disappearance of the paradoxes that plague estimation reference priors, an additional perk of this methodology is that simulating this Markov chain is straightforward, as it only requires simulating imaginary training samples and drawing from the corresponding posterior distributions, for all models, while producing Bayes factor approximations on the side. This renders its implementation automatic and generic, both in the nested and in the nonnested cases. We associated our late friend Juan Antonio Cano with this paper as he was instrumental in initiating both this collaboration and the methodology at its core.
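As a crude illustration of the underlying mechanism, here is a minimal Python sketch of the two-model integral-prior Markov chain (not the multiple-model compactified version of the paper), for a toy pair of my own choosing: M₁: x ~ N(θ,1) under a flat reference prior, against M₂: x ~ N(0,σ²) under π(σ²) ∝ 1/σ², with imaginary training samples of size one.

```python
import numpy as np

rng = np.random.default_rng(0)

# two-model integral-prior chain (toy setup, size-one training samples):
#   M1: x ~ N(theta, 1),  pi(theta)  prop. to 1        => theta | z ~ N(z, 1)
#   M2: x ~ N(0, sigma2), pi(sigma2) prop. to 1/sigma2 => sigma2 | z ~ InvGamma(1/2, z**2/2)
T = 100_000
theta = np.empty(T)
sigma2 = np.empty(T)
theta[0], sigma2[0] = 0.0, 1.0

for t in range(1, T):
    # imaginary training sample from M1, then a posterior draw for M2
    z1 = rng.normal(theta[t - 1], 1.0)
    sigma2[t] = (z1**2 / 2) / rng.gamma(0.5)   # InvGamma(1/2, z1**2/2) draw
    # imaginary training sample from M2, then a posterior draw for M1
    z2 = rng.normal(0.0, np.sqrt(sigma2[t]))
    theta[t] = rng.normal(z2, 1.0)

# the stationary marginals of (theta, sigma2), when they exist, are the
# integral priors; positive recurrence is *not* guaranteed in this raw
# two-model version, which is precisely the issue the compactification
# step of the paper addresses
print(np.quantile(theta[T // 2:], [0.05, 0.5, 0.95]))
print(np.quantile(sigma2[T // 2:], [0.05, 0.5, 0.95]))
```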
a case for Bayesian deep learning
Posted in Books, pictures, Statistics, Travel, University life with tags Bayesian foundations, Bayesian model choice, Bayesian neural networks, Bayesian variable selection, Berlin Tegel flughafen, marginalisation, model uncertainty, noninformative priors, normalisation, objective Bayes, snowstorm on September 30, 2020 by xi'an
Andrew Wilson wrote a piece about Bayesian deep learning last winter. Which I just read. It starts with the (posterior) predictive distribution being the core of Bayesian model evaluation or of model (epistemic) uncertainty.
“On the other hand, a flat prior may have a major effect on marginalization.”
Interesting sentence, as, from my viewpoint, using a flat prior is a no-no when running model evaluation, since the marginal likelihood (or evidence) is then no longer a probability density. (Check the Jeffreys-Lindley paradox in this tribune.) The author then argues in favour of a Bayesian approach to deep neural networks, for the reason that data cannot be informative on every parameter in the network, which should then be integrated out wrt a prior. He also draws a parallel between deep ensemble learning, where random initialisations produce different fits, and posterior distributions, although the equivalent of the prior distribution in an optimisation exercise is somewhat vague.
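To make the no-no concrete, a quick numerical check (a toy setup of mine, not from Wilson's piece): approximating a flat prior on θ by U(−c, c) for x ~ N(θ,1), the marginal likelihood scales like 1/(2c), so the Bayes factor in favour of a point null θ = 0 can be made arbitrarily large by the arbitrary choice of c, which is the Jeffreys-Lindley phenomenon.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

x = 1.5  # a single observation from N(theta, 1)

for c in (1, 10, 100, 1000):
    # evidence for M1: theta ~ U(-c, c), x | theta ~ N(theta, 1)
    m1, _ = quad(lambda t: norm.pdf(x, loc=t) / (2 * c), -c, c)
    # evidence for the point null M0: theta = 0
    m0 = norm.pdf(x, loc=0.0)
    # BF01 grows linearly in c: the "flat" prior drives the answer
    print(f"c = {c:5d}   m1(x) = {m1:.3e}   BF01 = {m0 / m1:.3e}")
```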
“…we do not need samples from a posterior, or even a faithful approximation to the posterior. We need to evaluate the posterior in places that will make the greatest contributions to the [posterior predictive].”
The paper also contains an interesting point distinguishing between priors over parameters and priors over functions, only the latter mattering for prediction. Which must be structured enough to compensate for the lack of data information about most aspects of these functions. The paper further discusses uninformative priors (over the parameters) in the O'Bayes sense as a default way to select priors. It is however unclear to me how this discussion accounts for the problems met in high dimensions by standard uninformative solutions. More aggressively penalising priors may be needed, such as those found in high-dimensional variable selection. As in, e.g., the 10⁷-dimensional space mentioned in the paper. Interesting read all in all!
how can a posterior be uniform?
Posted in Books, Statistics with tags cross validated, inverse cdf, Laplace's Demon, Laplace's prior, noninformative priors, prior distributions, uniform distribution on September 1, 2020 by xi'an
A bemusing question from X validated:
How can we have a posterior distribution that is a uniform distribution?
With the underlying message that a uniform distribution does not depend on the data, since it is uniform! While it is always possible to pick the parameterisation a posteriori so that the posterior is uniform, by simply using the inverse cdf transform, or to pick the prior a posteriori so that the prior cancels the likelihood function, there exist more authentic discrete examples of a data realisation leading to a uniform distribution, as e.g. in the Multinomial model. I deem the confusion to stem from the impression either that uniform means non-informative (what we could dub Laplace's daemon!) or that the posterior could remain uniform for all realisations of the sampled rv.
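To illustrate the inverse-cdf point, a quick numerical check on a toy Beta-Binomial setup of my own choosing: mapping the parameter through its own posterior cdf yields an exactly uniform posterior on (0,1), by the probability integral transform.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# toy Beta-Binomial model: p ~ U(0,1) prior, x | p ~ Binomial(n, p)
n, x = 20, 7
posterior = stats.beta(1 + x, 1 + n - x)  # posterior of p is Beta(8, 14)

# reparameterise a posteriori: eta = F(p | x), the posterior cdf of p
p_draws = posterior.rvs(10_000, random_state=rng)
eta_draws = posterior.cdf(p_draws)

# eta is uniform on (0,1) by the probability integral transform
print(stats.kstest(eta_draws, "uniform"))
```

Of course the transform itself depends on the data, which is exactly why the posterior of η being uniform carries no data-free meaning: a different realisation x would call for a different cdf.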

