importance sampling and independent Metropolis–Hastings with unbounded weights

George Deligiannidis, Pierre E. Jacob, El Mahdi Khribch, and Guanyang Wang just arXived a paper on the respective behaviours of importance sampling and independent Metropolis–Hastings (IMH) under the same proposal, when the importance weight is unbounded but enjoys a p-th moment with p≥2. The two algorithms share a lot, with importance sampling appearing as a rough Rao-Blackwellisation of Metropolis–Hastings, and its asymptotic variance being smaller than that of Metropolis–Hastings. I was unable to check whether or not their conditions encompass the highly interesting case when the integrand f is integrable under the target π but not in L²(π). (Theorem 2.3 does not seem to include this case.)
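For concreteness, here is a minimal Python sketch of the two estimators under a shared proposal, with a toy Gaussian pair of my own choosing rather than anything from the paper: target N(0, 4/3) and proposal N(0, 1), for which the weight is unbounded but has finite p-th moments for all p<4, so the p≥2 condition holds.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting (my choice, not the paper's): target pi = N(0, 4/3),
# proposal q = N(0, 1).  The weight w = pi/q is unbounded but
# E_q[w^p] < infinity exactly when p < 4.
s2 = 4.0 / 3.0

def log_w(x):
    # log importance weight log(pi(x)/q(x))
    return -0.5 * x**2 / s2 + 0.5 * x**2 - 0.5 * np.log(s2)

def snis(f, N):
    """Self-normalised importance sampling estimate of E_pi[f]."""
    x = rng.standard_normal(N)
    w = np.exp(log_w(x))
    return np.sum(w * f(x)) / np.sum(w)

def imh(f, T):
    """Independent Metropolis-Hastings with the same proposal q."""
    x = rng.standard_normal()
    lw = log_w(x)
    out = np.empty(T)
    for t in range(T):
        y = rng.standard_normal()
        lwy = log_w(y)
        # accept with probability min(1, w(y)/w(x))
        if np.log(rng.uniform()) < lwy - lw:
            x, lw = y, lwy
        out[t] = f(x)
    return out.mean()

f = lambda x: x**2                   # E_pi[f] = 4/3
print(snis(f, 10_000), imh(f, 10_000))
```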

They consider a particular (!) version of IMH where N iid proposed values are drawn at once and accepted or rejected (again at once) with an acceptance ratio based on the average of the weights. Although this version is already found in a 2010 paper by Christophe Andrieu and co-authors, and stems from an unbiased importance sampler, I was not aware of it. My initial feeling was (predictably) pessimistic but, thinking about it, using the average weight brings into the sample simulations with small weights that would otherwise be discarded. Of course, a rejection proves N times more costly. But this is truly a form of Rao-Blackwellisation in the sense that it removes the weight variability to some extent (see p.5) and it turns the outcome into an unbiased estimator, despite the self-normalising behaviour! They also conclude that the rejection probability is at least c/√N on average (Remark 4.1).
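A hedged sketch of this blocked version, as I read the description above (the block-wise acceptance rule and the within-block estimate are my reading, not code from the paper), reusing log_w and rng from the previous snippet:

```python
def block_imh(f, N, T):
    """N-proposals-at-once IMH: draw N iid proposals, accept or reject
    the whole block with ratio of average weights (a sketch of the
    scheme described in the post, not the authors' code)."""
    x = rng.standard_normal(N)
    w = np.exp(log_w(x))
    est = np.empty(T)
    for t in range(T):
        y = rng.standard_normal(N)
        wy = np.exp(log_w(y))
        # accept the whole block with probability min(1, mean(w_y)/mean(w_x))
        if rng.uniform() < wy.mean() / w.mean():
            x, w = y, wy
        # within-block self-normalised average: all N draws contribute,
        # including those with small weights
        est[t] = np.sum(w * f(x)) / np.sum(w)
    return est.mean()
```

Note how the acceptance decision only sees the average weight, while the within-block weighting recycles every proposed value, which is where the Rao-Blackwellisation flavour comes from.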

“We show that the bias of self-normalized importance sampling is of order N^{-1}, and we obtain new bounds on the moments of the error in importance sampling. We then consider IMH, and show that the common random numbers coupling is optimal. Using this coupling, we show that the total variation distance between IMH at iteration t and π decays as t^{1-p}.”
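The common random numbers coupling in the quote amounts to feeding two IMH chains the same proposal and the same uniform at every step, so that they coalesce as soon as both accept; the distribution of that meeting time is what drives total variation bounds. A toy demonstration (again reusing log_w and rng, with arbitrary starting points of my choosing):

```python
def crn_coupled_imh(T):
    """Two IMH chains driven by common random numbers: each step both
    chains see the same proposal y and the same uniform u, so once
    both accept they coalesce and stay together forever."""
    x1, x2 = rng.standard_normal(), 5.0   # arbitrary distinct starts
    for t in range(1, T + 1):
        y = rng.standard_normal()
        log_u = np.log(rng.uniform())
        if log_u < log_w(y) - log_w(x1):
            x1 = y
        if log_u < log_w(y) - log_w(x2):
            x2 = y
        if x1 == x2:
            return t                       # meeting time
    return None                            # no meeting within T steps
```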

They also compare the biases of sampling importance resampling and independent Metropolis–Hastings, with the latter getting the upper hand, but I do not see the justification for resampling when computing an integral, since it does not produce a sample from the target, especially when the weights are unbounded, and it adds to the variability of the estimator. They further propose a (telescopic) unbiased modification of the self-normalised importance sampling estimator, with an inefficiency twice as high. But a neat Rao-Blackwellisation trick brings it back to the same level!
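I do not know the authors' exact telescopic construction, but a generic single-term Rhee–Glynn debiasing conveys the flavour: write the limit of the SNIS estimator as E[S_{N₀}] plus a telescoping sum of corrections at doubled sample sizes, and pay for unbiasedness with a randomised truncation. A sketch under these assumptions, reusing snis, log_w and rng from above:

```python
def unbiased_snis_sketch(f, N0=128, q_geom=0.25):
    """Single-term Rhee-Glynn telescoping applied to SNIS (a generic
    stand-in for, not a copy of, the paper's construction).
    With N_k = N0 * 2**k, Delta_0 = S_{N_0}, and
    Delta_k = S_{N_k} - S_{N_{k-1}}, the estimator Delta_K / P(K=k)
    has expectation sum_k E[Delta_k] = E_pi[f], since the SNIS bias
    is O(1/N); q_geom < 1/2 keeps the variance summable."""
    K = int(rng.geometric(q_geom)) - 1        # K in {0, 1, 2, ...}
    pK = q_geom * (1.0 - q_geom) ** K         # P(K = k)
    if K == 0:
        return snis(f, N0) / pK
    N_hi, N_lo = N0 * 2**K, N0 * 2**(K - 1)
    x = rng.standard_normal(N_hi)             # common draws across levels
    w = np.exp(log_w(x))
    S_hi = np.sum(w * f(x)) / np.sum(w)
    S_lo = np.sum(w[:N_lo] * f(x[:N_lo])) / np.sum(w[:N_lo])
    return (S_hi - S_lo) / pK
```

Sharing the draws between the two levels keeps each correction term small, which is what makes the randomised truncation affordable.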
