Archive for astrostatistics

Tanuka Chattopadhyay (26 Jan 1963 – 16 Oct 2023)

Posted in Statistics, Travel, University life on November 25, 2023 by xi'an

Most sadly, I learned today that applied mathematics Professor Tanuka Chattopadhyay, from the University of Calcutta, had passed away last month. We had been briefly collaborating after I met her and her husband Asis Chattopadhyay, Professor of Statistics at the same University of Calcutta, following a Franco-Indian workshop in Bangalore, discussing research directions in astrostatistics in Kolkata and later in Paris, when they visited. We had not been in touch for a few years and I was not aware she was severely ill. A dedicated researcher and mentor, who also contributed to the administration of the University in many ways, she was versed in classical Indian culture and I will keep the memory of our conversations, especially the one on New Year’s Eve in Kolkata.

What follows is an obituary written by Professor Ajit Kembhavi.

Professor Tanuka Chattopadhyay (née Kanjilal), a member of the Astronomical Society of India, passed away on October 16th this year, at the age of 60, after a long and heroic battle with cancer.  She was a Professor in  the Department of Applied Mathematics of the University of Calcutta.  She  has been Head of the Department and Dean, Faculty of Science, Calcutta University. At her demise she was  Director, Center for Research in Nanoscience and Nanotechnology of the University.

Tanuka was from  an early age very interested in mathematics. To take that interest further, she joined Presidency College, Kolkata from where she completed her degree  in  Mathematics with  Honours in 1983. For post graduate studies she joined the Department of Applied Mathematics  where she got interested in Astrophysics. She later worked for a  Ph.D. under the supervision of Professor Baidyanath Basu on Explosion in the central region of the Galaxy, the consequent star formation, formation of molecular clouds and structure of shocks.

Over the years Tanuka worked on a variety of topics. Her earliest  papers were on  squalls, thunderstorms and convective instability.  From these terrestrial considerations her interest soon shifted to molecular clouds, star formation and density waves in the Galaxy. In later years she worked on globular clusters, galaxies with special interest in dwarf and ultra-compact galaxies, Gamma-ray bursts, statistical simulation and computation.  In much of her work she used advanced statistical methods to get the most out of available astronomical data.  She has published about 50 papers in international journals, and written four books,  including Statistical Methods for Astronomical Data Analysis written with her husband Professor Asis Kumar Chattopadhyay, published by Springer and awarded an outstanding publication award in Astrostatistics  by the International Astrostatistics Association. Tanuka successfully completed several national and international funded projects and was a frequent visitor to several leading universities in Canada, France and the USA, where she had long collaborations.  She  supervised a number of Ph.D.  students and was a respected teacher and mentor.
Tanuka was a Visiting Associate of IUCAA, Pune from 2002 until she passed away. I met her during her first visits to IUCAA, and  we soon became friends and collaborators.  We were later joined by her husband Asis Chattopadhyay,  who is a Professor at the Department of Statistics in the University of Calcutta and was a Pro-Vice Chancellor and Acting Vice Chancellor of the University.  Asis soon became a Visiting Associate of IUCAA and the three of us organised a number of workshops in Kolkata and other places on statistical methods and their application to the analysis of astronomical data.  Tanuka was a Fellow of West Bengal Academy of Science and member of several organisations including  the  International Astronomical Union (IAU), Astronomical Society of India, and the International Astrostatistics Association.
Using funds obtained under the DST PURSE programme, Tanuka obtained a 14” optical telescope. This was installed in 2015 on the terrace of a building in Rajabazar College and was used for sky watching, as well as for carrying out research projects, by students of the Department of Applied Mathematics and other departments of the University. She was also planning to start an outreach programme for school children in Kolkata. Making the telescope available was an exemplary initiative from a person who was a mathematician and a theoretical astrophysicist.
In spite of her illness, which stretched over several years, Tanuka remained brave, cheerful and full of spirit, and continued to work very hard.  A few months ago, she, Dipankar Banerjee and I were speakers at a meeting celebrating the birth centenary of Professor M. K. Dasgupta.  It was impossible for me to then  believe that the dynamic speaker was gravely ill, with further treatment not really possible.  She talked to me about a visit to IUCAA in December and I was sure she would really come.  But about five weeks after the event, she had to be admitted to hospital one night, after having spent the evening attending a meeting in the Vice-Chancellor’s office.  She did not return.
Tanuka  had many interests outside astrophysics.  She wrote two Bengali novels  Bibashan and Gandhari Santoti.  She was  completing her first English novel when she passed away, which her family hopes to publish posthumously. She had an artistic nature, had formal training in music and dance and was greatly interested in  travel and culture. She was a wonderful wife, mother and mother-in-law,  a great hostess and beloved friend, and respected guide and mentor to many.  She will be greatly missed.
Ajit Kembhavi
IUCAA and Pune Knowledge Cluster

astrostat webinar [IAU-IAA]

Posted in pictures, Statistics, University life on June 14, 2023 by xi'an

Yesterday, I gave a talk on inferring the number of components in a mixture at the international online IAU-IAA Astrostats and Astroinfo seminar. Which generated (uniformly) interesting and relevant questions for astronomical challenges. As pointed out by my Cornell friend Tom Loredo, it unfortunately clashed with the ISI quadrennial Statistical Challenges in Modern Astronomy meeting held at Penn State.

Natural nested sampling

Posted in Books, Statistics, University life on May 28, 2023 by xi'an

“The nested sampling algorithm solves otherwise challenging, high-dimensional integrals by evolving a collection of live points through parameter space. The algorithm was immediately adopted in cosmology because it partially overcomes three of the major difficulties in Markov chain Monte Carlo, the algorithm traditionally used for Bayesian computation. Nested sampling simultaneously returns results for model comparison and parameter inference; successfully solves multimodal problems; and is naturally self-tuning, allowing its immediate application to new challenges.”

I came across a review on nested sampling in Nature Reviews Methods Primers of May 2022, with a large number of contributing authors, some of whom I knew from earlier papers in astrostatistics. As illustrated by the above quote from the introduction, the tone is definitely optimistic about the capacities of the method, reproducing the original argument that the evidence is the expectation of the likelihood L(θ) under the prior. Which representation, while valid, does not translate into a dimension-free methodology since parameters θ still need to be simulated.
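For reference, the representation alluded to above can be written as follows. This is a sketch in my own notation (π for the prior density, X(λ) for the prior mass of the likelihood level set above λ, and L(X) for the inverse of that function), not necessarily the notation of the review.

```latex
Z \;=\; \int L(\theta)\,\pi(\theta)\,\mathrm{d}\theta
  \;=\; \mathbb{E}^{\pi}\!\left[L(\theta)\right],
\qquad
X(\lambda) \;=\; \int_{\{\theta:\,L(\theta)>\lambda\}} \pi(\theta)\,\mathrm{d}\theta,
\qquad
Z \;=\; \int_0^1 L(X)\,\mathrm{d}X .
```

The last integral is one-dimensional, but evaluating L(X) along the way still requires producing values of θ in the original parameter space, which is where the dimension comes back.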

“Nested sampling lies in a class of algorithms that form a path of bridging distributions and evolves samples along that path. Nested sampling stands out because the path is automatic and smooth — compression along log X by, on average, 1/𝑛 at each iteration — and because along the path is compressed through constrained priors, rather than from the prior to the posterior. This was a motivation for nested sampling as it avoids phase transitions — abrupt changes in the bridging distributions — that cause problems for other methods, including path samplers, such as annealing.”

The elephant in the room is eventually addressed, namely the simulation from the prior constrained to the likelihood level sets, which in my experience (with, e.g., mixture posteriors) proves most time consuming. This stems from the fact that these level sets are notoriously difficult to evaluate from a given sample: all points stand within the set but they hardly provide any indication of the boundaries of said set… Region sampling requires constructing a region that bounds the likelihood level set, which requires some knowledge of the likelihood variations to have a chance to remain efficient, including in cosmological applications, while regular MCMC steps require an increasing number of steps as the constraint gets tighter and tighter. For otherwise it essentially amounts to duplicating a live particle.
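To make the cost of the constrained simulation concrete, here is a minimal sketch of nested sampling on a toy problem, assuming a Uniform(-5, 5) prior, a standard Normal likelihood, and the crudest possible constrained sampler (rejection from the prior). All names and settings are mine and purely illustrative; the prior volumes are taken as the deterministic values exp(-i/n) rather than simulated.

```python
import math
import random


def log_likelihood(theta):
    """Toy likelihood: standard Normal density evaluated at theta."""
    return -0.5 * theta * theta - 0.5 * math.log(2.0 * math.pi)


def sample_prior(rng):
    """Toy prior: Uniform(-5, 5)."""
    return rng.uniform(-5.0, 5.0)


def nested_sampling(n_live=100, n_iter=600, seed=1):
    """Return (evidence estimate, number of prior draws used at each iteration)."""
    rng = random.Random(seed)
    live = [sample_prior(rng) for _ in range(n_live)]
    live_logl = [log_likelihood(t) for t in live]
    evidence, x_prev = 0.0, 1.0
    draws_per_iter = []
    for i in range(1, n_iter + 1):
        # Remove the live point with the lowest likelihood.
        worst = min(range(n_live), key=live_logl.__getitem__)
        logl_min = live_logl[worst]
        # Deterministic approximation of the remaining prior volume.
        x_i = math.exp(-i / n_live)
        evidence += math.exp(logl_min) * (x_prev - x_i)
        x_prev = x_i
        # Constrained prior simulation by plain rejection: the expensive part.
        draws = 0
        while True:
            draws += 1
            theta = sample_prior(rng)
            if log_likelihood(theta) > logl_min:
                break
        live[worst] = theta
        live_logl[worst] = log_likelihood(theta)
        draws_per_iter.append(draws)
    # Contribution of the remaining live points.
    evidence += x_prev * sum(math.exp(l) for l in live_logl) / n_live
    return evidence, draws_per_iter


if __name__ == "__main__":
    z, draws = nested_sampling()
    print("evidence estimate:", z)
    print("prior draws, first 100 iterations:", sum(draws[:100]))
    print("prior draws, last 100 iterations:", sum(draws[-100:]))
```

In this toy case the true evidence is close to 0.1, and the acceptance rate of the rejection step decreases roughly like the remaining prior volume, which is why practical implementations resort to bounding regions or MCMC moves instead.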

parallel tempering on optimised paths

Posted in Statistics on May 20, 2021 by xi'an


Saifuddin Syed, Vittorio Romaniello, Trevor Campbell, and Alexandre Bouchard-Côté, whom I met and had discussions with on my “last” trip to UBC, in December 2019, just arXived a paper on parallel tempering (PT), making the choice of tempering path an optimisation problem. They address the touchy issue of designing a sequence of tempered targets when the starting distribution π⁰, e.g., the prior, and the final distribution π¹, e.g., the posterior, are hugely different, e.g., almost singular.

“…theoretical analysis of reversible variants of PT has shown that adding too many intermediate chains can actually deteriorate performance (…) [while] on non reversible regime adding more chains is guaranteed to improve performances.”

The above applies to geometric combinations of π⁰ and π¹. Which “suffers from an arbitrarily suboptimal global communication barrier”, according to the authors (although the counterexample is not completely convincing since π⁰ and π¹ share the same variance). They propose a more non-linear form of tempering with constraints on the dependence of the powers on the temperature t∈(0,1). Defining the global communication barrier as an average over temperatures of the rejection rate, the path characteristics (e.g., the coefficients of a spline function) can then be optimised in terms of this objective. And the temperature schedule is derived from the fact that the non-asymptotic round trip rate is maximized when the rejection rates are all equal. (As a side item, the technique presented in the earlier tempering paper by Syed et al. was recently exploited for high-resolution imaging of a black hole from the M87 galaxy.)
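As an illustration of the equal-rejection principle only (and not of the path optimisation in the paper), here is a rough sketch on the geometric path between two univariate Normals, where each tempered target is itself Normal and can be sampled exactly. The two targets, the grid size, and the function names are my own assumptions: swap rejection rates are estimated by Monte Carlo over a fine grid, accumulated, and the schedule is placed at equally spaced levels of the cumulated rejection.

```python
import math
import random

# Assumed toy endpoints: a diffuse pi0 and a concentrated pi1.
MU0, SD0 = 0.0, 10.0
MU1, SD1 = 3.0, 0.5


def log_ratio(x):
    """W(x) = log pi1(x) - log pi0(x), up to an additive constant."""
    return -0.5 * ((x - MU1) / SD1) ** 2 + 0.5 * ((x - MU0) / SD0) ** 2


def path_sampler(t, rng):
    """Exact draw from the geometric path pi0^(1-t) * pi1^t (a Normal here)."""
    prec = (1.0 - t) / SD0 ** 2 + t / SD1 ** 2
    mean = ((1.0 - t) * MU0 / SD0 ** 2 + t * MU1 / SD1 ** 2) / prec
    return rng.gauss(mean, 1.0 / math.sqrt(prec))


def rejection_rate(t_lo, t_hi, rng, n=2000):
    """Monte Carlo estimate of the swap rejection rate between two temperatures."""
    total = 0.0
    for _ in range(n):
        x, y = path_sampler(t_lo, rng), path_sampler(t_hi, rng)
        log_acc = (t_hi - t_lo) * (log_ratio(x) - log_ratio(y))
        total += 1.0 - math.exp(min(0.0, log_acc))
    return total / n


def equalised_schedule(n_intervals, rng, grid_size=40):
    """Temperatures with (approximately) equal cumulated rejection between them."""
    grid = [k / grid_size for k in range(grid_size + 1)]
    cum = [0.0]
    for lo, hi in zip(grid, grid[1:]):
        cum.append(cum[-1] + rejection_rate(lo, hi, rng))
    schedule = [0.0]
    for k in range(1, n_intervals):
        target = cum[-1] * k / n_intervals
        j = next(i for i in range(1, len(cum)) if cum[i] >= target)
        frac = (target - cum[j - 1]) / (cum[j] - cum[j - 1])
        schedule.append(grid[j - 1] + frac * (grid[j] - grid[j - 1]))
    schedule.append(1.0)
    return schedule


if __name__ == "__main__":
    rng = random.Random(0)
    uniform_rates = [rejection_rate(k / 5, (k + 1) / 5, rng) for k in range(5)]
    print("rejection rates, uniform schedule:", uniform_rates)
    print("equalised schedule:", equalised_schedule(5, rng))
```

The summed rejection over the fine grid only serves as a crude stand-in for the global communication barrier, and the linear interpolation is one simple way of inverting it.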

dynamic nested sampling for stars

Posted in Books, pictures, Statistics, Travel on April 12, 2019 by xi'an

Following earlier nested sampling packages, like MultiNest, Joshua Speagle has written a new package called dynesty that manages dynamic nested sampling, primarily intended for astronomical applications. Which is the field where nested sampling is the most popular. One of the first remarks in the paper is that nested sampling can be more easily implemented by using a Uniform reparameterisation of the prior, that is, a reparameterisation that turns the prior into a Uniform over the unit hypercube. Which means in fine that the prior distribution can be generated from a fixed vector of uniforms and known transforms. Maybe not such an issue given that this is the prior after all. The author considers this makes sampling under the likelihood constraint a much simpler problem but it all depends in the end on the concentration of the likelihood within the unit hypercube. And on the ability to reach the higher likelihood slices. I did not see any special trick when looking at the documentation, but reflected on the fundamental connection between nested sampling and this ability. As in the original proposal by John Skilling (2006), the slice volumes are “estimated” by simulated Beta order statistics, with no connection with the actual sequence of simulation or the problem at hand. We did point out our incomprehension for such a scheme in our Biometrika paper with Nicolas Chopin. As in earlier versions, the algorithm attempts to visualise the slices by different bounding techniques, before proceeding to explore the bounded regions by several exploration algorithms, including HMC.
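The two ingredients mentioned above can be sketched in a few lines, under assumptions of my own: a two-parameter prior (a Normal mean and an Exponential scale) mapped from the unit square by inverse cdfs, and volume shrinkage factors simulated as Beta(n, 1) variates. The function names are illustrative and nothing below calls dynesty itself.

```python
import math
import random
from statistics import NormalDist


def prior_transform(u):
    """Map u in the open unit square to (mu, sigma).

    Assumed prior, for illustration only: mu ~ N(0, 10^2), sigma ~ Exp(1).
    Each coordinate goes through the inverse cdf of its prior.
    """
    mu = NormalDist(0.0, 10.0).inv_cdf(u[0])
    sigma = -math.log(1.0 - u[1])
    return [mu, sigma]


def simulated_log_volumes(n_live, n_iter, rng):
    """Log prior volumes obtained by multiplying Beta(n_live, 1) shrinkage factors.

    The factors are simulated independently of the likelihood and of the
    actual live points, which is the feature questioned in the text.
    """
    log_x, out = 0.0, []
    for _ in range(n_iter):
        log_x += math.log(rng.betavariate(n_live, 1))
        out.append(log_x)
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    print("prior draw:", prior_transform([rng.random(), rng.random()]))
    log_x = simulated_log_volumes(100, 1000, rng)
    print("log volume after 1000 iterations:", log_x[-1])
```

Since the expectation of the log of a Beta(n, 1) variate is -1/n, the simulated log volume after 1000 iterations with 100 live points fluctuates around -10, whatever the likelihood.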

“As with any sampling method, we strongly advocate that Nested Sampling should not be viewed as being strictly “better” or “worse” than MCMC, but rather as a tool that can be more or less useful in certain problems. There is no “One True Method to Rule Them All”, even though it can be tempting to look for one.”

When introducing the dynamic version, the author lists three drawbacks for the static (original) version. One is the reliance on this transform of a Uniform vector over a hypercube. Another one is that the overall runtime is highly sensitive to the choice of the prior. (If simulating from the prior rather than an importance function, as suggested in our paper.) A third one is the issue that nested sampling is impervious to the final goal, evidence approximation versus posterior simulation, i.e., uses a constant rate of prior integration. The dynamic version simply modifies the number of points simulated in each slice, according to the (relative) increase in evidence provided by the current slice, estimated through iterations. This makes nested sampling a sort of inverted Wang-Landau since it sharpens the difference between slices. (The dynamic aspects for estimating the volumes of the slices and the stopping rule may hinder convergence in unclear ways, which is not discussed by the paper.) Among the many examples produced in the paper is a 200-dimensional Normal target, which is an interesting object for posterior simulation in that most of the posterior mass rests on a ring away from the maximum of the likelihood. But it does not seem to merit a mention in the discussion. Another example of heterogeneous regression favourably compares dynesty with MCMC in terms of ESS (but fails to include an HMC version).
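The ring phenomenon is easy to check by simulation. The sketch below assumes a standard Normal target in dimension 200 (the target in the paper may be scaled differently) and simply looks at the distance of exact draws from the mode.

```python
import math
import random


def gaussian_norms(dim=200, n_samples=1000, seed=0):
    """Euclidean norms of exact draws from a standard Normal in dimension dim."""
    rng = random.Random(seed)
    return [
        math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dim)))
        for _ in range(n_samples)
    ]


def log_density_gap(radius):
    """Log density at the mode minus log density at any point of norm radius."""
    return 0.5 * radius ** 2


if __name__ == "__main__":
    norms = gaussian_norms()
    mean_norm = sum(norms) / len(norms)
    print("smallest norm:", min(norms))
    print("average norm:", mean_norm)
    print("log density gap at the average norm:", log_density_gap(mean_norm))
```

The norms concentrate around the square root of the dimension, about 14 here, so that typical draws sit roughly 100 log units below the mode in density, and a sampler climbing towards the maximum of the likelihood moves away from where the mass is.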

[Breaking News: Although I wrote this post before the exciting first image of the black hole in M87 was made public and hence before I was aware of it, the associated ApJL paper mentions relying on dynesty for comparing several physical models of the phenomenon by nested sampling.]