Archive for point processes

lords of the rings

Posted in Books, pictures, Statistics, University life on February 9, 2017 by xi'an

In the 19 Jan 2017 issue of Nature [that I received two weeks later], a paper by Tarnita et al. discusses regular vegetation patterns like fairy circles. While this would seem like an ideal setting for point process modelling, the article does not seem to go in that direction, debating instead between ecological models, combining vegetal self-organisation with subterranean insect competition. Since the paper seems to derive validation of a model by simulation means without producing a single equation, I went and checked the supplementary material attached to this paper. What I gathered from this material is that the system of differential equations used to build this model seems to be extrapolated by seeking parameter values “consistent with what is known” rather than estimated as in a statistical model. Given the extreme complexity of the resulting five-page model, I am surprised at the low level of validation of the construct, with no visible proof of stationarity of the (stochastic) model thus constructed, and no model assessment in a statistical sense. Of course, a major disclaimer applies: (a) this area does not even border my domains of (relative) expertise and (b) I have not spent much time perusing the published paper and the attached supplementary material. (Note: This issue of Nature also contains a fascinating review paper by Nielsen et al. on a detailed scenario of human evolutionary history, based on the sequencing of genomes of extinct hominids.)

ABC for repulsive point processes

Posted in Books, pictures, Statistics, University life on May 5, 2016 by xi'an

Shinichiro Shirota and Alan Gelfand arXived a paper on the use of ABC for analysing some repulsive point processes, more exactly the Gibbs point processes, for which ABC requires a perfect sampler to operate, unless one is okay with stopping an MCMC chain before it converges, and the determinantal point processes studied by Lavancier et al. (2015) [a paper I wanted to review and could not find time to!]. Determinantal point processes have an intensity function that is the determinant of a covariance kernel, hence repulsive. Simulation of a determinantal process itself is not straightforward and involves approximations. But the likelihood itself is unavailable and Lavancier et al. (2015) use approximate versions by fast Fourier transforms, which means MCMC is challenging even with those approximate steps.

“The main computational cost of our algorithm is simulation of x for each iteration of the ABC-MCMC.”

The authors propose here to use ABC instead. With an extra approximate step for simulating the determinantal process itself. Interestingly, the Gibbs point process allows for a sufficient statistic, the number of R-closed points, although I fail to see how the radius R is determined by the model, while the determinantal process does not. The summary statistics end up being a collection of frequencies within various spheres of different radii. However, these statistics are then processed by Fearnhead’s and Prangle’s proposal, namely to come up with an approximation of E[θ|y] as the natural summary. Obtained by regression over the original summaries. Another layer of complexity stems from using an ABC-MCMC approach. And including a Lasso step in the regression towards excluding less relevant radii. The paper also considers Bayesian model validation for such point processes, implementing prior predictive tests with a ranked probability score, rather than a Bayes factor.
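To make the radius-based summaries more concrete, here is a minimal Python sketch that counts, for each radius in a grid, the number of pairs of points closer than that radius. This is only an illustration under my own assumptions: the function name `close_pair_counts`, the planar setting and the absence of any edge correction are mine and not taken from the paper, whose actual summaries may be normalised differently.

```python
import math
from itertools import combinations

def close_pair_counts(points, radii):
    """For each radius r, count the unordered pairs of points at distance <= r.

    points: list of (x, y) tuples; radii: list of positive radii.
    No edge correction is applied (a simplifying assumption).
    """
    counts = [0] * len(radii)
    for (x1, y1), (x2, y2) in combinations(points, 2):
        d = math.hypot(x1 - x2, y1 - y2)
        for i, r in enumerate(radii):
            if d <= r:
                counts[i] += 1
    return counts

# Hypothetical toy pattern: two nearby points and one isolated point.
pattern = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0)]
print(close_pair_counts(pattern, [0.2, 2.0]))  # [1, 3]
```

In an ABC run, such a vector would be computed on both the observed pattern and each simulated pattern, before being fed to the regression step mentioned above.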

As point processes have always been somewhat mysterious to me, I do not have any intuition about the strength of the distributional assumptions there and the relevance of picking a determinantal process against, say, a Strauss process. The model comparisons operated in the paper do not strongly support one repulsive model over the others, with the authors concluding that many points are needed to discriminate between models. I also wonder about the possibility of including summaries other than Ripley’s K-functions, which somewhat imply a discretisation of the space, by concentric rings. Maybe using other point processes for deriving summary statistics as MLEs or Bayes estimators for those models would help. (Or maybe not.)

never mind the big data here’s the big models [workshop]

Posted in Kids, pictures, Statistics, Travel, University life on December 22, 2015 by xi'an

Maybe the last occurrence this year of the pastiche of the iconic LP of the Sex Pistols, made by Tamara Polajnar. The last workshop as well of the big data year in Warwick, organised by the Warwick Data Science Institute. I appreciated the different talks this afternoon, but enjoyed particularly Dan Simpson’s and Rob Scheichl’s. The presentation by Dan was so hilarious that I could not resist asking him for permission to post the slides here:

Not only hilarious [and I have certainly missed 67% of the jokes], but quite deep about the meaning(s) of modelling and his views about getting around the most blatant issues. Rob presented a more computational talk on the ways to reach petaflops on current supercomputers, in connection with weather prediction models used (or soon to be used) by the Met Office. For a prediction area of 1 km². Along with significant improvements resulting from multiscale Monte Carlo and quasi-Monte Carlo. Definitely impressive! And a brilliant conclusion to the Year of Big Data (and big models).

point process-based Monte Carlo

Posted in Books, Kids, Statistics, University life on December 3, 2015 by xi'an

Clément Walter from Paris just pointed me to an arXived paper he had very recently gotten accepted for publication in Statistics and Computing. (Congrats!) Because his paper relates to nested sampling. And connects it with rare event simulation via interacting particle systems. And multilevel Monte Carlo. I had missed it when it came out on arXiv last December [as the title was unrelated to nested sampling, if not to Monte Carlo], but the paper brings fairly interesting new results about an ideal version of nested sampling that is

  1. unbiased when using an infinite number of terms;
  2. always better than the standard Monte Carlo estimator, variance-wise;
  3. connected with an implicit marked Poisson process; and
  4. enjoying a finite variance provided the quantity of interest has a 1+ε moment.

Of course, such results only hold for an ideal version and do not address the issue of the conditional simulations required by nested sampling. (Which has an impact on the computing time as the conditional simulation becomes more and more expensive as the likelihood value increases.) The explanation therein of the approximation of tail probabilities by a Poisson estimate makes the link with deterministic nested sampling much clearer to me. Point 2 above means that the nested sampling estimate always does better than the average of the likelihood values produced by an iid or MCMC simulation from the prior distribution. The paper also borrows from the debiasing approach of Rhee and Glynn (already used by the Russian roulette) to turn truncated versions of the nested sampling estimator into an unbiased estimator, with a limited impact on the variance of the estimator. Truncation is associated with the generation of a geometric stopping time whose parameter needs to be optimised. Without a more detailed reading, I am somewhat lost as to whether this optimisation remains feasible in complex settings… The paper contains an illustration for a Pareto distribution where optimisation and calibration can be conducted quite far. It also re-analyses the Mexican hat example of Skilling (2006), showing that our stopping rule may induce bias.
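The debiasing step can be illustrated outside nested sampling altogether. Below is a minimal Python sketch of the Rhee and Glynn randomised truncation applied to a toy convergent series, not to the estimator of the paper: the series `toy_term`, the continuation probability `q` and all function names are assumptions made for the sake of the illustration.

```python
import random

def geometric_stopping_time(q, rng):
    """Draw N with P(N >= k) = q**k for k = 0, 1, 2, ..."""
    n = 0
    while rng.random() < q:
        n += 1
    return n

def debiased_sum(term, q, rng):
    """Single-draw unbiased estimate of sum_{k>=0} term(k) by random truncation."""
    n = geometric_stopping_time(q, rng)
    # Each retained term is reweighted by the inverse probability of being retained.
    return sum(term(k) / q**k for k in range(n + 1))

def toy_term(k):
    """Toy series (assumed for illustration): sum_k 2**-(k+1) = 1."""
    return 0.5 ** (k + 1)

def average_estimate(q=0.7, replications=20000, seed=1):
    """Average of independent debiased draws, which should be close to 1."""
    rng = random.Random(seed)
    draws = [debiased_sum(toy_term, q, rng) for _ in range(replications)]
    return sum(draws) / replications

print(average_estimate())
```

Each draw evaluates on average 1/(1-q) terms, so a larger q costs more per draw while shrinking the inverse-probability weights. This trade-off is, in this toy setting, the counterpart of the optimisation of the geometric parameter mentioned above.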

postdoc in the Alps

Posted in Kids, Mountains, Statistics, Travel, University life on May 22, 2015 by xi'an

Post-doctoral Position in Spatial/Computational Statistics (Grenoble, France)

A post-doctoral position is available in Grenoble, France, to work on computational methods for spatial point process models. The candidate will work with Simon Barthelmé (GIPSA-lab, CNRS) and Jean-François Coeurjolly (Univ. Grenoble Alpes, Laboratory Jean Kuntzmann) on extending point process methodology to deal with large datasets involving multiple sources of variation. We will focus on eye movement data, a new and exciting application area for spatial statistics. The work will take place in the context of an interdisciplinary project on eye movement modelling involving psychologists, statisticians and applied mathematicians from three different institutes in Grenoble.

The ideal candidate has a background in spatial or computational statistics or machine learning. Knowledge of R (and in particular the package spatstat) and previous experience with point process models is a definite plus.

The duration of the contract is 12+6 months, starting 01.10.2015 at the earliest. Salary is according to standard CNRS scale (roughly EUR 2k/month).

Grenoble is the largest city in the French Alps, with a very strong science and technology cluster. It is a pleasant place to live, in an exceptional mountain environment.

Latent Gaussian Models im Zürich [day 2]

Posted in pictures, R, Statistics, Travel, University life on February 7, 2011 by xi'an

The second day at the Latent Gaussian Models workshop in Zürich was equally interesting. Among the morning talks, let me mention Daniel Bové who gave a talk connected with the hyper-g prior paper he wrote with Leo Held (commented in an earlier post) and the duo of Janine Illian and Daniel Simpson who gave enthusiastic arguments as to why point pattern datasets should be analysed in a completely novel way, using stochastic PDEs. And showed us how this could be done via INLA. This perspective (purposely?) contrasted with the modelling assumptions of Alan Gelfand who concluded the meeting with a highly interesting modelling/estimation of species distribution in the Cape area. He also ran a comparison with the Maxent approach to the same problem. As for my own talk, I somehow spent too much time on the introduction to ABC, trying to link the method with non-parametric perspectives, and so ended up rushing through the sufficiency part and the population genetic results obtained by Jean-Marie Cornuet and Jean-Michel Marin the previous day. (The updated slides are available on slideshare.) I hope the main message was still spelled out clearly enough… In conclusion, this was a very interesting workshop, maybe the first of a kind since there is a possible follow-up next year in Trondheim. It showed the clear emergence of a very active INLA community, able to tackle old and new problems using this new technology, and illustrated once again the importance of developing user-friendly codes for promoting such technologies.

Seminar im Heidelberg

Posted in Books, pictures, Running, Statistics, Travel on April 18, 2010 by xi'an

Even though I can only report today, because of unexpected family issues (and not at all because of the Icelandic volcano ashes!), I have had a wonderful trip to Heidelberg! Both seminars were packed with students, many more than faculty. I met a PhD student working on ABC who travelled all the way from Bonn to attend the seminars and discuss with me, I had highly informative statistics discussions with the local faculty (about ABC, point processes, particle filters, graphical models, stochastic volatility, covariance estimation, and more), plus great runs up the local hills, a few glimpses of the old city, and good local beer and food! Had the internet connection worked (better), I would have been a bit more efficient, but this led me to take the philosophy way for a few days and to progress in my evaluation of Evidence and Evolution. Thanks to Tilmann Gneiting for the invitation and for the warm welcome!