Archive for point processes

lords of the rings

Posted in Books, pictures, Statistics, University life on February 9, 2017 by xi'an

In the 19 Jan 2017 issue of Nature [that I received two weeks later], a paper by Tarnita et al. discusses regular vegetation patterns like fairy circles. While this would seem like an ideal setting for point process modelling, the article does not seem to go in that direction, debating instead between ecological models and combining vegetal self-organisation with subterranean insect competition. Since the paper seems to derive validation of a model by simulation means without producing a single equation, I went and checked the supplementary material attached to this paper. What I gathered from this material is that the system of differential equations used to build this model seems to be extrapolated by seeking parameter values consistent with what is known rather than estimated as in a statistical model. Given the extreme complexity of the resulting five-page model, I am surprised at the low level of validation of the construct, with no visible proof of stationarity of the (stochastic) model thus constructed, and no model assessment in a statistical sense. Of course, a major disclaimer applies: (a) this area does not even border my domains of (relative) expertise and (b) I have not spent much time perusing the published paper and the attached supplementary material. (Note: This issue of Nature also contains a fascinating review paper by Nielsen et al. on a detailed scenario of human evolutionary history, based on the sequencing of genomes of extinct hominids.)

ABC for repulsive point processes

Posted in Books, pictures, Statistics, University life on May 5, 2016 by xi'an

Shinichiro Shirota and Alan Gelfand arXived a paper on the use of ABC for analysing some repulsive point processes, more exactly the Gibbs point processes, for which ABC requires a perfect sampler to operate, unless one is okay with stopping an MCMC chain before it converges, and the determinantal point processes studied by Lavancier et al. (2015) [a paper I wanted to review and could not find time to!]. Determinantal point processes have joint intensities given by determinants of a covariance kernel, hence are repulsive. Simulation of a determinantal process itself is not straightforward and involves approximations. On top of this, the likelihood itself is unavailable and Lavancier et al. (2015) use approximate versions by fast Fourier transforms, which means MCMC is challenging even with those approximate steps.
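To spell out the "hence repulsive" step, here is the standard definition written out, with the notation (a covariance kernel C, joint intensities ρ⁽ⁿ⁾) being my own choice rather than the paper's. The order-two case shows that the pair intensity can never exceed the product of the marginal intensities, which is what repulsion means here:

```latex
\rho^{(n)}(x_1,\dots,x_n) = \det\big[\,C(x_i,x_j)\,\big]_{1\le i,j\le n},
\qquad \rho(x) = \rho^{(1)}(x) = C(x,x),
\\[4pt]
\rho^{(2)}(x,y) = C(x,x)\,C(y,y) - |C(x,y)|^{2} \;\le\; \rho(x)\,\rho(y).
```

A Poisson process would give equality in the last line, so the term |C(x,y)|² measures how strongly points at x and y avoid one another.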

“The main computational cost of our algorithm is simulation of x for each iteration of the ABC-MCMC.”

The authors propose here to use ABC instead. With an extra approximative step for simulating the determinantal process itself. Interestingly, the Gibbs point process allows for a sufficient statistic, the number of R-close pairs of points, although I fail to see how the radius R is determined by the model, while the determinantal process does not. The summary statistics end up being a collection of frequencies within various spheres of different radii. However, these statistics are then processed by Fearnhead’s and Prangle’s proposal, namely to come up with an approximation of E[θ|y] as the natural summary. Obtained by regression over the original summaries. Another layer of complexity stems from using an ABC-MCMC approach. And including a Lasso step in the regression towards excluding less relevant radii. The paper also considers Bayesian model validation for such point processes, implementing prior predictive tests with a ranked probability score, rather than a Bayes factor.
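For reference, and assuming the Gibbs process in question is the Strauss process (the textbook example, not necessarily the exact parametrisation used in the paper), the density with respect to a unit-rate Poisson process can be written as follows, with n(x) the number of points and s_R(x) the number of pairs closer than R:

```latex
f(x \mid \beta, \gamma) \;\propto\; \beta^{\,n(x)}\, \gamma^{\,s_R(x)},
\qquad
s_R(x) = \sum_{i<j} \mathbb{I}\{\lVert x_i - x_j \rVert \le R\},
\qquad \beta > 0,\; 0 \le \gamma \le 1 .
```

For a fixed R, the pair (n(x), s_R(x)) is sufficient for (β, γ). The radius R enters through the indicator, not as an exponential family parameter, so sufficiency does not extend to R.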

As point processes have always been somewhat mysterious to me, I do not have any intuition about the strength of the distributional assumptions there and the relevance of picking a determinantal process against, say, a Strauss process. The model comparisons operated in the paper do not strongly support one repulsive model versus the others, with the authors concluding that many points are needed to discriminate between models. I also wonder at the possibility of including other summaries than Ripley’s K-functions, which somewhat imply a discretisation of the space, by concentric rings. Maybe using other point processes for deriving summary statistics as MLEs or Bayes estimators for those models would help. (Or maybe not.)
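To make the K-function summaries concrete, here is a minimal sketch of the naive estimator evaluated at a few radii. It assumes a pattern observed on the unit square and applies no edge correction, so it is not the estimator used in the paper; the function name `ripley_k` and the grid pattern are hypothetical choices for illustration.

```python
import math


def ripley_k(points, r, area=1.0):
    """Naive estimate of Ripley's K at radius r (no edge correction)."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    # count ordered pairs (i, j), i != j, lying within distance r
    close = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= r:
                close += 2
    return area * close / (n * (n - 1))


# hypothetical, perfectly regular pattern: a 10 x 10 grid with spacing 0.1
grid = [(i / 10, j / 10) for i in range(10) for j in range(10)]

# one summary per radius, i.e. the kind of vector that could be fed to ABC
radii = [0.05, 0.11, 0.15]
summaries = [ripley_k(grid, r) for r in radii]
```

On such a regular pattern the estimate is exactly zero below the grid spacing, which is the signature of repulsion the summaries are meant to pick up, while a Poisson pattern would give values near πr².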

never mind the big data here’s the big models [workshop]

Posted in Kids, pictures, Statistics, Travel, University life on December 22, 2015 by xi'an

Maybe the last occurrence this year of the pastiche of the iconic LP of the Sex Pistols, made by Tamara Polajnar. The last workshop as well of the big data year in Warwick, organised by the Warwick Data Science Institute. I appreciated the different talks this afternoon, but enjoyed particularly Dan Simpson’s and Rob Scheichl’s. The presentation by Dan was so hilarious that I could not resist asking him for permission to post the slides here:

Not only hilarious [and I have certainly missed 67% of the jokes], but quite deep about the meaning(s) of modelling and his views about getting around the most blatant issues. Rob presented a more computational talk on the ways to reach petaflops on current supercomputers, in connection with weather prediction models used (or soon to be used) by the Met Office. For a prediction area of 1 km². Along with significant improvements resulting from multiscale Monte Carlo and quasi-Monte Carlo. Definitely impressive! And a brilliant conclusion to the Year of Big Data (and big models).

point process-based Monte Carlo

Posted in Books, Kids, Statistics, University life on December 3, 2015 by xi'an

Clément Walter from Paris just pointed me to an arXived paper he had very recently gotten accepted for publication in Statistics and Computing. (Congrats!) Because his paper relates to nested sampling. And connects it with rare event simulation via interacting particle systems. And multilevel Monte Carlo. I had missed it when it came out on arXiv last December [as the title was unrelated to nested sampling, if not to Monte Carlo], but the paper brings fairly interesting new results about an ideal version of nested sampling that is

  1. unbiased when using an infinite number of terms;
  2. always better than the standard Monte Carlo estimator, variance-wise;
  3. connected with an implicit marked Poisson process; and
  4. enjoying a finite variance provided the quantity of interest has a 1+ε moment.

Of course, such results only hold for an ideal version and do not address the issue of the conditional simulations required by nested sampling. (Which has an impact on the computing time as the conditional simulation becomes more and more expensive as the likelihood value increases.) The explanation therein of the approximation of tail probabilities by a Poisson estimate makes the link with deterministic nested sampling much clearer to me. Point 2 above means that the nested sampling estimate always does better than the average of the likelihood values produced by an iid or MCMC simulation from the prior distribution. The paper also borrows from the debiasing approach of Rhee and Glynn (already used by the Russian roulette) to turn truncated versions of the nested sampling estimator into an unbiased estimator, with a limited impact on the variance of the estimator. Truncation is associated with the generation of a geometric stopping time whose parameter needs to be optimised. Without a more detailed reading, I am somewhat lost as to whether this optimisation remains feasible in complex settings… The paper contains an illustration for a Pareto distribution where optimisation and calibration can be conducted quite far. It also re-analyses the Mexican hat example of Skilling (2006), showing that our stopping rule may induce bias.
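The Poisson estimate of a tail probability can be sketched on a toy case, under the assumption that the conditional simulation is exact, which is precisely the ideal setting. Here X is Exp(1), so that X given X > x is x plus a fresh Exp(1) draw and the target is P(X > q) = exp(-q). The lowest of N particles is repeatedly regenerated above its current value until all particles exceed q; the number of moves M is then Poisson with mean -N log p, and (1 - 1/N)^M is unbiased for p. The function name `tail_probability` and all settings below are hypothetical, not taken from the paper.

```python
import math
import random


def tail_probability(q, n_particles, rng):
    """Estimate P(X > q) for X ~ Exp(1) by moving the lowest particle."""
    # start from an iid sample from the unconstrained distribution
    particles = [rng.expovariate(1.0) for _ in range(n_particles)]
    moves = 0
    while True:
        lowest = min(range(n_particles), key=particles.__getitem__)
        if particles[lowest] > q:
            break  # every particle is now above the threshold
        # ideal conditional simulation: X | X > x is x + Exp(1)
        particles[lowest] += rng.expovariate(1.0)
        moves += 1
    # moves ~ Poisson(-N log p), hence E[(1 - 1/N)^moves] = p
    return (1.0 - 1.0 / n_particles) ** moves


rng = random.Random(2015)
estimates = [tail_probability(3.0, 50, rng) for _ in range(300)]
average = sum(estimates) / len(estimates)
truth = math.exp(-3.0)
```

Averaging the replicates should land close to exp(-3), while a single run already has a relative standard deviation of about sqrt(exp(q/N) - 1), that is, roughly 25% for these settings.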

postdoc in the Alps

Posted in Kids, Mountains, Statistics, Travel, University life on May 22, 2015 by xi'an

Post-doctoral Position in Spatial/Computational Statistics (Grenoble, France)

A post-doctoral position is available in Grenoble, France, to work on computational methods for spatial point process models. The candidate will work with Simon Barthelmé (GIPSA-lab, CNRS) and Jean-François Coeurjolly (Univ. Grenoble Alpes, Laboratory Jean Kuntzmann) on extending point process methodology to deal with large datasets involving multiple sources of variation. We will focus on eye movement data, a new and exciting application area for spatial statistics. The work will take place in the context of an interdisciplinary project on eye movement modelling involving psychologists, statisticians and applied mathematicians from three different institutes in Grenoble.

The ideal candidate has a background in spatial or computational statistics or machine learning. Knowledge of R (and in particular the package spatstat) and previous experience with point process models is a definite plus.

The duration of the contract is 12+6 months, starting 01.10.2015 at the earliest. Salary is according to standard CNRS scale (roughly EUR 2k/month).

Grenoble is the largest city in the French Alps, with a very strong science and technology cluster. It is a pleasant place to live, in an exceptional mountain environment.