Archive for partition

Le Monde puzzle [#1033]

Posted in Books, Kids, R on December 19, 2017 by xi'an

A simple Le Monde mathematical puzzle after two geometric ones I did not consider:

  1. Bob gets a 2×3 card with three integer entries on the first row and two integer entries on the second row such that (i) entry (1,1) is 1, (ii) summing up subsets of adjacent entries produces all integers from 1 to 21. (Adjacent means sharing an index.) Deduce Bob’s voucher.
  2.  Alice gets Bob’s voucher completed into a 2×4 card with further integer entries. What is the largest value of N such that all integers from 1 to N are available through summing up all subsets of entries?

The first question only requires a few attempts, but it can also be solved by brute-force simulation. Here is some R code that leads to the solution:

  sum(sol[c(1,4)]), sum(sol[c(1,5)]),sum(sol[1:3]),
  sum(sol[c(2,4,5)]), sum(sol[c(1,2,3,5)]),sum(sol[2:5]), 

produces (1,8,7,3,2) as the only case for which


The second puzzle means considering the sums over all subsets of entries and checking that every integer from 1 to N is reached. There is no constraint in this second question, hence in principle this could produce N=2⁸-1=255, but I have been unable to exceed 175 through brute-force simulation. (Which entitled me to use the as.logical(intToBits(i)) R command!)
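As an illustration of the as.logical(intToBits(i)) trick, here is a minimal sketch, not the code used for the post: the function name largest_run is hypothetical, and the three extra entries (22, 44, 88) are simply one completion chosen here that happens to reach 175. Each integer i between 1 and 2ⁿ-1 is read as a bit mask selecting a non-empty subset of the card.

```r
# Largest N such that every integer 1..N is the sum of some
# non-empty subset of `card` (no adjacency constraint).
# `largest_run` is a hypothetical helper name, not from the post.
largest_run <- function(card) {
  n <- length(card)
  sums <- numeric(0)
  for (i in 1:(2^n - 1)) {
    # the first n bits of i indicate which entries are selected
    pick <- as.logical(intToBits(i))[1:n]
    sums <- c(sums, sum(card[pick]))
  }
  N <- 0
  while ((N + 1) %in% sums) N <- N + 1
  N
}

largest_run(c(1, 8, 7, 3, 2))              # Bob's five entries: 21
largest_run(c(1, 8, 7, 3, 2, 22, 44, 88))  # one possible completion: 175
```

The completion works by doubling: the first five entries reach every integer up to 21, adding 22 extends this to 43, adding 44 to 87, and adding 88 to 175. Whether another completion does better is not settled by this sketch.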

CRiSM workshop on estimating constants [slides]

Posted in Books, pictures, Statistics, Travel, University life on May 4, 2016 by xi'an

A short announcement that the slides of almost all talks at the CRiSM workshop on estimating constants last April 20-22 are now available. Enjoy (and discuss)!

CRiSM workshop on estimating constants [#2]

Posted in pictures, Statistics, Travel, University life, Wines on March 31, 2016 by xi'an

The schedule for the CRiSM workshop on estimating constants that Nial Friel, Helen Ogden and myself host next April 20-22 at the University of Warwick is now set as follows. (The plain registration fees are £40 and accommodation on the campus is available through the online form.)

April 20, 2016
11:45 — 12:30: Adam Johansen
12:30 — 14:00: Lunch
14:00 — 14:45: Anne-Marie Lyne
14:45 — 15:30: Pierre Jacob
15:30 — 16:00: Break
16:00 — 16:45: Roberto Trotta
17:00 — 18:00: ‘Elevator’ talks
18:00 — 20:00: Poster session, Cheese and wine

April 21, 2016
9:00 — 9:45: Michael Betancourt
9:45 — 10:30: Nicolas Chopin
10:30 — 11:00: Coffee break
11:00 — 11:45: Merrilee Hurn
11:45 — 12:30: Jean-Michel Marin
12:30 — 14:00: Lunch
14:00 — 14:45: Sumit Mukherjee
14:45 — 15:30: Yves Atchadé
15:30 — 16:00: Break
16:00 — 16:45: Michael Gutmann
16:45 — 17:30: Panayiota Touloupou
19:00 — 22:00: Dinner

April 22, 2016
9:00 — 9:45: Chris Sherlock
9:45 — 10:30: Christophe Andrieu
10:30 — 11:00: Coffee break
11:00 — 11:45: Antonietta Mira

CRiSM workshop on estimating constants [#1]

Posted in pictures, Statistics, Travel, University life on January 19, 2016 by xi'an

The registration for the CRiSM workshop on estimating constants that Nial Friel, Helen Ogden and myself host next April 20-22 at the University of Warwick is now open. The plain registration fees are £40 and accommodation on the campus is available through the same form.

Since, besides the invited talks, the workshop will host two poster sessions with speed (2-5mn) oral presentations, we encourage all interested researchers to submit a poster via the appropriate form. Once again, this should be an exciting two-day workshop, given the on-going activity in this area.

on estimating constants…

Posted in pictures, Statistics, Travel, University life on July 21, 2015 by xi'an

While I discussed on the ‘Og in the past the difference I saw between estimating an unknown parameter from a distribution and evaluating a normalising constant, evaluating such constants and hence handling [properly] doubly intractable models is obviously of the utmost importance! For this reason, Nial Friel, Helen Ogden and myself have put together a CRiSM workshop on the topic (with the tongue-in-cheek title of Estimating constants!), to be held at the University of Warwick next April 20-22.

The CRiSM workshop will focus on computational methods for approximating challenging normalising constants found in Monte Carlo, likelihood and Bayesian models. Such methods may be used in a wide range of problems: to compute intractable likelihoods, to find the evidence in Bayesian model selection, and to compute the partition function in Physics. The meeting will bring together different communities working on these related problems, some of which have developed original if little advertised solutions. It will also highlight the novel challenges associated with large data and highly complex models. Besides a dozen invited talks, the schedule will highlight two afternoon poster sessions with speed (2-5mn) oral presentations called ‘Elevator’ talks.

While 2016 is going to be quite busy with all kinds of meetings (MCMSkv, ISBA 2016, the CIRM Statistics month, AISTATS 2016, …), this should be an exciting two-day workshop, given the on-going activity in this area, and I thus suggest that interested readers mark the dates in their diary. I will obviously keep you posted about registration and accommodation when those entries are available.

[more] parallel MCMC

Posted in Books, Mountains on April 3, 2014 by xi'an

Scott Schmidler and his Ph.D. student Douglas VanDerwerken have arXived a paper on parallel MCMC the very day I left for Chamonix, prior to MCMSki IV, so it is no wonder I missed it at the time. This work is somewhat in the spirit of other parallel MCMC papers (Scott et al.’s consensus Bayes, Neiswanger et al.’s embarrassingly parallel MCMC, Wang and Dunson’s Weierstrassed MCMC, and even White et al.’s parallel ABC), namely that the computation of the likelihood can be broken into batches and MCMC run over those batches independently. In their short survey of previous works on parallelization, VanDerwerken and Schmidler overlooked our neat (!) JCGS Rao-Blackwellisation with Pierre Jacob and Murray Smith, maybe because it sounds more like post-processing than genuine parallelization (in that it does not speed up the convergence of the chain but rather improves the Monte Carlo usages one can make of this chain), maybe because they did not know of it.

“This approach has two shortcomings: first, it requires a number of independent simulations, and thus processors, equal to the size of the partition; this may grow exponentially in dim(Θ). Second, the rejection often needed for the restriction doesn’t permit easy evaluation of transition kernel densities, required below. In addition, estimating the relative weights wi with which they should be combined requires care.” (p.3)

The idea of the authors is to replace an exploration of the whole space operated via a single Markov chain (or by parallel chains acting independently which all have to “converge”) with parallel and independent explorations of parts of the space by separate Markov chains. “Small is beautiful”: it takes less time to explore each set of the partition, hence to converge, and, more importantly, each chain can work in parallel to the others. More specifically, given a partition of the space into sets Ai with posterior weights wi, parallel chains are associated with targets equal to the original target restricted to those Ai’s. This is therefore an MCMC version of partitioned sampling. With regard to the shortcomings listed in the quote above, the authors consider that there does not need to be a bijection between the partition sets and the chains, in that a chain can move across partition sets and thus contribute to several integral evaluations simultaneously. I am a bit worried about this argument since it amounts to getting a random number of simulations within each partition set Ai. In my (maybe biased) perception of partitioned sampling, this sounds somewhat counter-productive, as it increases the variance of the overall estimator. (Of course, not restricting a chain to a given partition set Ai has the advantage of avoiding a possibly massive amount of rejection steps. It is however unclear (a) whether or not it impacts ergodicity (it all depends on the way the chain is constructed, i.e. against which target(s)…) as it could lead to an over-representation of some boundaries and (b) whether or not it improves the overall convergence properties of the chain(s).)
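For reference, the decomposition behind partitioned sampling can be written out as follows. The symbols π (the original target), π_i (its restriction to the i-th partition set), h (a generic integrand) and T_i (the number of simulations falling in that set) are notation introduced here for illustration, not taken from the paper; only the sets A_i and the weights w_i appear in the discussion above.

```latex
\pi_i(\theta) = \frac{\pi(\theta)\,\mathbb{I}_{A_i}(\theta)}{w_i},
\qquad
w_i = \int_{A_i} \pi(\theta)\,\mathrm{d}\theta,
\qquad
\mathbb{E}_{\pi}[h(\theta)] = \sum_i w_i\,\mathbb{E}_{\pi_i}[h(\theta)]
\approx \sum_i \hat{w}_i\,\frac{1}{T_i}\sum_{t=1}^{T_i} h\big(\theta_i^{(t)}\big)
```

The identity itself is exact; the approximation on the right requires both an estimate of each weight w_i and a sample from each restricted target, which is where the concerns about weight estimation and random T_i enter.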

“The approach presented here represents a solution to this problem which can completely remove the waiting times for crossing between modes, leaving only the relatively short within-mode equilibration times.” (p.4)

A more delicate issue with the partitioned MCMC approach (in my opinion!) stands with the partitioning. Indeed, in a complex and high-dimensional model, the construction of the appropriate partition is a challenge in itself as we often have no prior idea where the modal areas are. Waiting for a correct exploration of the modes is indeed faster than waiting for crossing between modes, provided all modes are represented and the chain for each partition set Ai has enough energy to explore this set. It actually sounds (slightly?) unlikely that a target with huge gaps between modes will see a considerable improvement from the partitioned version when the partition sets Ai are selected on the go, because some of the boundaries between the partition sets may be hard to reach with an off-the-shelf proposal. (Obviously, the second part of the method on the adaptive construction of partitions is yet in the writing and I am looking forward to its arXival!)

Furthermore, as noted by Pierre Jacob (of Statisfaction fame!), the adaptive construction of the partition has a lot in common with Wang-Landau schemes, whose goal is to produce a flat histogram proposal from the current exploration of the state space. Connections with Atchadé’s and Liu’s (2010, Statistica Sinica) extension of the original Wang-Landau algorithm could have been spelled out, especially as the Voronoï tessellation construct seems quite innovative in this respect.

Le Monde puzzle [#822]

Posted in Books, Kids, R on June 10, 2013 by xi'an

For once Le Monde math puzzle is much more easily solved on a piece of paper than in R, even in a plane from Roma:

Given a partition of the set {1,…,N} into k groups, one considers the collection of all subsets of the set {1,…,N} containing at least one element from each group. Show that the size of the collection cannot be 50.

Obviously, one could consider a range of possible N’s and k’s and run a program evaluating the sizes of the corresponding collections. However, if the k groups are of size n_1,…,n_k, the number of subsets satisfying the condition is

(2^{n_1}-1)\times \ldots \times (2^{n_k}-1)

and it is easily shown by induction that this number is necessarily odd, since each factor 2^{n_i}-1 is odd and a product of odd numbers is odd, hence the impossible 50.
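For readers who would still rather check in R, here is a minimal brute-force sketch, assuming a small N so that all 2^N-1 non-empty subsets can be enumerated. The function name count_hitting and the example partition are hypothetical choices made for illustration. It counts the subsets that contain at least one element of every group and compares the count with the product formula.

```r
# Brute-force count of the subsets of {1,...,N} that contain at least
# one element from each group of a partition.
# `groups[j]` is the group label of element j (hypothetical encoding).
count_hitting <- function(groups) {
  n <- length(groups)
  k <- length(unique(groups))
  total <- 0
  for (i in 1:(2^n - 1)) {
    # the first n bits of i indicate which elements are in the subset
    pick <- as.logical(intToBits(i))[1:n]
    if (length(unique(groups[pick])) == k) total <- total + 1
  }
  total
}

groups <- c(1, 1, 2, 2, 2, 3)      # N = 6, group sizes 2, 3 and 1
sizes <- as.vector(table(groups))
count_hitting(groups)              # brute-force count
prod(2^sizes - 1)                  # closed form (2^2-1)(2^3-1)(2^1-1)
```

Both values agree on this example and are odd, as the argument above predicts for any partition.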