## the Flatland paradox

Pierre Druilhet arXived a note a few days ago about the Flatland paradox (due to Stone, 1976), with his arguments against the flat prior. The paradox, set in a highly artificial setting, goes as follows: consider a sequence θ of N independent draws from {a,b,1/a,1/b} such that

- N and θ are unknown;
- whenever a draw is immediately followed by its inverse, both the draw and this inverse are removed from θ;
- the successor *x* of θ is observed, meaning an extra draw is made and the above rule applied.

Then the frequentist probability that *x* is longer than θ given θ is at least 3/4 (*at least* because θ could be empty, in which case *x* is necessarily longer), while the posterior probability that *x* is longer than θ given *x* is 1/4 under the flat prior over θ. The paradox is that 3/4 and 1/4 clash. Not so much of a paradox, though, because under the improper flat prior there is no joint probability distribution over (*x*,θ).
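For readers who, like one of the commenters below, find that simulations help, here is a minimal Python sketch of the setting. It is only an illustration under my own conventions: the letters `A` and `B` stand for 1/a and 1/b, and the function names `extend`, `candidates` and `simulate` are made up for the occasion. It estimates the frequentist probability that *x* is longer than θ and lists the four values of θ compatible with a given non-empty *x*, each with likelihood 1/4.

```python
import random

# Assumed encoding: "A" and "B" stand for 1/a and 1/b.
INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b"}
LETTERS = "aAbB"


def extend(word, letter):
    """Add one draw to a reduced word, cancelling it against the last letter if they are inverses."""
    if word and word[-1] == INVERSE[letter]:
        return word[:-1]
    return word + letter


def candidates(x):
    """The four values of theta from which a non-empty reduced x can be reached in one draw."""
    shorter = [x[:-1]]  # theta is x minus its last letter
    longer = [x + c for c in LETTERS if c != INVERSE[x[-1]]]  # last draw cancelled c
    return shorter + longer


def simulate(n_draws, n_rep, seed=0):
    """Frequency of the event 'x is longer than theta' when theta results from n_draws draws."""
    rng = random.Random(seed)
    count = 0
    for _ in range(n_rep):
        theta = ""
        for _ in range(n_draws):
            theta = extend(theta, rng.choice(LETTERS))
        x = extend(theta, rng.choice(LETTERS))
        count += len(x) > len(theta)
    return count / n_rep


if __name__ == "__main__":
    print(simulate(n_draws=10, n_rep=20000))  # close to 3/4, slightly above
    print(candidates("ab"))  # one shorter theta, three longer ones
```

Since each of the four candidates produces *x* with probability 1/4, a flat prior on θ returns a uniform posterior over them, and only one of the four is shorter than *x*, hence the 1/4.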

The paradox was actually discussed at length in Larry Wasserman’s now defunct Normal Deviate, from which I borrowed Larry’s graphical representation of the four possible values of θ given the (green) endpoint of *x*. Larry uses the Flatland paradox hammer to drive another nail into the coffin he contemplates for improper priors. And all things Bayes. Pierre (like others before him) argues against the flat prior on θ and shows that a flat prior on the length of θ recovers 3/4 as the posterior probability that *x* is longer than θ.

As I was reading the paper in the métro yesterday morning, I became less and less satisfied with the whole analysis of the problem in that I could not perceive θ as a *parameter* of the model. While this may sound like a pedantic distinction, θ is a *latent variable* (or a *random effect*) associated with *x* in a model where the only unknown parameter is N, the total number of draws used to produce θ and *x*. The distributions of both θ and *x* are entirely determined by N. (In that sense, the Flatland paradox can be seen as a marginalisation paradox in that an improper prior on N cannot be interpreted as projecting a prior on θ.) Given N, the distribution of *x* of length *l(x)* is then 1/4^N times the number of ways of picking (N-*l(x)*) annihilation steps among N. Using a prior on N like 1/N, which is improper, then leads to favour the shortest path as well. (After discussing the issue with Pierre Druilhet, I realised he had a similar perspective on the issue. Except that he puts a flat prior on the length *l(x)*.) Looking a wee bit further for references, I also found that Bruce Hill had adopted the same perspective of a prior on N.
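To make the role of N more concrete, here is a rough Python sketch of the resulting posterior computation. It rests on assumptions of mine rather than on anything in Pierre's note: N counts all draws, so that θ is the reduced word after N-1 draws and *x* the one after N draws; the length of the reduced word is tracked as a random walk (up with probability 3/4, down with probability 1/4, forced up from zero) instead of counting annihilation patterns; and the improper prior on N is handled by truncating the sum at a hypothetical `n_max`. The function names are illustrative only.

```python
def length_distributions(n_max):
    """dist[n][k] = probability that the reduced word has length k after n draws."""
    dist = [[1.0]]  # before any draw the word is empty
    for n in range(n_max):
        prev = dist[-1]
        cur = [0.0] * (n + 2)
        for k, p in enumerate(prev):
            if k == 0:
                cur[1] += p  # an empty word can only grow
            else:
                cur[k + 1] += 0.75 * p  # three letters out of four extend the word
                cur[k - 1] += 0.25 * p  # one letter cancels the last one
        dist.append(cur)
    return dist


def n_words(k):
    """Number of reduced words of length k over {a, b, 1/a, 1/b}."""
    return 1 if k == 0 else 4 * 3 ** (k - 1)


def prob_theta_shorter(length_x, prior, n_max=400):
    """Posterior probability that theta is shorter than x, for x of length length_x >= 1,
    under the weight prior(N) on N, with the sum over N truncated at n_max (an approximation)."""
    dist = length_distributions(n_max)

    def p_word(n, k):
        # By symmetry, all reduced words of the same length are equally likely.
        if k < 0 or k > n:
            return 0.0
        return dist[n][k] / n_words(k)

    shorter = longer = 0.0
    for N in range(1, n_max + 1):
        w = prior(N)
        # theta is x minus its last letter, and the last draw is that letter
        shorter += w * p_word(N - 1, length_x - 1) * 0.25
        # theta is one of three one-letter extensions of x, and the last draw cancels it
        longer += w * 3 * p_word(N - 1, length_x + 1) * 0.25
    return shorter / (shorter + longer)


if __name__ == "__main__":
    print(prob_theta_shorter(3, lambda N: 1.0 / N))  # prior 1/N on N
    print(prob_theta_shorter(3, lambda N: 1.0))  # constant weight on N
```

In this sketch, the 1/N weight puts more than half of the posterior mass on the shorter θ, and a constant weight on N gives 3/4 up to truncation error, in line with the frequentist answer.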

June 5, 2015 at 3:40 am

Very interesting. It reminds me of the Monty Hall paradox, because something that seems like it shouldn’t affect a probability (for Monty Hall, opening a curtain, for Flatland, the final state) has a huge effect. In Monty Hall, you should switch because you probably picked wrong at first. In Flatland, you should backtrack to find the treasure because you probably didn’t just backtrack – going, burying the treasure, then backtracking is less likely than just having gone without backtracking. It took me a long time to understand Monty Hall, and a long time to understand this too. In both cases, running simulations helped me understand.

June 5, 2015 at 1:44 pm

Thanks! I feel it is rather different from the Monty Hall paradox in that the issue here is with the prior distribution, rather than adopting an optimal strategy.

June 5, 2015 at 7:52 pm

I see, thanks for clarifying. I guess there are several ways to think of both: Flatland as described at normaldeviate.wordpress.com talks about the best place to look for a treasure, so in that way it is also about optimal strategy. And Monty Hall is about updating a prior belief about where the prize is, so (in my mind at least) they are both about priors and also both about optimal strategies.
