## unusual clouds [jatp]

Posted in pictures, Travel, Wines on July 19, 2018 by xi'an

## subset sampling

Posted in Statistics on July 13, 2018 by xi'an

A paper by Au and Beck (2001) was mentioned during a talk at MCqMC 2018 in Rennes and I checked Probabilistic Engineering Mechanics for details. There is no clear indication that the subset simulation advocated therein is particularly effective. The core idea is to obtain the probability of belonging to a small set $A$ by a cascading formula, namely the product of the probability of belonging to $A^1$, then the conditional probability of belonging to $A^2$ given $A^1$, etc., when the subsets $A^1, A^2, \ldots, A$ constitute a decreasing embedded sequence. The simulation conditional on being in one of the subsets $A^i$ is operated by a random-walk Metropolis-within-Gibbs scheme, with an additional rejection when the value is not in the said subset. (Surprisingly, the authors re-establish the validity of this scheme.) Hence the proposal faces similar issues as nested sampling, except that the nested subsets here are defined quite differently, as they are essentially free, provided they can be easily evaluated. Each of the random walks needs to be scaled, all the harder a task because the proper scale depends on the volume of the corresponding subset. The subsets $A^i$ themselves are rarely defined in a natural manner, except when being tail events. And they need to be calibrated so that the conditional probability of falling into each remains large enough, which is the cost of free choice. The Markov chain on the previous subset $A^i$ can prove useful to build the next subset $A^{i+1}$, but there is no general principle behind this remark. (If any, this is connected with the cross-entropy method.) But otherwise, the past chains are very much wasted, compared with, say, an SMC treatment of the problem. The paper also notices that starting a Markov chain in the set $A^{i+1}$ means there is no burn-in time and hence that the probability estimators are unbiased. (This creates a correlation between successive Markov chains, but I think it could be ignored if the starting point was chosen at random or after a random number of extra steps.)

The authors further point out that the chain may fail to be ergodic if the proposal distribution lacks the energy to move between the different regions of the current subset $A^i$. They suggest using multiple chains with multiple starting points, which alleviates the issue only to some extent, as it ultimately depends on the spread of the starting points, a limitation acknowledged in the paper.
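To make the cascade $P(A)=P(A^1)\prod_i P(A^{i+1}\mid A^i)$ concrete, here is a minimal R sketch of subset simulation on a toy problem that appears neither in the paper nor in this post: a tail probability $P(g(X)\ge b)$ for $X\sim N(0,I_d)$. The function name `subset_sim`, the tuning values `N`, `p0` and `scale`, and the adaptive choice of the intermediate thresholds as empirical quantiles are all assumptions made for illustration, not a transcription of the authors' algorithm.

```r
# Subset simulation sketch for p = P(g(X) >= b), with X ~ N(0, I_d).
# All names and tuning values (N, p0, scale) are illustrative assumptions.
subset_sim <- function(g, d, b, N = 1000, p0 = 0.1, scale = 1, max_levels = 50) {
  x  <- matrix(rnorm(N * d), N, d)   # level 0: plain Monte Carlo sample
  gx <- g(x)
  p  <- 1                            # running product of conditional probabilities
  for (level in seq_len(max_levels)) {
    # intermediate threshold: keep roughly the top p0 fraction of the sample
    thr <- sort(gx, decreasing = TRUE)[ceiling(p0 * length(gx))]
    if (thr >= b) break              # the target event is no longer rare
    keep <- gx >= thr
    p    <- p * mean(keep)           # estimate of P(A^{i+1} | A^i)
    cur  <- x[keep, , drop = FALSE]  # seeds already lie in A^{i+1}: no burn-in
    gcur <- gx[keep]
    nc    <- nrow(cur)
    steps <- ceiling(N / nc)
    x  <- matrix(0, nc * steps, d)
    gx <- numeric(nc * steps)
    for (t in seq_len(steps)) {
      # component-wise random-walk proposal, accepted component by component
      prop  <- cur + scale * matrix(runif(nc * d, -1, 1), nc, d)
      acc   <- matrix(runif(nc * d), nc, d) < dnorm(prop) / dnorm(cur)
      cand  <- ifelse(acc, prop, cur)
      gcand <- g(cand)
      ok    <- gcand >= thr          # extra rejection: stay inside the subset
      cur[ok, ] <- cand[ok, ]
      gcur[ok]  <- gcand[ok]
      rows <- (t - 1) * nc + seq_len(nc)
      x[rows, ] <- cur               # store the current states of all chains
      gx[rows]  <- gcur
    }
  }
  p * mean(gx >= b)
}

# toy usage: g(X) is N(0,1), so the exact answer is pnorm(-4)
set.seed(1)
g   <- function(x) rowSums(x) / sqrt(ncol(x))
est <- subset_sim(g, d = 10, b = 4)
c(estimate = est, exact = pnorm(-4))
```

The scale of the uniform random walk is fixed across levels here, which is precisely the calibration difficulty mentioned above: a scale that suits the first subset may move too boldly in the later, smaller ones.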

## a thread to bin them all [puzzle]

Posted in Books, Kids, R, Travel on July 9, 2018 by xi'an

The most recent riddle on the Riddler consists in finding the shortest sequence of digits (in 0,1,..,9) such that all 10⁴ numbers between 0 (or 0000) and 9,999 can be found as a group of four consecutive digits. This sequence is obviously at least 10⁴+3 digits long, but how long exactly? On my trip to Brittany last weekend, I wrote some R code that first constructs the sequence at random, picking with high preference the next digit among those producing a new four-digit number

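```r
tenz=10^(0:3) # digit weights: the first digit of a window is the units digit
wn2dg=function(dz) 1+sum(dz*tenz) # index in 1..10^4 of a four-digit window

seqz=rep(0,10^4) # number of occurrences of each four-digit word
snak=wndz=sample(0:9,4,rep=TRUE) # sequence so far and current window
seqz[wn2dg(wndz)]=1
while (min(seqz)==0){ # loop until every word has appeared
  wndz[1:3]=wndz[-1];wndz[4]=0 # shift the window by one digit
  # next digit drawn with weight 1 where the counter read at offsets 0:9 is zero, .01 otherwise
  wndz[4]=sample(0:9,1,prob=.01+.99*(seqz[wn2dg(wndz)+0:9]==0))
  snak=c(snak,wndz[4])
  sek=wn2dg(wndz)
  seqz[sek]=seqz[sek]+1}
```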

which usually produces a sequence of more than 75,000 digits. I then went through the sequence to eliminate useless replicas

```r
for (i in sample(4:(length(snak)-5))){
  if (i>length(snak)-5) next # snak shrinks along the loop, skip stale positions
  # digit i is useless when the four words containing it all occur more than once
  if ((seqz[wn2dg(snak[(i-3):i])]>1)
    &&(seqz[wn2dg(snak[(i-2):(i+1)])]>1)
    &&(seqz[wn2dg(snak[(i-1):(i+2)])]>1)
    &&(seqz[wn2dg(snak[i:(i+3)])]>1)){
    # remove the four words that disappear with digit i
    seqz[wn2dg(snak[(i-3):i])]=seqz[wn2dg(snak[(i-3):i])]-1
    seqz[wn2dg(snak[(i-2):(i+1)])]=seqz[wn2dg(snak[(i-2):(i+1)])]-1
    seqz[wn2dg(snak[(i-1):(i+2)])]=seqz[wn2dg(snak[(i-1):(i+2)])]-1
    seqz[wn2dg(snak[i:(i+3)])]=seqz[wn2dg(snak[i:(i+3)])]-1
    snak=snak[-i]
    # add the three words created by the deletion
    seqz[wn2dg(snak[(i-3):i])]=seqz[wn2dg(snak[(i-3):i])]+1
    seqz[wn2dg(snak[(i-2):(i+1)])]=seqz[wn2dg(snak[(i-2):(i+1)])]+1
    seqz[wn2dg(snak[(i-1):(i+2)])]=seqz[wn2dg(snak[(i-1):(i+2)])]+1}}
```

until none is found. A first attempt produced 12,911 terms in the sequence. A second one 12,913. A third one 12,871. Rather consistent figures but not concentrated enough to believe in achieving a true minimum. An overnight run produced 12,779 as the lowest value. Checking the answer the week after, it appears that 10⁴+3 is the correct answer!
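For the record, the 10⁴+3 bound is attained by a de Bruijn sequence: a cyclic arrangement of 10⁴ digits in which every four-digit word appears exactly once, unrolled by repeating its first three digits at the end. The sketch below is neither the Riddler's solution nor the random search above, but a standard recursive construction of such a sequence (concatenating Lyndon words), with `de_bruijn` a name chosen here for illustration.

```r
# Cyclic de Bruijn sequence over the alphabet 0..(k-1) for words of length n.
# The function name and arguments are illustrative choices, not from the post.
de_bruijn <- function(k, n) {
  a   <- integer(k * n)   # working word; a[t + 1] holds 0-based position t
  out <- integer(0)       # digits of the cyclic sequence
  db <- function(t, p) {
    if (t > n) {
      # a[2:(p + 1)] is a Lyndon word when its length p divides n
      if (n %% p == 0) out <<- c(out, a[2:(p + 1)])
    } else {
      a[t + 1] <<- a[t - p + 1]
      db(t + 1, p)
      lo <- a[t - p + 1] + 1
      if (lo <= k - 1) {
        for (j in lo:(k - 1)) {
          a[t + 1] <<- j
          db(t + 1, t)
        }
      }
    }
  }
  db(1, 1)
  out
}

cyc <- de_bruijn(10, 4)      # cyclic version, 10^4 digits
lin <- c(cyc, cyc[1:3])      # linear version: repeat the first three digits
idx <- seq_len(length(lin) - 3)
words <- 1000 * lin[idx] + 100 * lin[idx + 1] + 10 * lin[idx + 2] + lin[idx + 3]
length(lin)                  # 10003
length(unique(words))        # 10000, i.e. every four-digit word appears
```

Since the linear sequence has exactly 10⁴ windows of four digits and all of them differ, no shorter sequence can cover every number between 0000 and 9,999.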

## Rennes street art #2 [jatp]

Posted in Kids, pictures, Travel, University life on July 7, 2018 by xi'an

## MCqMC 2018, Rennes [slides]

Posted in Statistics on July 3, 2018 by xi'an

Here are my slides for the talk I give this morning at MCqMC 2018. Based on slides first written by Changye Wu and on our joint papers. As it happens, I was under the impression I would give a survey on piecewise deterministic Markov processes. But, as it goes (!), my talk takes place after a superb plenary talk by Christophe Andrieu on non-reversibility, where he gave motivations for resorting to non-reversibility and general results for variance reduction, plus a whole session on the topic by Joris Bierkens, Alex Thiéry, Alain Durmus, and Arnak Dalalyan (CREST), which covered the topics in the following slides, only better! This reduces the informative content of my talk to the alternative to the Zig-Zag sampler that Changye proposed, which makes the talk of limited appeal, I am afraid. (There are four other sessions at the same time, fortunately!)

## Rennes street art [jatp]

Posted in Kids, pictures, Travel, University life on July 2, 2018 by xi'an

## yet another opportunity in a summer of Briton conferences, free of charge!

Posted in Statistics on April 10, 2018 by xi'an