**I** came across, over the weekend, this graph and the associated news that the county of Saint-Nazaire, on the southern border of Brittany, had a significantly higher rate of cancers than the rest of the Loire countries. The complete study, written by Solenne Delacour, Anne Cowppli-Bony, and Florence Molinié, is quite cautious about the reasons for this higher rate, even using a Bayesian Poisson-Gamma smoothing (and the R package empbaysmooth) and citing the 1991 paper by Besag, York and Mollié, but the local and national media were quick to blame the local industries for the difference. The graph above is particularly bad in that it accumulates mortality causes that are not mutually exclusive or independent. For instance, the much higher mortality rate due to alcohol is obviously responsible for higher rates in most other entries. It also indicates a sociological pattern that may or may not be due to the type of jobs in the area, but differs from the more rural parts of the Loire countries (which, like Brittany, are already significantly above the national reference for alcohol-related health issues, by 50%), and may not be strongly connected to exposure to chemicals. For instance, the rates of pulmonary cancers are mostly comparable to the national average, if higher than in the rest of the Loire countries, and connect with a high smoking propensity. Lymphomas are not significantly different from the regional reference. The only type of cancer that can be directly attributed to working conditions is mesothelioma, mostly caused by asbestos exposure, asbestos having been used in ship building, a specialty of the area. Among the many possible reasons for the higher mortality of the county, the study mentions a lower exposure to medical testing (connected with the sociological composition of the area), which would indicate the most effective policies for lowering these higher cancer and mortality rates.
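The Poisson-Gamma smoothing mentioned above can be sketched in a few lines of R. This is a minimal moment-matching version run on simulated (hypothetical) counts, not the study's data: observed counts O follow a Poisson with mean E·r, the relative risks r get a Gamma(ν,α) prior fitted by method of moments, and the smoothed risk is the posterior mean (O+ν)/(E+α).

```r
# Minimal empirical Bayes Poisson-Gamma smoothing on simulated counts
# (hypothetical data, for illustration only)
set.seed(1)
E <- runif(50, 20, 200)                   # expected counts per area
O <- rpois(50, E * rgamma(50, 10, 10))    # observed counts, true risks near 1
raw <- O / E                              # raw standardised incidence ratios
# method-of-moments fit of the Gamma(nu, alpha) prior on the risks
m <- sum(O) / sum(E)                      # overall mean risk
v <- mean((raw - m)^2) - m * mean(1 / E)  # excess (extra-Poisson) variance
alpha <- m / max(v, 1e-8)
nu <- m * alpha
smoothed <- (O + nu) / (E + alpha)        # shrunk towards the global mean m
```

The shrinkage weight E/(E+α) grows with the expected count, so small areas (where the raw ratio is noisiest) are pulled hardest towards the global mean.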

## Archive for Brittany

## poor statistics

Posted in Books, pictures, R, Statistics, Travel, Wines with tags alcoholism, Bretagne, Brittany, cancer, drugs, empbaysmooth, epidemiology, Julian Besag, Loire, Loire countries, R, R package, Saint-Nazaire on September 24, 2019 by xi'an

## indecent exposure

Posted in Statistics with tags ABC, Bayesian optimisation, Bretagne, Brittany, exponential families, image analysis, image processing, inference, Lugano, maximum likelihood estimation, MCqMC 2018, pre-processing, Rennes on July 27, 2018 by xi'an

**W**hile attending my last session at MCqMC 2018 in Rennes, before taking a train back to Paris, I was confronted with this radical opinion on our previous work with Matt Moores (Warwick) and other coauthors from QUT, when the speaker, Maksym Byshkin from Lugano, defended a new approach to maximum likelihood estimation using novel MCMC methods. The approach is based on the fixed-point equation characterising maximum likelihood estimators for exponential families, namely that theoretical and empirical moments of the natural statistic are equal. Using a Markov chain with the said exponential family as stationary distribution, the fixed-point equation can be turned into a zero-divergence equation, requiring the simulation of pseudo-data from the model, which depends on the unknown parameter. Breaking this circular argument, the authors note that simulating pseudo-data that reproduce the observed value of the sufficient statistic is enough. This is related to Geyer and Thompson's (1992) famous paper on Monte Carlo maximum likelihood estimation. From there I was and remain lost, as I cannot see why a derivative of the expected divergence with respect to the parameter θ can be computed when this divergence is found by Monte Carlo rather than by exhaustive enumeration, and then used in a stochastic gradient move on the parameter θ, especially when a null divergence is imposed on the parameter. In any case, the final slide showed an application to a large image and an Ising model, solving the problem (?) in 140 seconds and suggesting indecency, when our much slower approach is intended to produce a complete posterior simulation in this context.
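The moment-matching characterisation can be illustrated by a toy R sketch, with a Poisson model standing in for the exponential family (not the Ising model of the talk, and with my own hypothetical names throughout): pseudo-data are simulated at the current parameter value, and a Robbins-Monro step moves the natural parameter θ towards equality of theoretical and empirical moments.

```r
# Toy Monte Carlo maximum likelihood by moment matching: for a Poisson
# in natural parameterisation (theta = log lambda), the MLE solves
# E_theta[X] = mean(x_obs); the expectation is replaced by one simulation.
set.seed(42)
x_obs <- rpois(100, lambda = 3.7)         # observed sample
t_obs <- mean(x_obs)                      # empirical moment of the statistic
theta <- 0                                # starting value, lambda = 1
for (t in 1:1e5) {
  x_sim <- rpois(1, lambda = exp(theta))  # pseudo-data from current model
  theta <- theta + (t_obs - x_sim) / (100 + t)  # stochastic gradient step
}
exp(theta)                                # converges to mean(x_obs), the MLE
```

The fixed point of the recursion is exp(θ*) = mean(x_obs), i.e., the usual Poisson MLE, recovered without ever evaluating the likelihood.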

## Rennes street art [#3]

Posted in pictures, Running, Travel with tags Brittany, graffitis, MCqMC 2018, Rennes, street art on July 24, 2018 by xi'an

## unusual clouds [jatp]

Posted in pictures, Travel, Wines with tags IPA, beer, Brittany, Cité d'Ys, clouds, jatp, MCqMC 2018, Rennes, summer light, sunset, thunderstorm on July 19, 2018 by xi'an

## a thread to bin them all [puzzle]

Posted in Books, Kids, R, Travel with tags Brittany, clouds, FiveThirtyEight, mathematical puzzle, meerkats, R, Rennes, sunset, The Riddler on July 9, 2018 by xi'an

**T**he most recent riddle on the Riddler consists in finding the shortest sequence of digits (in 0,1,…,9) such that all 10⁴ numbers between 0 (or 0000) and 9,999 appear as a group of four consecutive digits. This sequence is obviously at least 10⁴+3 digits long, but how much longer need it be? On my trip to Brittany last weekend, I wrote an R code that first constructs the sequence at random, picking with high preference the next digit among those producing a new four-digit number

```r
tenz <- 10^(0:3)
# map a window of four digits (oldest first) to its index in 1..10^4
wn2dg <- function(dz) 1 + sum(dz * tenz)
seqz <- rep(0, 10^4)                     # visit counts of four-digit windows
snak <- wndz <- sample(0:9, 4, rep = TRUE)
seqz[wn2dg(wndz)] <- 1
while (min(seqz) == 0) {
  wndz[1:3] <- wndz[-1]                  # slide the window
  wndz[4] <- 0
  # favour digits whose window has not been visited yet; candidate
  # windows differ in the last digit, which carries weight 10^3
  wndz[4] <- sample(0:9, 1,
    prob = .01 + .99 * (seqz[wn2dg(wndz) + 1e3 * (0:9)] == 0))
  snak <- c(snak, wndz[4])
  sek <- wn2dg(wndz)
  seqz[sek] <- seqz[sek] + 1
}
```

which usually returns a sequence of length above 75,000. I then looked through the sequence to eliminate useless replicas

```r
for (i in sample(4:(length(snak) - 5))) {
  if (i > length(snak) - 3) next         # indices shift as digits are removed
  # digit i is removable if the four windows through it all occur elsewhere
  if ((seqz[wn2dg(snak[(i-3):i])] > 1) &
      (seqz[wn2dg(snak[(i-2):(i+1)])] > 1) &
      (seqz[wn2dg(snak[(i-1):(i+2)])] > 1) &
      (seqz[wn2dg(snak[i:(i+3)])] > 1)) {
    # the four windows through position i each lose one occurrence...
    seqz[wn2dg(snak[(i-3):i])]     <- seqz[wn2dg(snak[(i-3):i])] - 1
    seqz[wn2dg(snak[(i-2):(i+1)])] <- seqz[wn2dg(snak[(i-2):(i+1)])] - 1
    seqz[wn2dg(snak[(i-1):(i+2)])] <- seqz[wn2dg(snak[(i-1):(i+2)])] - 1
    seqz[wn2dg(snak[i:(i+3)])]     <- seqz[wn2dg(snak[i:(i+3)])] - 1
    snak <- snak[-i]
    # ...and the three windows bridging the gap each gain one
    seqz[wn2dg(snak[(i-3):i])]     <- seqz[wn2dg(snak[(i-3):i])] + 1
    seqz[wn2dg(snak[(i-2):(i+1)])] <- seqz[wn2dg(snak[(i-2):(i+1)])] + 1
    seqz[wn2dg(snak[(i-1):(i+2)])] <- seqz[wn2dg(snak[(i-1):(i+2)])] + 1
  }
}
```

until none is found. A first attempt produced 12,911 terms in the sequence, a second one 12,913, and a third one 12,871. Rather consistent figures, but not concentrated enough to believe a true minimum had been reached. An overnight run produced 12,779 as the lowest value. Checking the answer the week after, it appears that 10⁴+3 *is* the correct answer!
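That 10⁴+3 is achievable follows from de Bruijn sequences: a sequence of length 10⁴+3 in which every four-digit window appears exactly once. A sketch of the classical greedy "prefer-largest" construction (Martin's 1934 algorithm), with my own variable names, reaches the bound directly:

```r
# Greedy construction of a (linearised) de Bruijn sequence B(10,4):
# start from four 0's and repeatedly append the largest digit that does
# not recreate an already-seen four-digit window (Martin's algorithm).
k <- 10; n <- 4
pow <- k^((n - 1):0)
idx <- function(w) 1 + sum(w * pow)       # window of n digits -> 1..k^n
s <- rep(0, n)
seen <- rep(FALSE, k^n)
seen[idx(s)] <- TRUE
repeat {
  w <- s[(length(s) - n + 2):length(s)]   # last n-1 digits
  d <- (k - 1):0                          # candidate digits, largest first
  ok <- d[!seen[sapply(d, function(x) idx(c(w, x)))]]
  if (length(ok) == 0) break              # stuck only once all windows seen
  s <- c(s, ok[1])
  seen[idx(c(w, ok[1]))] <- TRUE
}
length(s)                                 # 10^4 + 3 = 10,003
```

Martin's theorem guarantees the greedy scheme never gets stuck before covering all 10⁴ windows, so the simulated-annealing-style trimming above could never have beaten 10,003.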