Archive for the Statistics Category

avernian landscapes (#6)

Posted in Mountains, pictures, Running, Statistics, Travel on August 30, 2014 by xi'an

[image: sancy4]

high-dimensional stochastic simulation and optimisation in image processing [day #2]

Posted in pictures, Statistics, Travel, Uncategorized, University life, Wines on August 30, 2014 by xi'an

After a nice morning run down Leigh Woods and on the muddy banks of the Avon river, I attended a morning session on hyperspectral image non-linear modelling, a topic about which I knew nothing beforehand. Hyperspectral images are 3-D images made of several wavelengths to improve their classification as a mixture of several elements. The non-linearity is due to the multiple reflections from the ground as well as imperfections in the data collection. I found this new setting of clear interest, from using mixtures to exploring Gaussian processes and Hamiltonian Monte Carlo techniques on constrained spaces… Not to mention the “debate” about using Bayesian inference versus optimisation. It was overall a day of discovery as I am unfamiliar with the image processing community (being the outlier in this workshop!) and with its techniques. The problems mostly qualify as partly linear high-dimensional inverse problems, with rather standard if sometimes hybrid MCMC solutions. (The day ended even more nicely with another long run in the fields of Ashton Court and a conference dinner by the river…)

 

avernian landscapes (#5)

Posted in Mountains, pictures, Running, Statistics, Travel on August 29, 2014 by xi'an

[image: sancy11]

high-dimensional stochastic simulation and optimisation in image processing [day #1]

Posted in pictures, Statistics, Travel, Uncategorized, University life, Wines on August 29, 2014 by xi'an

Even though I flew through Birmingham (and had to endure the fundamental randomness of trains in Britain), I managed to reach the “High-dimensional Stochastic Simulation and Optimisation in Image Processing” conference location (in Goldney Hall Orangery) in due time to attend the (second) talk by Christophe Andrieu. He started with an explanation of the notion of controlled Markov chain, which reminded me of our early and famous-if-unpublished paper on controlled MCMC. (The label “controlled” was inspired by Peter Green, who pointed out to us the different meanings of controlled in French [meaning checked or monitored] and in English. We use it here in the English sense, obviously.) The main focus of the talk was on the stability of controlled Markov chains, with of course connections with our controlled MCMC of old, for instance the case of the coerced acceptance probability. Which happened to be not that stable! The central tool was Lyapounov functions. (Making me wonder whether or not it would make sense to envision the meta-problem of adaptively estimating the adequate Lyapounov function from the MCMC outcome.)
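As a side note for readers curious about what coercing the acceptance probability looks like in practice, here is a minimal sketch (in Python, on a toy Gaussian target, with the target rate and step sizes chosen purely for illustration, and in no way the algorithm discussed in the talk): a random-walk Metropolis sampler whose proposal scale is adapted by stochastic approximation so that the empirical acceptance rate is pushed towards a fixed target value.

```python
# Minimal sketch of a "coerced acceptance rate" adaptive random-walk Metropolis
# sampler: the log proposal scale is updated by a Robbins-Monro recursion so
# that the acceptance rate drifts towards target_rate. Target density, target
# rate, and step sizes are illustrative choices, not anything from the talk.
import numpy as np

rng = np.random.default_rng(2)

def log_pi(x):
    """Unnormalised log-target: a standard normal, for illustration only."""
    return -0.5 * x**2

def coerced_rwm(n_iter=20_000, target_rate=0.44, x0=0.0, log_scale0=0.0):
    x, log_scale = x0, log_scale0
    chain, scales = np.empty(n_iter), np.empty(n_iter)
    for t in range(n_iter):
        y = x + np.exp(log_scale) * rng.normal()
        acc = min(1.0, np.exp(log_pi(y) - log_pi(x)))
        if rng.uniform() < acc:
            x = y
        # Robbins-Monro update pushing the acceptance rate towards target_rate;
        # the decaying step size makes the adaptation diminishing.
        log_scale += (acc - target_rate) / (t + 1) ** 0.6
        chain[t], scales[t] = x, np.exp(log_scale)
    return chain, scales

chain, scales = coerced_rwm()
print(f"final proposal scale {scales[-1]:.2f}, sample sd {chain.std():.3f}")
```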

As I had difficulties following the details of the convex optimisation talks in the afternoon, I eloped to work on my own and returned to the posters & wine session, where the small number of posters allowed for the proper amount of interaction with the speakers! Talking about the relevance of variational Bayes approximations and of possible tools to assess it, about the use of new metrics for MALA and of possible extensions to Hamiltonian Monte Carlo, about Bayesian modellings of fMRI and of possible applications of ABC in this framework. (No memorable wine to make the ‘Og!) Then a quick if reasonably hot curry and it was already bed-time after a rather long and well-filled day!

capture-recapture homeless deaths

Posted in Statistics, Travel, University life on August 28, 2014 by xi'an

[image: Paris and la Seine, from Pont du Garigliano, Oct. 20, 2011]

In the newspaper I grabbed in the corridor to my plane today (flying to Bristol to attend the SuSTaIn image processing workshop on “High-dimensional Stochastic Simulation and Optimisation in Image Processing”, where I was kindly invited and most readily accepted the invitation), I found a two-page entry on estimating the number of homeless deaths using capture-recapture. Besides the sheer concern about the very high mortality rate among homeless persons (expected lifetime, 48 years; around 7000 deaths in France between 2008 and 2010) and the dreadful realisation that an increasing number of kids are dying in the streets, I was obviously interested in this use of capture-recapture methods, as I had briefly interacted with researchers from INED working on estimating the number of (living) homeless persons about 15 years ago. Glancing at the original paper once I had landed, there was alas no methodological innovation in the approach, which was based on the simplest maximum likelihood estimate. I wonder whether or not more advanced models and [Bayesian] methods of inference could [or should] be used on such data, like introducing covariates in the process, for instance when conditioning the probability of (cross-)detection on the cause of death.
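For readers unfamiliar with the approach, here is a hedged illustration of the simplest two-list capture-recapture estimator (Lincoln-Petersen, plus Chapman's bias-corrected variant); the counts below are made up for illustration and are definitely not the figures of the paper.

```python
# Illustration of the simplest two-source capture-recapture MLE
# (Lincoln-Petersen), with hypothetical counts rather than the paper's data.

def lincoln_petersen(n1, n2, m):
    """MLE of the total N from two lists: n1 on list 1, n2 on list 2, m on both."""
    return n1 * n2 / m

def chapman(n1, n2, m):
    """Chapman's bias-corrected variant, usable even when m = 0."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1

# Hypothetical counts: deaths recorded by an association (n1), by an
# administrative registry (n2), and appearing in both sources (m).
n1, n2, m = 300, 250, 120
print(f"Lincoln-Petersen estimate: {lincoln_petersen(n1, n2, m):.0f}")
print(f"Chapman estimate:          {chapman(n1, n2, m):.0f}")
```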

understanding the Hastings algorithm

Posted in Books, Statistics on August 26, 2014 by xi'an

David Minh and Paul Minh [who wrote the 2001 book Applied Probability Models] have recently arXived a paper on “understanding the Hastings algorithm”. They revert to the form of the acceptance probability suggested by Hastings (1970):

\rho(x,y) = s(x,y) \left(1+\dfrac{\pi(x) q(y|x)}{\pi(y) q(x|y)}\right)^{-1}

where s(x,y) is a symmetric function keeping the above acceptance probability between 0 and 1, and q is the proposal. This obviously includes the standard Metropolis-Hastings acceptance probability, as well as Barker’s (1965):

\rho(x,y) = \left(1+\dfrac{\pi(x) q(y|x)}{\pi(y) q(x|y)}\right)^{-1}

which is known to be less efficient by accepting less often (see, e.g., Antonietta Mira’s PhD thesis). The authors also consider the alternative

\rho(x,y) = \min(\pi(y)/ q(y|x),1)\,\min(q(x|y)/\pi(x),1)

which I had not seen earlier. It is a rather intriguing quantity in that it can be interpreted as (a) a simulation of y from the cutoff target corrected by reweighting the previous x into a simulation from q(x|y); (b) a sequence of two acceptance-rejection steps, each concerned with a correspondence between target and proposal for x or y. There is an obvious caveat in this representation when the target is unnormalised, since the ratio may then be arbitrarily small… Yet another alternative could be proposed in this framework, namely the delayed acceptance probability of our paper with Marco and Clara, one special case being

\rho(x,y) = \min(\pi_1(y)q(x|y)/\pi_1(x) q(y|x),1)\,\min(\pi_2(y)/\pi_2(x),1)

where

\pi(x)\propto\pi_1(x)\pi_2(x)

is an arbitrary decomposition of the target. An interesting remark in the paper is that any Hastings representation can alternatively be written as

\rho(x,y) = \min(\pi(y)/k(x,y)q(y|x),1)\,\min(k(x,y)q(x|y)/\pi(x),1)

where k(x,y) is a (positive) symmetric function. Hence every single Metropolis-Hastings algorithm is also a delayed acceptance algorithm, in the sense that it can be interpreted as a two-stage decision.
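To make the above forms more concrete, here is a minimal sketch (in Python, on a toy Gaussian target with a symmetric random-walk proposal, both illustrative choices and not anything taken from the paper) of the generic Hastings acceptance probability, with the choices of s(x,y) that recover the Metropolis-Hastings and Barker versions:

```python
# Minimal sketch of the Hastings (1970) acceptance probability
#   rho(x,y) = s(x,y) * (1 + pi(x)q(y|x) / (pi(y)q(x|y)))^{-1}
# on a toy 1-D target with a Gaussian random-walk proposal (so the q ratio
# cancels). Target, proposal scale, and helper names are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def log_pi(x):
    """Unnormalised log-target: a standard normal, for illustration only."""
    return -0.5 * x**2

def hastings_rho(x, y, s_fn):
    """General Hastings acceptance probability with a symmetric s(x,y)."""
    r = np.exp(log_pi(y) - log_pi(x))   # pi(y)q(x|y) / (pi(x)q(y|x)), q symmetric
    return s_fn(r) * r / (1.0 + r)      # s(x,y) * (1 + 1/r)^{-1}

# s(x,y) = 1 + min(r, 1/r) recovers Metropolis-Hastings: rho = min(1, r)
s_metropolis = lambda r: 1.0 + min(r, 1.0 / r)
# s(x,y) = 1 recovers Barker (1965): rho = r / (1 + r)
s_barker = lambda r: 1.0

def sampler(s_fn, n_iter=20_000, scale=1.0):
    x, accept = 0.0, 0
    chain = np.empty(n_iter)
    for t in range(n_iter):
        y = x + scale * rng.normal()
        if rng.uniform() < hastings_rho(x, y, s_fn):
            x, accept = y, accept + 1
        chain[t] = x
    return chain, accept / n_iter

for name, s_fn in [("Metropolis-Hastings", s_metropolis), ("Barker", s_barker)]:
    chain, rate = sampler(s_fn)
    print(f"{name:20s} acceptance rate {rate:.2f}  mean {chain.mean():+.3f}")
```

On this toy example the Barker version indeed accepts less often than Metropolis-Hastings, since r/(1+r) is always below min(1,r), in line with the Peskun ordering mentioned above.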

The second part of the paper considers an extension of the accept-reject algorithm where a value y proposed from a density q(y) is accepted with probability

\min(\pi(y)/ Mq(y),1)

and else the current x is repeated, where M is an arbitrary constant (including of course the case where it is a proper bound on the target-to-proposal ratio, as in the original accept-reject algorithm). Curiouser and curiouser, as Alice would say! While I think I have read some similar proposal in the past, I am a wee bit intrigued by the idea of using only the proposed quantity y to decide about acceptance, since it does not provide the benefit of avoiding generations that are rejected. In this sense, it appears as the opposite of our vanilla Rao-Blackwellisation. (The paper however considers the symmetric version, called the independent Markovian minorizing algorithm, that only depends on the current x.) In the extension to proposals that depend on the current value x, the authors establish that this Markovian AR is in fine equivalent to the generic Hastings algorithm, hence providing an interpretation of the “mysterious” s(x,y) through a local maximising “constant” M(x,y). A possibly missing section in the paper is the comparison of the alternatives, although the authors mention Peskun’s (1973) result that exhibits the Metropolis-Hastings form as the optimum.
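And here is a toy sketch of this accept-reject extension (my own illustration, not the authors' implementation): y is proposed from a fixed Student t density and accepted with the above probability, the current x being repeated otherwise. I take M to be a proper bound on the target-to-proposal ratio, so that accepted values are exact draws from the target; for an arbitrary (too small) M the accepted values would instead follow a density proportional to min(π(y)/M, q(y)).

```python
# Toy sketch of the "Markovian accept-reject" scheme described above:
# draw y from a fixed proposal q, accept with probability min(pi(y)/(M q(y)), 1),
# and repeat the current x otherwise. Target, proposal, and M are illustrative;
# here M bounds pi/q, as for standard accept-reject, so each accepted y is an
# exact draw from pi and repeating x on rejection leaves pi invariant.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Toy setup: target pi = N(0, 1), proposal q = Student t with 3 d.f.
log_pi = stats.norm(0.0, 1.0).logpdf
q = stats.t(df=3)
M = 1.4                      # sup pi/q is about 1.17 here, so M is a proper bound

def markov_ar_chain(n_iter=20_000, x0=0.0):
    x, chain = x0, np.empty(n_iter)
    for t in range(n_iter):
        y = q.rvs(random_state=rng)
        log_acc = log_pi(y) - np.log(M) - q.logpdf(y)   # log of pi(y)/(M q(y))
        if np.log(rng.uniform()) < min(log_acc, 0.0):
            x = y                                       # accept the proposal
        chain[t] = x                                    # else repeat current x
    return chain

chain = markov_ar_chain()
print(f"sample mean {chain.mean():+.3f}, sample sd {chain.std():.3f}")  # ~0, ~1
```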

NIPS workshops (Dec. 12-13, 2014, Montréal)

Posted in Kids, Statistics, Travel, University life on August 25, 2014 by xi'an

[image: Run_ABC]

Following a proposal put forward by Ted Meeds, Max Welling, Richard Wilkinson, Neil Lawrence and myself, our ABC in Montréal workshop has been accepted by the NIPS 2014 committee and will thus take place on either Friday, Dec. 12, or Saturday, Dec. 13, at the end of the main NIPS meeting (Dec. 8-10). (Despite the title, this workshop is not part of the ABC in … series I started five years ago. It will only last a single day, with a few invited talks and no posters. And no free wine & cheese party.) On top of this workshop, our colleagues Vikash K Mansinghka, Daniel M Roy, Josh Tenenbaum, Thomas Dietterich, and Stuart J Russell have also been successful in their bid for the 3rd NIPS Workshop on Probabilistic Programming, which will presumably be held on the opposite day to ours, as Vikash is speaking at our workshop while I am speaking at theirs. I am yet undecided as to whether or not to attend the main conference, given that I am already travelling a lot this semester and have to teach two courses, including a large undergraduate statistical inference course… Obviously, I will try to attend if our joint paper is accepted by the editorial board! Even though Marco will then be the speaker.
