Archive for covariate

capture-recapture with continuous covariates

Posted in Books, pictures, Statistics, University life on September 14, 2015 by xi'an

This morning, I read a paper by Roland Langrock and Ruth King in a 2013 issue of Annals of Applied Statistics that had gone too far under my desk to be noticed… This problem of using continuous covariates in capture-recapture models is a frustrating one, as it is not clear what one should do at times when the subject, and therefore its covariates, are not observed. This is why I was quite excited by the [trinomial] paper of Catchpole, Morgan, and Tavecchia when they submitted it to JRSS Series B and I was the editor handling it. In the current paper, Langrock and King build a hidden Markov model on the capture history (as in Jérôme Dupuis’s main thesis paper, 1995), as well as a discretised Markov chain model on the covariates and a logit connection between those covariates and the probability of capture. (At first, I thought the Markov model was a sheer unconstrained Markov chain on the discretised space and found it curious that increasing the number of states had a positive impact on the estimation but, blame my Métro environment!, I had not read the paper carefully.)
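To make the structure above concrete, here is a minimal sketch of a forward-algorithm likelihood for a single capture history, with the covariate range cut into m intervals. It is not the authors' implementation: the Gaussian random walk on the covariate, the constant survival probability `phi`, the renormalisation of the mass falling outside `[lo, hi]`, and the replacement of each observed covariate by the midpoint of its interval are all simplifying assumptions of mine, and the names (`hmm_likelihood`, `a`, `b`, `sigma`) are hypothetical.

```python
import math
import numpy as np


def norm_cdf(z):
    """Standard normal CDF, written with math.erf to avoid extra dependencies."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def hmm_likelihood(history, covariates, m, lo, hi, phi, a, b, sigma):
    """Approximate likelihood of one capture history, conditional on first capture.

    history    : sequence of 0/1 with history[0] == 1
    covariates : observed covariate value at each capture, None otherwise
    m, lo, hi  : number of intervals and (assumed) essential range of the covariate
    phi        : survival probability, taken constant here (assumption)
    a, b       : logit(capture probability) = a + b * covariate
    sigma      : sd of the assumed Gaussian random walk on the covariate
    """
    edges = np.linspace(lo, hi, m + 1)
    width = (hi - lo) / m
    mids = 0.5 * (edges[:-1] + edges[1:])

    # Covariate chain: probability of moving from midpoint i into interval j.
    G = np.array([[norm_cdf((edges[j + 1] - mids[i]) / sigma)
                   - norm_cdf((edges[j] - mids[i]) / sigma)
                   for j in range(m)] for i in range(m)])
    G /= G.sum(axis=1, keepdims=True)  # renormalise mass lost outside [lo, hi]

    # Hidden states: m "alive with covariate in interval j" states plus "dead".
    T = np.zeros((m + 1, m + 1))
    T[:m, :m] = phi * G
    T[:m, m] = 1.0 - phi
    T[m, m] = 1.0  # death is absorbing

    # Logit link between covariate (at interval midpoints) and capture probability.
    p = 1.0 / (1.0 + np.exp(-(a + b * mids)))

    def interval(x):
        k = int(np.searchsorted(edges, x, side="right")) - 1
        return min(max(k, 0), m - 1)

    alpha = np.zeros(m + 1)
    alpha[interval(covariates[0])] = 1.0  # state known at first capture
    for seen, x in zip(history[1:], covariates[1:]):
        alpha = alpha @ T
        if seen:
            # Captured: alive, covariate interval known; divide by the width
            # so that the contribution is on the density scale.
            k = interval(x)
            emit = np.zeros(m + 1)
            emit[k] = p[k] / width
        else:
            # Not captured: either alive and missed, or dead.
            emit = np.append(1.0 - p, 1.0)
        alpha = alpha * emit
    return float(alpha.sum())
```

Calling `hmm_likelihood` with the same history and increasing values of `m` lets one watch the value settle, but this only says that the finite sum approaches the integral under the assumed covariate model, not that this model is the right one.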

“The accuracy of the likelihood approximation increases with increasing m.” (p.1719)

While I acknowledge that something has to be done about the missing covariates, and that this approach may be the best one can expect in such circumstances, I nonetheless disagree with the above notion that refining the discretisation by increasing m will improve the likelihood approximation, simply because the model on the covariates that was chosen ex nihilo has no reason to fit the real phenomenon, especially since the values of the covariates impact the probability of capture: the individuals are not (likely to be) missing at random, i.e., independently from the covariates. For instance, in a lizard study on which Jérôme Dupuis worked in the early 1990s, weight and survival were unsurprisingly connected, with a higher mortality during the cold months when food was scarce. Autoregressive-like models on the covariates miss the possibility of sudden changes in the covariates that could impact the capture patterns. I do not know whether or not this has been attempted in this area, but connecting the covariates between individuals at a specific time, so that missing covariates can be inferred from observed covariates, possibly with spatial patterns, would also make sense.

In fine, I fear there is a strong and almost damning limitation to the notion of incorporating covariates into capture-recapture models, namely that, if a covariate is a determinant of capture or non-capture, the non-capture range of the covariate will never be observed and hence cannot be derived from the observed values.

capture-recapture homeless deaths

Posted in Statistics, Travel, University life on August 28, 2014 by xi'an

[Photo: Paris and la Seine, from Pont du Garigliano, Oct. 20, 2011]

In the newspaper I grabbed in the corridor to my plane today (flying to Bristol to attend the SuSTaIn image processing workshop on “High-dimensional Stochastic Simulation and Optimisation in Image Processing” where I was kindly invited and most readily accepted the invitation), I found a two-page entry on estimating the number of homeless deaths using capture-recapture. Besides the sheer concern about the very high mortality rate among homeless persons (expected lifetime, 48 years; around 7000 deaths in France between 2008 and 2010) and the dreadful realisation that there is an increasing number of kids dying in the streets, I was obviously interested in this use of capture-recapture methods as I had briefly interacted with researchers from INED working on estimating the number of (living) homeless persons about 15 years ago. Glancing at the original paper once I had landed, I found alas no methodological innovation in the approach, which was based on the simplest maximum likelihood estimate. I wonder whether or not more advanced models and [Bayesian] methods of inference could [or should] be used on such data, like introducing covariates in the process, for instance by conditioning the probability of (cross-)detection on the cause of death.
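For readers unfamiliar with the "simplest" estimate, here is a small sketch of the classical two-list (Lincoln-Petersen) estimator, together with a version stratified by a covariate such as the cause of death. I am assuming a two-source setting for illustration, which may not match the design of the original paper; the function names and the counts used below are made up.

```python
def lincoln_petersen(n1, n2, m12):
    """Two-list estimate of the total count: n1 * n2 / m12.

    n1, n2 : numbers of deaths recorded by source 1 and by source 2
    m12    : number of deaths recorded by both sources
    """
    if m12 <= 0:
        raise ValueError("the estimate is undefined without overlap between lists")
    return n1 * n2 / m12


def stratified_estimate(strata):
    """Sum of per-stratum estimates, letting detection vary with a covariate.

    strata : iterable of (n1, n2, m12) triples, one per covariate level
             (for instance, one per cause of death).
    """
    return sum(lincoln_petersen(n1, n2, m12) for n1, n2, m12 in strata)


# Hypothetical counts, for illustration only.
pooled = lincoln_petersen(200, 150, 30)
by_cause = stratified_estimate([(120, 90, 27), (80, 60, 3)])
```

With these made-up counts the pooled estimate is 1000 while the stratified one is 2000: when the chance of cross-detection differs between strata, ignoring the covariate can change the answer substantially.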