## capture-recapture with continuous covariates

**T**his morning, I read a paper by Roland Langrock and Ruth King in a 2013 issue of Annals of Applied Statistics that had gone too far under my desk to be noticed… This problem of using continuous covariates in capture-recapture models is a frustrating one, as it is not clear what one should do at the times when the subject, and therefore its covariates, is not observed. This is why I was quite excited by the [trinomial] paper of Catchpole, Morgan, and Tavecchia when they submitted it to JRSS Series B and I was the editor handling it. In the current paper, Langrock and King build a hidden Markov model on the capture history (as in Jérôme Dupuis's main thesis paper, 1995), as well as a discretised Markov chain model on the covariates and a logit connection between those covariates and the probability of capture. (At first, I thought the Markov model was a sheer unconstrained Markov chain on the discretised space and found it curious that increasing the number of states had a positive impact on the estimation but, blame my Métro environment!, I had not read the paper carefully.)

“The accuracy of the likelihood approximation increases with increasing m.” (p.1719)
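To make the construction concrete, here is a minimal sketch of this kind of discretised-HMM likelihood, not the authors' code: the covariate grid, the random-walk transition matrix, and the logit parameters below are all invented for illustration. The continuous covariate is replaced by m grid states plus an absorbing "dead" state, with logit links from the covariate to both survival and capture, and the forward algorithm sums over the unobserved covariate path:

```python
import numpy as np

def logistic(u):
    return 1.0 / (1.0 + np.exp(-u))

def capture_likelihood(history, grid, Gamma, a_phi, b_phi, a_p, b_p):
    """Forward-algorithm likelihood of one capture history (1 = seen, 0 = not
    seen), conditioning on the first capture; hidden states are the m grid
    points of the discretised covariate, plus an absorbing 'dead' state."""
    m = len(grid)
    phi = logistic(a_phi + b_phi * grid)  # survival probability per state
    p = logistic(a_p + b_p * grid)        # capture probability per state
    alpha = np.zeros(m + 1)
    alpha[:m] = 1.0 / m                   # uniform over states at first capture
    for y in history[history.index(1) + 1:]:
        live = (alpha[:m] * phi) @ Gamma                 # survive, covariate moves
        dead = alpha[m] + np.sum(alpha[:m] * (1 - phi))  # die or stay dead
        obs = p if y == 1 else 1.0 - p                   # dead: never captured
        alpha = np.append(live * obs, dead * (0.0 if y == 1 else 1.0))
    return alpha.sum()

# toy discretised random-walk transition matrix on the covariate grid
grid = np.linspace(-2.0, 2.0, 20)
d2 = (grid[:, None] - grid[None, :]) ** 2
Gamma = np.exp(-d2)
Gamma /= Gamma.sum(axis=1, keepdims=True)

lik = capture_likelihood([1, 0, 1, 0], grid, Gamma, 1.0, 0.5, 0.0, 1.0)
```

In this sketch, increasing m refines the grid approximation to the assumed covariate process, which is the sense of the quoted claim.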

While I acknowledge that something has to be done about the missing covariates, and that this approach may be the best one can expect in such circumstances, I nonetheless disagree with the above notion that increasing the number of discretisation states m will improve the likelihood approximation, simply because the model on the covariates, chosen *ex nihilo*, has no reason to fit the real phenomenon, especially since the values of the covariates impact the probability of capture: the individuals are not (likely to go) missing at random, i.e., independently of the covariates. For instance, in a lizard study on which Jérôme Dupuis worked in the early 1990s, weight and survival were unsurprisingly connected, with a higher mortality during the cold months when food was scarce. Using autoregressive-like models on the covariates misses the possibility of sudden changes in the covariates that could impact the capture patterns. I do not know whether or not this has been attempted in this area, but connecting the covariates between individuals at a specific time, so that missing covariates can be inferred from observed covariates, possibly with spatial patterns, would also make sense.

*In fine*, I fear there is a strong and almost damning limitation to the notion of incorporating covariates into capture-recapture models, namely, if a covariate is decisive in determining capture or non-capture, the non-capture range of the covariate will never be observed and hence cannot be inferred from the observed values.

September 17, 2015 at 8:22 am

Thanks very much for discussing our paper! I'm not sure I agree regarding the damning limitation you describe. Say we extend our model so that not only survival, but also the capture probability, depends on the covariate (straightforward to do). Now let's say that in the study there is a bunch of individuals that experience a sudden drop in their covariate values at some point, and as a consequence none of them is captured anymore subsequent to that change (but they are still alive). Provided that the model correctly picks up the relation between the covariate and BOTH survival and capture, it should in principle be able to pick up the major change in the covariate values (e.g., since with the earlier, higher covariate values it would have been unlikely for all of these individuals to die).
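A back-of-the-envelope version of this argument, with made-up survival and capture probabilities standing in for the fitted logit links, shows why "all went uncaptured because the covariate dropped" can dominate "all died at once" as an explanation of the data:

```python
# Hypothetical numbers (not from the paper): n individuals are never seen
# again over T occasions after a sudden drop in their covariate values.
n, T = 10, 3
phi_high = 0.95  # survival probability at the earlier, high covariate value
p_low = 0.05     # capture probability at the low, post-drop covariate value

# Explanation A: all n individuals died at the same occasion
prob_all_died = (1.0 - phi_high) ** n
# Explanation B: all n survived every occasion but went uncaptured each time
prob_all_missed = (phi_high * (1.0 - p_low)) ** (n * T)

print(prob_all_died, prob_all_missed)
```

With these numbers, explanation B is many orders of magnitude more likely than A, which is the sense in which the covariate drop is in principle recoverable from the capture histories alone.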

So my intuition is that the information is there; you just need to formulate an adequate model for the evolution of the covariates, in particular one allowing for sudden changes (i.e., not just a simple AR-type process like the one we use) – easier said than done, but I do feel that it should in principle be possible. However, information on the behaviour of the covariate process at low values would of course be very limited. And there might be identifiability problems.

September 17, 2015 at 8:30 am

Sorry if “damning” sounded aggressive! My issue is with the general notion of having a model where capture, or censoring, or missingness, whatever, depends very much on a specific covariate on which we have hardly any prior information, if any at all. Given that we observe that covariate only for non-missing data, this is a case of “missing not at random”. Which requires a more involved joint modelling of the capture × covariate system, as you mention.

September 17, 2015 at 8:35 am

No worries, it didn’t sound aggressive at all – I didn’t even read it as a criticism of our approach, but as a remark on a general problem with this type of data.

September 14, 2015 at 1:33 am

This is true really of any point process, or of anything that requires a covariate “surface” rather than the values of covariates at a known finite set of points. (Basically I’m saying that this isn’t a damning feature of CR models so much as a basic feature of any model with temporal or spatial covariates [or really any other partially observed set of covariates].)

We do what we can.

I’m a fan of the joint modelling approach (although I’ll just fudge it and use a spline smoother when I can get away with it), but, as with any other prediction problem, we will only predict things that are in some way similar to what we’ve seen.

This actually leads into some interesting questions about covariate quality in these types of datasets. Often things have been postprocessed to within an inch of their life before you get hold of them.

September 14, 2015 at 6:13 pm

No disagreement about this, but… capture-recapture models still have this fairly special feature that the capture event is impacted by the covariates themselves, so that the covariate values are never observed in the case of a non-capture. “We do what we can” should be “we model what we canned, but what of those un-canned?!”