I really like the models derived from capture-recapture experiments, because they encompass latent variables, hidden Markov processes, Gibbs simulation, EM estimation, and hierarchical models in a simple setup with a nice side story to motivate it (at least in Ecology; in Social Sciences, those models are rather associated with sad stories involving homelessness, heroin addiction, or prostitution…). I was thus quite surprised to hear from many that the capture-recapture chapter in Bayesian Core was hard to understand. In a sense, I find it easier than the mixture chapter because the data is discrete and everything can [almost!] be done by hand…

Today I received an email from Cristiano about a typo in The Bayesian Choice concerning capture-recapture models:

“I’ve read the paragraph (4.3.3) in your book and I have some doubts about the proposed formula in example 4.3.3. My guess is that a typo is here, where (n-n_1) instead of n_2 should appear in the hypergeometric distribution.”

It is indeed the case! This mistake has survived the many revisions and reprints of the book and is also found in the French translation, Le Choix Bayésien, in Example 4.19… In both cases, ${n_2 \choose n_2-n_{11}}$ should be ${n-n_1 \choose n_2-n_{11}}$, shame on me! (The mistake does not appear in Bayesian Core.)
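As a quick sanity check of the corrected term, here is a minimal sketch (my own illustration, with made-up values of $n$, $n_1$, $n_2$) of the hypergeometric probability $P(n_{11}) = {n_1 \choose n_{11}}{n-n_1 \choose n_2-n_{11}}/{n \choose n_2}$: with the ${n-n_1 \choose n_2-n_{11}}$ factor, the probabilities sum to one over the support, as Vandermonde's identity requires.

```python
from math import comb

def hypergeom_pmf(n11, n, n1, n2):
    """P(n11 recaptures) when n2 individuals are drawn from a population
    of n, of which n1 are marked: the corrected second factor is
    comb(n - n1, n2 - n11), not comb(n2, n2 - n11)."""
    return comb(n1, n11) * comb(n - n1, n2 - n11) / comb(n, n2)

# made-up example values: population 50, captures of sizes 20 and 15
n, n1, n2 = 50, 20, 15
total = sum(hypergeom_pmf(k, n, n1, n2) for k in range(0, min(n1, n2) + 1))
print(total)  # sums to 1, as a probability distribution should
```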

to which I can only suggest incorporating the error-in-variables structure, i.e. the possible confusion in identifying individuals, within the model and running a Gibbs sampler that simulates iteratively the latent variables "true numbers of individuals in captures 1 and 2" and the parameters given those latent variables. This problem of counting the same individual twice or more has obvious applications in Ecology, when animals are only identified by watchers, as in whale sightings, and in Social Sciences, when individuals lack identification. [To answer the overestimation question specifically: this is clearly the case, since $n_1$ and $n_2$ are larger than the true counts while $n_{11}$ presumably remains the same…]
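The overestimation point can be illustrated numerically with the classical Lincoln–Petersen estimator $\hat n = n_1 n_2 / n_{11}$ (a frequentist shortcut, not the Bayesian analysis of the book, and the counts below are made up): inflating $n_1$ and $n_2$ through double-counting while $n_{11}$ stays fixed mechanically pushes the estimate up.

```python
def lincoln_petersen(n1, n2, n11):
    """Classical point estimate of population size from two captures."""
    return n1 * n2 / n11

# hypothetical true capture counts
n1, n2, n11 = 60, 50, 15
est_true = lincoln_petersen(n1, n2, n11)

# suppose five individuals are double-counted on each capture occasion:
# n1 and n2 are inflated, but the matched recaptures n11 are unchanged
est_inflated = lincoln_petersen(n1 + 5, n2 + 5, n11)

print(est_true, est_inflated)  # the inflated estimate is larger
```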