*(This is the third post on Error and Inference, yet again being a raw and naïve reaction to a linear reading rather than a deeper and more informed criticism.)*

“Statistical knowledge is independent of high-level theories.” (A. Spanos, *Error and Inference*, 2010, p.242)

**T**he sixth chapter of *Error and Inference* is written by Aris Spanos and deals with the issues of testing in econometrics. On the one hand, it provides a fairly interesting entry into the history of economics and the resistance to data-backed theories, primarily because the buffers between data and theory are manifold (“*huge gap between economic theories and the available observational data*”, p.203). On the other hand, what I fail to understand in the chapter is the meaning of theory, as it seems very distinct from what I would call a (statistical) model. The sentence “*statistical knowledge, stemming from a statistically adequate model allows data to ‘have a voice of its own’ (…) separate from the theory in question and its succeeds in securing the frequentist goal of objectivity in theory testing*” (p.206) is puzzling in this respect. (Actually, I would have liked to see a clear meaning put to this “voice of its own”, as it otherwise sounds mostly like a catchy phrase…) Similarly, Spanos distinguishes between three types of models: primary/theoretical; experimental/structural (“*the structural model contains a theory’s substantive subject matter information in light of the available data*”, p.213); and data/statistical (“*the statistical model is built exclusively using the information contained in the data*”, p.213). I have trouble understanding how testing can distinguish between those types of models: as a naïve reader, I would have thought that only the statistical model could be tested by a statistical procedure, even though I would not call the above a proper definition of a statistical model (esp. since Spanos writes a few lines below that the statistical model “*would embed (nest) the structural model in its context*” (p.213)). The normal example followed on pages 213-217 does not help *[me]* to make sense of this distinction: it simply illustrates the impact of failing some of the defining assumptions (normality, time homogeneity [in mean and variance], independence). (As an aside, the discussion about the poor estimation of the correlation on pp.214-215 does not help, because it involves a second variable Y that is not defined for this example.) It would of course be nice if the “noise” in a statistical/econometric model could be studied in complete separation from the structure of this model; however, the two seem too irremediably intermingled to allow this partition of roles. I thus do not see how the “statistically adequate model is independent from the substantive information” (p.217), i.e. by which rigorous process one can isolate the “chance” parts of the data to build and validate a statistical model *per se*. The simultaneous equation model (SEM, pp.230-231) is more illuminating of the distinction set by Spanos between structural and statistical models/parameters, even though the difference in this case boils down to a question of identifiability.
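To make the point about the normal example of pages 213-217 a little more concrete, here is a minimal sketch of what failing the time-homogeneity assumption does to a simple Normal model. It is not Spanos's own example: the linear trend, its slope, the sample size, the seed, and the helper names `simulate` and `lag1_autocorrelation` are all assumptions chosen for illustration. The same noise sequence is used with and without a trend in the mean, so any difference in the summaries comes from the misspecified mean alone.

```python
import random
import statistics


def lag1_autocorrelation(xs):
    """Sample lag-1 autocorrelation, a crude check of the independence assumption."""
    m = statistics.fmean(xs)
    num = sum((a - m) * (b - m) for a, b in zip(xs, xs[1:]))
    den = sum((a - m) ** 2 for a in xs)
    return num / den


def simulate(n=200, slope=0.02, seed=1):
    """Return (iid N(0,1) noise, the same noise plus a linear trend in the mean).

    n, slope and seed are illustrative assumptions, not values from the book.
    """
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    trended = [slope * t + e for t, e in enumerate(noise)]
    return noise, trended


noise, trended = simulate()

# Fit the same "simple Normal" summaries to both series.
summary = {
    "var_iid": statistics.variance(noise),
    "var_trended": statistics.variance(trended),
    "lag1_iid": lag1_autocorrelation(noise),
    "lag1_trended": lag1_autocorrelation(trended),
}

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions, the trended series should show an inflated variance estimate and a clearly positive lag-1 autocorrelation, even though its noise is independent by construction. In other words, an unmodelled feature of the mean shows up as apparent dependence in the "chance" part, which is one way of seeing why I find the separation between noise and structure hard to accept.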