Models: there’s wrong and then there is Wrong

One of my favourite quotes is by George Box: "All models are wrong, but some are useful." If you work with models and understand their place in the universe, you may already agree. However, there is more than one type of wrong, and while it is not always possible to tell which is which once the milk has been spilt, the difference is important.

Models are always wrong in that they aren’t a perfect replica of the “real thing” being modelled. Some may argue there are exceptions – that certain models do perfectly capture the underlying reality – but I haven’t been convinced yet. The fundamental point is: if the model is the same as reality, what is the need for the model?

The purpose of most models is to provide a useful way of understanding an extremely complex system. Extremely complex systems are difficult to understand in their entirety. Economists are regularly getting bashed for throwing dangerous phrases like ceteris paribus around in their commentary and conclusions. Why the insistence on holding all other things equal? Because their model is only complex enough to understand a few components of reality and so is wrong when it comes to those other areas. This is problematic when those other things turn out to be important and unequal. The technical term for these models is “not useful”. I’ll give George the credit for this term too.

Nobody said it was going to be easy…

To build a useful model, that is. Understanding the benefits of modelling specific components requires an in-depth, often intuitive feel for the problem at hand. A consultant brought in from the outside won’t necessarily have this unless the problem is a common or generic one. A good consultant will spend a significant amount of time listening and understanding the problem, the environment and the broader issues that will influence the real benefit drivers. Recognising the costs of modelling individual pieces of the problem is more of a technical matter. Knowledge of model-building approaches, computer systems and applications, statistical techniques and actuarial projections, database management and data mining, logical thought and system building all come into the process. Knowledge is required, but there’s often little substitute for experience. Throw in some serious academic training and we can start to hit Excel.

But what about the other Wrong?

The wrong I’ve discussed so far is a pretty mild sort of wrong. Intended, required, carefully thought through and ultimately useful. But what about Wrong in a simpler form? Wrong because a mistake was made? Wrong because a spreadsheet included errors? The real-world experience of model errors, small and very, very large, is compelling. Mistakes do happen. This post doesn’t deal with how to prevent or reduce errors (plan, document, independent review etc.), but rather with how one classifies an error once it has been discovered.

A recent example I experienced was where a mistake had been made. Unfortunately for everyone it was one of the large, conspicuous and nasty types. The cause of the mistake could have been anything from incorrect proprietary models to incompetence, with lack of judgement, lack of review, weak control processes and lack of ownership of risk management protocols floating around somewhere in between. It is impossible to tell what was intended when the mistake was originally made, since there is no record of the intention, why it was done, how the decision was made, what checks were performed and who gave the thumbs up to go ahead. Unobserved and unrecorded history makes for compelling spy stories and thrillers, but not so great in dry high school textbooks.

The little-known Wrong before other wrongs

Given the story above, the Wrong here seems to be the lack of a clear objective, understood and documented at the outset. So often, the simple act of framing a problem correctly makes giant leaps towards its resolution. This is often the Wrong that precedes other wrongs:

  • Know what you are trying to do;
  • Make sure you understand why; and
  • Be clear and specific about describing it so that you and everyone else are on the same page.

Another of my favourite quotes is by George Bernard Shaw: “The single biggest problem in communication is the illusion that it has taken place”. That last bullet above isn’t as simple as it seems.

Reference for this post

Box, G.E.P. (1979). Robustness in the strategy of scientific model building. In R.L. Launer and G.N. Wilkinson (Eds.), Robustness in Statistics. New York: Academic Press.