For a factor C to be a confounder of the relationship between E and D (purple arrow), it must fulfil three conditions: (1) it must be a risk factor for the disease D (orange arrow); (2) it must be associated with the exposure E, i.e. the risk factor of interest (blue arrows); and (3) it must not be an effect of the exposure, nor be part of the causal pathway. To decide whether a factor may be a confounder, it is therefore essential to have sufficient knowledge of the (potential) biological relationships.

A relatively simple example of confounding is the following. Suppose we studied a potential relationship between grey hair and death (Figure 2) and found that the presence of grey hair multiplied the risk of death by 10. We could then ask ourselves whether age is a confounder. We know that (1) age is a risk factor for death; (2) there is an association between age and grey hair; and (3) age is not an effect of grey hair. This suggests that age may indeed confound the relationship between grey hair and death. In other words, the ‘real’ effect on death is probably caused by age rather than by grey hair.
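The grey hair example can be made concrete with a small calculation. The sketch below uses invented counts (they are assumptions for illustration, not data from any study) and compares the crude relative risk with an age-adjusted one. The Mantel-Haenszel summary is used here as one common way to adjust for a stratified confounder; the function names are hypothetical.

```python
# Hypothetical counts, invented for illustration only (not real data).
# Each age stratum maps hair colour to (deaths, persons).
strata = {
    "young": {"grey": (1, 100), "not_grey": (9, 900)},
    "old": {"grey": (90, 900), "not_grey": (10, 100)},
}


def crude_relative_risk(strata):
    """Relative risk of death for grey vs. not grey hair, ignoring age."""
    deaths_grey = sum(s["grey"][0] for s in strata.values())
    persons_grey = sum(s["grey"][1] for s in strata.values())
    deaths_other = sum(s["not_grey"][0] for s in strata.values())
    persons_other = sum(s["not_grey"][1] for s in strata.values())
    return (deaths_grey / persons_grey) / (deaths_other / persons_other)


def mantel_haenszel_relative_risk(strata):
    """Age-adjusted relative risk (Mantel-Haenszel summary over strata)."""
    numerator = 0.0
    denominator = 0.0
    for s in strata.values():
        a, n1 = s["grey"]      # deaths and persons with grey hair
        c, n0 = s["not_grey"]  # deaths and persons without grey hair
        total = n1 + n0
        numerator += a * n0 / total
        denominator += c * n1 / total
    return numerator / denominator


crude = crude_relative_risk(strata)
adjusted = mantel_haenszel_relative_risk(strata)
print(f"crude RR:        {crude:.2f}")
print(f"age-adjusted RR: {adjusted:.2f}")
```

With these invented counts, the risk of death within each age stratum is the same with and without grey hair (1% among the young, 10% among the old). The crude relative risk is nevertheless about 4.8, because grey hair is concentrated among the old, whereas the age-adjusted relative risk is 1.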

If another investigator were interested in the relationship between age and death, they could wonder whether grey hair confounds that relationship (Figure 3).

We know that (1) grey hair is not a ‘real’ risk factor for death; (2) grey hair is associated with age; but also (3) grey hair is an effect of age. Therefore, grey hair is not a confounder of the relationship between age and death. In a case like this, adjusting for grey hair, which is not a confounder, would take away at least part of the real effect of age.

In medicine, the aim of statistical modelling is often to unravel the aetiology of diseases (or other outcomes). In these cases it is crucial to identify confounding in the way described above and to adjust for it in order to estimate the real effect, i.e. the aetiological importance of an exposure E.

Statistical models are also used to predict a particular disease or outcome. The investigators' purpose is then to build a prognostic model that predicts this disease or outcome as well as possible, and any factor that improves the prediction may be included in the model, regardless of whether it is a confounder. The risk estimates (e.g. relative risk, odds ratio) derived from such models, however, may no longer reflect the aetiological importance of the factors included.
For further reading
1. Rothman K. Epidemiology: an introduction. Oxford University Press, 2002.
2. Weinberg CR. Toward a clearer definition of confounding. Am J Epidemiol 1993; 137:1-8.

Kitty Jager
Managing Director of the ERA-EDTA Registry