Examples of misclassification of disease status are failing to record a myocardial infarction in a patient who did have a heart attack, or wrongly coding a transplant patient as lost to follow-up because his death was not reported. Misclassification of disease status may be due to an incorrect (or missed) diagnosis, the patient’s self-report and, again, incorrect coding. In all such cases exposure or disease is misclassified, that is, placed in the wrong category.
Misclassification of exposure may differ between persons with the disease and persons without the disease. Conversely, misclassification of disease may differ between persons with and without the exposure. This is called differential misclassification. If misclassification does not depend on disease or exposure status, it is called non-differential misclassification. Most epidemiological studies suffer from non-differential misclassification to some extent, if only because there is always some random measurement error or some mistakes made in coding.

Suppose we study the relationship between salt intake and the occurrence of chronic renal failure (CRF). The table shows the data of a hypothetical case-control study. The left-hand side of the table shows the raw data and the odds ratio (taken here as an estimate of the relative risk) had all subjects been correctly classified. The data suggest an odds ratio of 4.8, meaning that the risk of CRF is almost 5 times as high in individuals on a high salt diet. The right-hand side shows the case where data have been misclassified in both directions: 20% of individuals on a high salt diet have been misclassified as not being on a high salt diet, and vice versa. Non-differential misclassification of a two-category exposure such as this one dilutes the effect, that is, it biases the estimate towards no association. The odds ratio is now 2.4, which is an underestimation of the real effect.
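| | Correct classification: high salt diet, yes | Correct classification: high salt diet, no | Non-differential misclassification: high salt diet, yes | Non-differential misclassification: high salt diet, no |
|---|---|---|---|---|
| CRF | 130 | 100 | 130 - 26 + 20 = 124 | 100 + 26 - 20 = 106 |
| Controls | 60 | 220 | 60 - 12 + 44 = 92 | 220 + 12 - 44 = 188 |
| Odds ratio | (130 x 220) / (100 x 60) = 4.8 | | (124 x 188) / (106 x 92) = 2.4 | |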
Table: Non-differential misclassification in a hypothetical case control study
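The arithmetic in the table can be reproduced with a short script. This is only a sketch of the hypothetical example: the function names `odds_ratio` and `misclassify` are illustrative choices, and the 20% error rate is the one assumed in the text, applied equally to cases and controls.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Cross-product odds ratio of a 2x2 case-control table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


def misclassify(exposed, unexposed, rate):
    """Move a fraction `rate` of each exposure group into the other group."""
    leaving_exposed = rate * exposed        # truly exposed, recorded as unexposed
    leaving_unexposed = rate * unexposed    # truly unexposed, recorded as exposed
    return (exposed - leaving_exposed + leaving_unexposed,
            unexposed + leaving_exposed - leaving_unexposed)


# Correctly classified data from the hypothetical study
true_or = odds_ratio(130, 100, 60, 220)

# Same 20% error rate in cases and controls: non-differential misclassification
cases = misclassify(130, 100, 0.20)      # 124 exposed, 106 unexposed
controls = misclassify(60, 220, 0.20)    # 92 exposed, 188 unexposed
observed_or = odds_ratio(cases[0], cases[1], controls[0], controls[1])

print(round(true_or, 1), round(observed_or, 1))  # 4.8 2.4
```

Because the same error rate is applied to both rows, the exposure distributions of cases and controls become more alike, which is why the observed odds ratio moves towards 1.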
A common form of information bias is recall bias. This may occur in case-control studies, where a subject may be interviewed to obtain exposure information after the disease has already occurred. The classical example is the use of interviews to obtain exposure information from mothers who gave birth to babies with birth defects. Mothers of such babies may remember fevers, the use of drugs or other potentially relevant events better than mothers who gave birth to healthy babies, simply because the latter did not go through their memories over and over again to try to find a potential reason for their child’s health problem. In this case the misclassification does depend on whether or not the mother had a baby with a birth defect, and it is therefore differential misclassification. Differential misclassification can lead to either overestimation or underestimation of an effect.
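To illustrate that differential misclassification can push the estimate in either direction, the sketch below reuses the correctly classified counts of the hypothetical salt and CRF study (130/100 cases, 60/220 controls). The 20% under-reporting fraction and the two scenarios are assumptions made purely for illustration, and the function names are hypothetical.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Cross-product odds ratio of a 2x2 case-control table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


def under_report(exposed, unexposed, fraction):
    """Record a fraction of the truly exposed subjects as unexposed."""
    forgotten = fraction * exposed
    return exposed - forgotten, unexposed + forgotten


true_or = odds_ratio(130, 100, 60, 220)

# Scenario A (assumed): only controls under-report exposure, cases recall accurately
ctrl_exposed, ctrl_unexposed = under_report(60, 220, 0.20)
or_controls_forget = odds_ratio(130, 100, ctrl_exposed, ctrl_unexposed)

# Scenario B (assumed): only cases under-report exposure, controls recall accurately
case_exposed, case_unexposed = under_report(130, 100, 0.20)
or_cases_forget = odds_ratio(case_exposed, case_unexposed, 60, 220)

print(round(true_or, 1), round(or_controls_forget, 1), round(or_cases_forget, 1))
```

Under these assumptions the odds ratio rises from about 4.8 to about 6.3 when only controls forget exposures, and falls to about 3.0 when only cases do, so the direction of the bias depends on which group is misclassified.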
For further reading
1. Rothman K. Epidemiology: an introduction. Oxford University Press, 2002

Kitty Jager
Managing Director of the ERA-EDTA Registry