Formulas for sample size calculation differ depending on the study design and the outcome(s) of interest. Sample size calculations are of particular interest in the design of randomized controlled trials (RCTs). RCTs require a substantial investment, so it is important to be reasonably confident that enough patients are included to detect a clinically relevant effect as statistically significant, if that effect really exists. Usually an RCT compares two groups on a specific outcome, and one wants to calculate the appropriate sample size for that comparison.
In order to calculate the sample size, it is required to have some idea of the results expected in a study. One needs to specify the following quantities at the design stage of the study: 1) the significance level (alpha), 2) the power, 3) the smallest effect of interest, and 4) the variability. The significance level is the threshold below which a P-value is considered statistically significant. It equals the probability of finding a statistically significant effect when in truth no effect exists (a type I error), and is most commonly fixed at 0.05 or occasionally at 0.01. The power is the chance of detecting, as statistically significant, a specified difference or effect if it really exists. Usually a power between 80% and 95% is chosen. The smallest effect of interest is the minimal difference between the studied groups that the investigator believes to be clinically relevant and biologically plausible. Finally, the variability of the observations is expressed as the standard deviation.
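To show how these four quantities combine, here is a minimal sketch in Python. It assumes a setting the text does not specify: two equally sized groups, a continuous outcome compared by its means, a two-sided test, a common standard deviation in both groups, and the usual normal approximation, n per group = 2 (z_{1-alpha/2} + z_{power})^2 sd^2 / delta^2. The function name and the example values are hypothetical and serve only as illustration. Exact methods based on the t-distribution give slightly larger numbers.

```python
import math
from statistics import NormalDist


def sample_size_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate number of subjects per group for comparing two means.

    Assumptions (not from the original text): equal group sizes, a common
    standard deviation `sd`, a two-sided test, and the normal approximation.
    `delta` is the smallest difference in means considered relevant.
    """
    if delta == 0:
        raise ValueError("delta must be non-zero")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")

    # Standard normal quantiles for the significance level and the power.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)

    n = 2 * ((z_alpha + z_power) * sd / delta) ** 2
    # Round up: a fraction of a subject cannot be recruited.
    return math.ceil(n)


# Hypothetical example: detect a difference of 5 units when the SD is 10.
print(sample_size_per_group(delta=5, sd=10))              # 63 per group
print(sample_size_per_group(delta=5, sd=10, power=0.90))  # 85 per group
```

Under these assumptions, halving the smallest effect of interest roughly quadruples the required sample size, which is why the choice of that effect deserves careful thought.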
In most studies investigators estimate the difference of interest and the standard deviation based on published data or on their own knowledge and opinion. This means that the calculation of an appropriate sample size partly relies on subjective choices or crude estimates of certain factors which may seem rather artificial to some. However, even if based on estimates and assumptions, a calculation is considerably more useful than a completely arbitrary choice.
Methods for sample size calculations are described in several general statistics textbooks, such as Altman (1991) or Bland (2000). Specialized books that discuss sample size determination in many situations were published by Machin et al. (1998) and Lemeshow et al. (1996). Sample size calculations for study designs other than RCTs are less common and more complicated, and will in most cases require statistical assistance.
From Marlies Noordzij, ERA-EDTA Registry epidemiologist