

MISCELLANEOUS  STATISTICS 

Year: 2015  Volume: 1  Issue: 3  Page: 285-288

Preeminence and prerequisites of sample size calculations in clinical trials
Richa Singhal, Rakesh Rana
Central Council for Research in Ayurvedic Sciences, Ministry of AYUSH, Government of India, New Delhi, India
Date of Web Publication: 23-Feb-2016
Correspondence Address: Richa Singhal Central Council for Research in Ayurvedic Sciences, Ministry of AYUSH, Government of India, Janakpuri, New Delhi India
Source of Support: None, Conflict of Interest: None
DOI: 10.4103/23955414.177301
The key components while planning a clinical study are the study design, study duration, and sample size. These features are an integral part of planning a clinical trial efficiently, ethically, and cost-effectively. This article describes some of the prerequisites for sample size calculation. It also explains that sample size calculation is different for different study designs. The article describes in detail the sample size calculation for a randomized controlled trial when the primary outcome is a continuous variable and when it is a proportion or a qualitative variable.
Keywords: Power and significance level, randomized controlled trials, sample size
How to cite this article: Singhal R, Rana R. Preeminence and prerequisites of sample size calculations in clinical trials. J Pract Cardiovasc Sci 2015;1:285-8
How to cite this URL: Singhal R, Rana R. Preeminence and prerequisites of sample size calculations in clinical trials. J Pract Cardiovasc Sci [serial online] 2015 [cited 2020 Aug 3];1:285-8. Available from: http://www.jpcs.org/text.asp?2015/1/3/285/177301
Introduction   
Sample size is the number of units to be included in a study so that the study has the desired power to correctly detect a clinically meaningful difference in the parameter under study, if such a difference truly exists. The first and foremost step in designing a study is to calculate the sample size; consequently, one of the most frequent requests that statisticians receive from investigators concerns sample size calculation. The sample size must be determined and specified in the study protocol before recruitment starts.
Although most statistical textbooks and many online calculators describe techniques for sample size calculation, it is often difficult for investigators to decide which method to use. Many formulas are available for different types of data and study designs. However, investigators should be cautious while using these formulas and should use them only when they fully understand the study design and know which formula applies in which situation, because these formulas are sensitive to errors, and small differences in the selected parameters can lead to large differences in the sample size.
Why Adequate Sample Size Calculation is the Necessity of the Hour?   
The main purpose of an adequate sample size calculation is to detect a clinically meaningful treatment effect if it actually exists. For a properly planned scientific study, the sample size must be calculated before the study begins. While calculating the sample size, statisticians must take ethical, cost, and time constraints into consideration. ^{[1],[2]} A sample that is too large may waste time, resources, and money, whereas one that is too small may fail to detect a relevant effect that exists. Hence, an optimum number of patients must be included in the study. In the current scenario, the sample size must be calculated while designing and planning the study, as many ethics committees also require it before approving research projects.
Prerequisites for Sample Size Calculation   
Significance level (α) or the Type I error
A Type I error is committed by wrongly rejecting the null hypothesis when it is true. That is, it occurs when the results of the research study show that a difference exists, while in truth there is no difference. In other words, alpha (α) represents the chance of falsely rejecting the null hypothesis of no difference and obtaining a false-positive result. The α is most commonly fixed at 0.05, which means that the researcher accepts no more than a 5% chance of drawing a false-positive conclusion.
Power
A Type II error is committed by accepting the null hypothesis when it is false, that is, by drawing a false-negative conclusion: concluding that there is no difference between the two groups or treatments when in reality there is a difference. The probability of making a Type II error is called beta (β). The quantity (1 − β) is called power. Power is the probability of detecting an effect when it actually exists or, equivalently, the probability of correctly rejecting the null hypothesis when it is false. To calculate the sample size, one needs to specify the power of the study. For example, with a β of 0.20, the power would be 0.80 or 80%.
It may be noted that the values of α and β must be set according to the research question. Researchers must set a low value of α if they want to minimize the Type I error and, similarly, a low value of β if they want to minimize the Type II error. Many studies set α at 0.05 and β at 0.20 (a power of 0.80). These are somewhat arbitrary values, and others are also sometimes used; the conventional range for α is between 0.01 and 0.10, and for β, between 0.05 and 0.20.
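The Z values used in sample size formulas follow directly from the chosen α and β. The short sketch below looks them up with Python's standard library instead of printed Z-tables; the function names are illustrative and are not part of the original article.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution: mean 0, SD 1


def z_alpha_two_sided(alpha: float) -> float:
    """Critical value Z_{alpha/2} for a two-tailed test at significance level alpha."""
    return STD_NORMAL.inv_cdf(1 - alpha / 2)


def z_beta(beta: float) -> float:
    """Z_{beta}: standard normal quantile corresponding to a power of 1 - beta."""
    return STD_NORMAL.inv_cdf(1 - beta)


if __name__ == "__main__":
    # Conventional ranges quoted in the text: alpha 0.01-0.10, beta 0.05-0.20.
    for alpha in (0.01, 0.05, 0.10):
        print(f"alpha = {alpha:.2f} -> Z_alpha/2 = {z_alpha_two_sided(alpha):.4f}")
    for beta in (0.05, 0.10, 0.20):
        print(f"beta = {beta:.2f} (power {1 - beta:.0%}) -> Z_beta = {z_beta(beta):.4f}")
```

Lower values of α or β give larger Z values, which is why stricter error rates require more patients.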
Delta (δ) or the smallest effect of interest
The smallest effect of interest is the minimal difference between the studied groups that the investigator wishes to detect and is often referred to as the minimal clinically relevant difference. This should be a difference that the investigator believes to be clinically relevant and biologically plausible. For example, for continuous outcome variables, the minimal clinically relevant difference is a numerical difference. For instance, if cholesterol level is the outcome of a trial, an investigator could choose a difference of 20 mg/dl in the cholesterol level as the minimal clinically relevant difference. In a trial with a binary outcome, for example the effect of a drug on the development of a myocardial infarction (yes/no), an investigator should estimate a relevant difference between the event rates in both treatment groups and could choose, for instance, a difference of 15% between the treatment group and the control group as minimal clinically relevant difference. Even a small change in the expected difference with treatment has a major effect on the estimated sample size, as the sample size is inversely proportional to the square of the difference.
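The inverse-square relationship stated above can be illustrated with a small sketch. It assumes that everything else (α, power, and variability) is held fixed, so only the ratio of the two differences matters; the function name and the alternative differences are hypothetical.

```python
def sample_size_multiplier(new_delta: float, reference_delta: float) -> float:
    """Factor by which the required sample size changes when the smallest
    effect of interest moves from reference_delta to new_delta, all other
    inputs being fixed (sample size is proportional to 1 / delta**2)."""
    if new_delta <= 0 or reference_delta <= 0:
        raise ValueError("differences must be positive")
    return (reference_delta / new_delta) ** 2


if __name__ == "__main__":
    reference = 20.0  # mg/dl, the cholesterol difference used in the text
    for delta in (25.0, 20.0, 15.0, 10.0):  # hypothetical alternatives
        factor = sample_size_multiplier(delta, reference)
        print(f"delta = {delta:4.0f} mg/dl -> sample size x {factor:.2f}")
```

Halving the difference to be detected, for instance from 20 to 10 mg/dl, quadruples the required sample size.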
The variability
Finally, the sample size calculation is based on using the population variance of a given outcome variable that is estimated by means of the standard deviation (SD) in the case of a continuous outcome. Because the variance is usually an unknown quantity, researchers often use an estimate obtained from a pilot study or use information from a previously published study.
Calculating Sample Size for a Randomized Controlled Trial When the Outcome Variable is Continuous or Quantitative   
Randomized controlled trials are generally the study designs in which the researcher wishes to investigate the effect of a new drug on a particular disease condition as compared to the standard treatment available for that disease. For example, if the researcher wishes to know the effect of a new intervention on lowering the cholesterol level in patients with dyslipidemia as compared to the effect of the standard drug available for it, then he/she will plan a study with two groups of patients: one group will be given the new drug, while the other will be prescribed the standard drug. After giving these drugs for a fixed time period, the cholesterol levels of both groups will be measured, and the mean cholesterol levels of the two groups will be compared to see whether the difference is significant. The procedure for calculation of sample size in clinical trials/intervention studies involving two groups is described in this article. ^{[3],[4]} If the outcome variable is a quantitative or continuous variable such as cholesterol level, blood pressure, or hemoglobin, then formula 1 can be used for calculating the sample size per group:
n = 2 (Z_{α/2} + Z_{β})^{2} σ^{2} / d^{2}   (1)
In this formula:
Z_{α/2} is the value for the α level taken from the Z-tables for a two-tailed test; α is normally set at 5%, hence Z_{α/2} = Z_{0.05/2} = Z_{0.025} = 1.96 (from the Z-tables).
Z_{β} is the value corresponding to the power of the test, which is generally set at 80% (β = 0.20); its value from the Z-tables is 0.8416.
σ^{2} is the variance, that is, the square of the SD (σ) taken from previously published studies.
The difference (d) is the minimal difference between the means of the two groups that the investigators wish to detect.
For example, consider a randomized controlled trial testing the effect of a new cholesterol-lowering drug as compared to the standard drug. The researcher assumes that if the new drug lowers the cholesterol level by 20 mg/dl more than the standard drug, then the difference would be considered clinically relevant; that is, the new drug is expected to lower the cholesterol level by 50 mg/dl and the standard drug to lower it by 30 mg/dl over a period of time. He assumes the SD to be 45 mg/dl based on the results of previously published studies. If the researcher selects a 5% level of significance and 80% power, then substituting all these values in the above formula gives:
n = 2 × (1.96 + 0.8416)^{2} × 45^{2} / 20^{2} ≈ 79.5
That is, a total of 80 patients per group have to be enrolled in the clinical trial.
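The worked example above can be checked numerically. The sketch below assumes that formula 1 has the standard two-group form n = 2(Z_{α/2} + Z_{β})^{2}σ^{2}/d^{2}, which reproduces the stated 80 patients per group; the function and parameter names are illustrative and do not come from the article.

```python
import math
from statistics import NormalDist


def n_per_group_means(sd: float, diff: float,
                      alpha: float = 0.05, power: float = 0.80) -> int:
    """Patients per group for comparing two means (two-tailed test).

    Assumed form: n = 2 * (Z_{alpha/2} + Z_{beta})**2 * sd**2 / diff**2,
    rounded up to the next whole patient.
    """
    if sd <= 0 or diff == 0:
        raise ValueError("sd must be positive and diff must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.8416 for 80% power
    n = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / diff ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Cholesterol example from the text: SD 45 mg/dl, difference 20 mg/dl.
    print(n_per_group_means(sd=45, diff=20))
```

The result is rounded up rather than to the nearest integer, so that the achieved power does not fall below the target.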
Calculating Sample Size for a Randomized Controlled Trial When the Outcome Variable is Proportion or Qualitative   
If the primary outcome of the clinical study is a qualitative variable with a binary outcome such as yes/no, alive/dead, or diseased/nondiseased, then formula 2 must be used for calculating the sample size per group for a two-group study:
n = 2 (Z_{α/2} + Z_{β})^{2} p̄ (1 − p̄) / (p_{1} − p_{2})^{2}   (2)
Here again, Z_{α/2} is the value for the α level taken from the Z-tables for a two-tailed test; α is normally set at 5%, hence Z_{α/2} = Z_{0.05/2} = Z_{0.025} = 1.96 (from the Z-tables), and
Z_{β} is the value corresponding to the power of the test, which is generally set at 80% (β = 0.20); its value from the Z-tables is 0.8416.
p_{1} is the prevalence in Group I and p_{2} is the prevalence in Group II.
p̄ is the pooled prevalence rate in both the groups, i.e., p̄ = (p_{1} + p_{2})/2.
For example, suppose an investigator knows from the previously published literature that 35% of patients die due to myocardial infarction when prescribed the current standard treatments. He assumes that the new drug will reduce this to 20% and hence considers a difference of 15% in the event rates of the two groups to be clinically meaningful. Here p_{1} = 35%, p_{2} = 20%, and the pooled prevalence p̄ = (0.20 + 0.35)/2 = 0.275. At the conventional 5% level of significance and 80% power, the sample size would be:
n = 2 × (1.96 + 0.8416)^{2} × 0.275 × (1 − 0.275) / (0.35 − 0.20)^{2} ≈ 139.1
Therefore, 140 patients per group have to be recruited in the trial.
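The same check can be made for the binary outcome. The sketch below assumes that formula 2 is the pooled-prevalence form n = 2(Z_{α/2} + Z_{β})^{2} p̄(1 − p̄)/(p_{1} − p_{2})^{2}, because that form, rounded up, reproduces the 140 patients per group stated above; other published formulas for two proportions give slightly different numbers. The names used are illustrative.

```python
import math
from statistics import NormalDist


def n_per_group_proportions(p1: float, p2: float,
                            alpha: float = 0.05, power: float = 0.80) -> int:
    """Patients per group for comparing two proportions (two-tailed test).

    Assumed form: n = 2 * (Z_{alpha/2} + Z_{beta})**2 * p_bar * (1 - p_bar)
                      / (p1 - p2)**2, with p_bar = (p1 + p2) / 2.
    """
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("p1 and p2 must lie strictly between 0 and 1 and differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2  # pooled prevalence of the two groups
    n = 2 * (z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) / (p1 - p2) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Mortality example from the text: 35% with standard care, 20% with the new drug.
    print(n_per_group_proportions(p1=0.35, p2=0.20))
```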
Discussion   
The sample size calculation is an essential part of designing a study protocol. The justification for choosing a particular sample size should be given properly in research papers and reports. This article describes only some of the methods of sample size calculation for some common research designs. It may be noted that sample size calculations are different for different study designs such as case-control studies, cross-sectional studies, and cohort studies. It is recommended that clinicians do the sample size calculation in consultation with statisticians and thoroughly review the prerequisites of the calculation, such as the clinically meaningful difference and the outcome variable of interest, which require expert medical knowledge.
Sample size calculations can also be done using various software packages such as nQuery, nMaster, and PASS, but these packages usually require a paid license. However, the calculation can also be performed using online calculators available on the internet. Links to one such calculator are:
For continuous outcome:
http://www.powerandsamplesize.com/Calculators/Compare2Means/2SampleEquality.
For proportions:
http://www.powerandsamplesize.com/Calculators/Compare2Proportions/2SampleEquality.
Another calculator with a worked example [Figure 1]:
http://www.selectstatistics.co.uk/samplesize calculatortwomeans.
Suppose clinicians wish to compare the effectiveness of two treatments using arterial pressure and to detect a difference of at least 14 mmHg between the two groups, the standard deviation in each group being 20 mmHg (a variance of 400 mmHg^{2}). To detect a difference of this magnitude that is significant with 95% confidence and a power of 80%, the clinicians will require 33 patients in each group [Figure 1].
The formula used is:
n = 2 (Z_{α/2} + Z_{β})^{2} σ^{2} / d^{2}
where Z_{α/2} is the critical value of the normal distribution at α/2 (e.g., for a confidence level of 95%, α is 0.05, and the critical value is 1.96), Z_{β} is the critical value of the normal distribution at β (e.g., for a power of 80%, β is 0.2, and the critical value is 0.84), σ^{2} is the population variance, and d is the difference you would like to detect.
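A sample size can also be checked from the other direction, by asking what power a given number of patients per group would achieve. The sketch below uses the usual normal approximation for a two-tailed comparison of two means, ignoring the negligible chance of rejecting in the wrong direction; this approximation and the function name are assumptions made for illustration and are not taken from the article or from the online calculator.

```python
import math
from statistics import NormalDist


def approx_power_two_means(n_per_group: int, sd: float, diff: float,
                           alpha: float = 0.05) -> float:
    """Approximate power of a two-tailed, two-group comparison of means.

    power ~ Phi(|diff| / (sd * sqrt(2 / n)) - Z_{alpha/2})
    """
    if n_per_group <= 0 or sd <= 0:
        raise ValueError("n_per_group and sd must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sd * math.sqrt(2 / n_per_group)  # SE of the difference in means
    return NormalDist().cdf(abs(diff) / standard_error - z_alpha)


if __name__ == "__main__":
    # Arterial pressure example from the text: SD 20 mmHg, difference 14 mmHg.
    for n in (32, 33):
        power = approx_power_two_means(n, sd=20, diff=14)
        print(f"n = {n} per group -> approximate power = {power:.4f}")
```

Under this approximation, 32 patients per group fall just short of 80% power and 33 reach it, which is consistent with the 33 patients per group quoted for the example.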
Conclusion   
Studies must be adequately powered to achieve their objectives, and appropriate sample size calculations should be carried out at the designing stage. Estimation of the expected size of effect is based on existing evidence and clinical expertise. It is important that any estimates be large enough to be clinically important while also remaining plausible. One must remember that an underpowered study will not be valid. Remember: "absence of evidence is not evidence of absence."
Financial support and sponsorship
Nil.
Conflicts of interest
There are no conflicts of interest.
References   
1.  Altman DG. Statistics and ethics in medical research: III How large a sample? Br Med J 1980;281:1336-8.
2.  Altman DG. Practical Statistics for Medical Research. London, UK: Chapman and Hall; 1991.
3.  Florey CD. Sample size for beginners. BMJ 1993;306:1181-4.
4.  Wittes J. Sample size calculations for randomized controlled trials. Epidemiol Rev 2002;24:39-53.
[Figure 1]
