

CURRICULUM IN CARDIOLOGY: STATISTICAL PAGES 

Year : 2016  Volume : 2  Issue : 3  Page : 187-189

The basics of Kaplan–Meier estimate
Aakshi Kalra
Institute of Economic Growth, Delhi University, New Delhi, India
Date of Web Publication  2-Mar-2017
Correspondence Address: Aakshi Kalra, Institute of Economic Growth, Delhi University, New Delhi, India
Source of Support: None, Conflict of Interest: None
DOI: 10.4103/2395-5414.201381
How to cite this article: Kalra A. The basics of Kaplan–Meier estimate. J Pract Cardiovasc Sci 2016;2:187-9
The success of an intervention in any clinical or community-based study is measured over a period of time by the number of participants who remain alive or free of adverse events, including death. In practice, however, not all participants remain in the study until its completion; some drop out, become unavailable, or are lost to follow-up. In such cases, the Kaplan–Meier estimate (also called the “product-limit method”) serves as a simple, reliable way to draw conclusions about the survival of the participants. As the name suggests, it was developed by Kaplan and Meier in 1958.^{[1]}
The Kaplan–Meier method is nonparametric and is typically used to estimate the survival distribution, that is, the fraction of participants who survive for a specified period after the intervention or treatment. It allows survival to be estimated over time even when participants drop out or are followed for different lengths of time. Each time a participant experiences the event (for example, dies), the estimated proportion surviving decreases. The curve therefore steps down at each event and is flat in between, which gives it the appearance of a staircase.
Like other statistical methods, this estimate rests on certain assumptions. The Kaplan–Meier estimate is given by the following equation:
$$S(t_{i}) = \prod_{j=1}^{i} \left(1 - \frac{d_{j}}{n_{j}}\right)$$
Here, S (t_{i}) is the estimated probability of surviving beyond time t_{i}, d_{j} is the number of patients who died at time t_{j}, and n_{j} is the number of patients at risk just before t_{j}.
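The product-limit calculation can be sketched in a few lines of Python. This is a minimal illustration, not part of the original article: the function name `product_limit` and the counts passed to it are hypothetical, and the inputs are assumed to be the number of deaths d_{j} and the number at risk n_{j} at each successive event time.

```python
from itertools import accumulate
from operator import mul


def product_limit(deaths, at_risk):
    """Return the cumulative survival after each event time.

    deaths[j]  : number of deaths d_j at the j-th event time
    at_risk[j] : number of patients n_j at risk just before that time
    """
    if len(deaths) != len(at_risk):
        raise ValueError("deaths and at_risk must have the same length")
    # Conditional probability of surviving each event time: 1 - d_j / n_j
    factors = [1 - d / n for d, n in zip(deaths, at_risk)]
    # The running product of these factors is the survival estimate
    return list(accumulate(factors, mul))


# Hypothetical counts at three event times (illustration only)
survival = product_limit(deaths=[1, 2, 1], at_risk=[10, 8, 5])
for s in survival:
    print(round(s, 3))
```

Each factor is a conditional probability of surviving one event time, so the running product is the probability of having survived all event times up to that point.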
Concept behind Kaplan–Meier Test   
Before going further, it is important to understand the concept of censoring. Participants who are lost to follow-up, who drop out of the study, or who have not died or experienced the outcome of interest by the time the study ends are treated as censored cases. It is important to remember that the results may be biased if dropout is related to both the outcome and the treatment.
For each interval, the survival probability is calculated as the number of patients surviving divided by the number of patients at risk. Participants who have been censored are no longer counted as at risk, so the denominator does not include them. Follow-up for each participant is recorded using dates and times. The probability of surviving to any point is estimated from the cumulative probability of surviving each of the preceding time intervals, that is, it is calculated as the product of the preceding probabilities.^{[2]} Although the probability calculated for any single interval is not very accurate because of the small number of events, the overall probability of surviving to each point is more accurate. The method is based on estimating conditional probabilities at each time point at which an event occurs. Plotting confidence intervals can be useful in visualizing the differences between survival curves. The mathematical computations involved are beyond the scope of this article.
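The way censoring affects the denominator can be sketched as follows. This is an illustrative sketch under stated assumptions, not the article's own procedure: the function name `km_table` and the follow-up data are hypothetical, and when a death and a censoring share the same time, the censored participant is assumed to still be at risk at that time (a common convention).

```python
def km_table(durations, observed):
    """Return rows (time, at_risk, deaths, survival) for each event time.

    durations : follow-up time for each participant
    observed  : True if the event (e.g. death) occurred, False if censored
    """
    data = sorted(zip(durations, observed))
    n_at_risk = len(data)
    survival = 1.0
    rows = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            leaving += 1
            i += 1
        if deaths:
            # Conditional probability of surviving time t
            survival *= 1 - deaths / n_at_risk
            rows.append((t, n_at_risk, deaths, survival))
        # Deaths and censored participants both leave the risk set
        n_at_risk -= leaving
    return rows


# Hypothetical follow-up times in months (illustration only)
durations = [2, 3, 3, 5, 6, 6]
observed = [True, False, True, True, False, False]
rows = km_table(durations, observed)
for time, at_risk, deaths, surv in rows:
    print(time, at_risk, deaths, round(surv, 3))
```

Censored participants never produce a step in the curve; they only shrink the number at risk used for later event times.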
Interpretation of Kaplan–Meier Curves   
The lengths of the horizontal lines along the X-axis of serial times represent the survival duration for that interval. The vertical axis represents the estimated probability of survival. The precision of the estimates depends on the number of observations. Survival estimates can be unreliable toward the end of a study, when only a small number of subjects remain at risk of having an event.
Nonetheless, it is important to note that this test does not control for covariates and requires categorical predictors. It also cannot accommodate time-dependent variables.^{[3],[4],[5],[6],[7]}
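The point that precision falls as fewer participants remain at risk can be made concrete with a standard error. The sketch below uses Greenwood's formula, a standard variance estimate for the Kaplan–Meier curve that this article does not itself describe. The function name `greenwood_se` and the counts are hypothetical.

```python
import math


def greenwood_se(rows):
    """Return (survival, standard_error) after each event time.

    rows : list of (at_risk, deaths) pairs at successive event times
    """
    survival = 1.0
    var_sum = 0.0
    results = []
    for n, d in rows:
        survival *= 1 - d / n
        if n > d:
            # Greenwood: Var[S(t)] = S(t)^2 * sum of d_j / (n_j * (n_j - d_j))
            var_sum += d / (n * (n - d))
            se = survival * math.sqrt(var_sum)
        else:
            # Everyone at risk died; the formula is undefined here
            se = float("nan")
        results.append((survival, se))
    return results


# Hypothetical counts: the number at risk shrinks over time
results = greenwood_se([(100, 5), (50, 5), (10, 2)])
for surv, se in results:
    print(round(surv, 3), round(se, 3))
```

With these illustrative counts, the standard error grows at each later event time as the number at risk falls, which is why the tail of a survival curve should be read with caution.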
Understanding Through Examples   
The description is best illustrated by an example. Consider a hypothetical longitudinal study of 10 end-stage heart failure patients who were followed up for 6 months to identify how many survived the 1^{st} month, the 2^{nd} month, and so forth. Ideally, data for all ten patients would be available at the end of the study, but this may not hold true in practice. The estimated survival probabilities may differ at each follow-up depending on the number of participants retained in the study at that point.
As shown in [Figure 1], of the 10 participants, one died 2 months after the start of the study and another 2 participants dropped out at the end of 4 months. Applying the concept of censored cases, the probability of surviving the 1^{st} month is 10/10 = 100%, but the fraction surviving beyond 2 months is 9/10. Similar calculations can be done for each month, but these would be cumulative.
Many free websites provide calculators for the Kaplan–Meier estimate. One of them is VassarStats, a website for statistical computation (http://vassarstats.net).^{[8]} To work through the same example in VassarStats, go to the “Clinical Research Calculators” section of the website and click on “Kaplan–Meier Survival Probability Estimates.” A prompt regarding the time period then appears on the screen. Fill in the time intervals/endpoints according to the study, followed by the data for the participants. This will provide calculated survival probabilities with confidence intervals, as shown in [Figure 2].
Figure 2: Kaplan–Meier estimate example results calculated on VassarStats website calculator.
Source: Table template from VassarStats website.
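The ten-patient example can also be followed month by month in a short script. This is a sketch under stated assumptions: the figure data are not reproduced in the text, so the script uses only the events described in words (one death at month 2, two dropouts at the end of month 4) and assumes the remaining seven patients were still alive when the study ended at 6 months. The variable names are hypothetical.

```python
# Events described in the example: month -> count recorded in that month
deaths_by_month = {2: 1}     # one patient died at month 2
dropouts_by_month = {4: 2}   # two patients dropped out at the end of month 4

at_risk = 10      # all ten patients are at risk at the start
survival = 1.0    # cumulative probability of survival
curve = {}

for month in range(1, 7):
    d = deaths_by_month.get(month, 0)
    c = dropouts_by_month.get(month, 0)
    # Conditional probability of surviving this month, given at risk
    survival *= 1 - d / at_risk
    curve[month] = survival
    # Deaths and dropouts (censored cases) leave the risk set
    at_risk -= d + c

for month, s in curve.items():
    print(month, round(s, 2))
```

Under these assumptions the dropouts at month 4 do not lower the curve. They only reduce the number at risk, so any later death would cause a larger step down.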
Another calculator is “MedCalc,” which has a free trial version (https://www.medcalc.org/manual/kaplanmeier.php).^{[9]} Consider the same example (arbitrary data) but with two patient groups, 1 and 2. [Figure 3] shows the data of 10 participants distributed across the 2 comparison groups. The data are entered with the patient group in column 1, the survival time in column 2, and the censoring status in the last column (0 for censored cases, 1 for those who reached the endpoint). The study period was 6 months. After entering the data, go to survival analysis and select the Kaplan–Meier test. A prompt then asks for columns to be assigned to survival time, endpoint, and factor. This will produce a survival curve along with other details [Figure 4]. The detailed process is given on the MedCalc website.
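The three-column layout described above (group, survival time, and status coded 0 for censored and 1 for endpoint reached) can be processed group by group as sketched below. This is not MedCalc's implementation and does not use the data from the figure, which are not reproduced in the text. The rows and the function names `km_curve` and `km_by_group` are hypothetical.

```python
from collections import defaultdict


def km_curve(records):
    """records: list of (time, status); status 1 = endpoint, 0 = censored.

    Returns a list of (time, survival) at each time with an event.
    """
    at_risk = len(records)
    survival = 1.0
    curve = []
    for t in sorted({time for time, _ in records}):
        deaths = sum(1 for time, status in records if time == t and status == 1)
        leaving = sum(1 for time, _ in records if time == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored cases leave the risk set
        at_risk -= leaving
    return curve


def km_by_group(rows):
    """rows: list of (group, time, status) tuples, one per participant."""
    groups = defaultdict(list)
    for group, time, status in rows:
        groups[group].append((time, status))
    return {group: km_curve(recs) for group, recs in groups.items()}


# Hypothetical data: 10 participants in 2 groups, times in months
rows = [
    (1, 2, 1), (1, 4, 0), (1, 6, 0), (1, 6, 0), (1, 6, 0),
    (2, 1, 1), (2, 3, 1), (2, 4, 0), (2, 5, 1), (2, 6, 0),
]
curves = km_by_group(rows)
for group, curve in curves.items():
    print(group, [(t, round(s, 2)) for t, s in curve])
```

Each group gets its own curve, and plotting the two on the same axes is the graphical comparison that such calculators produce.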
Conclusion   
The Kaplan–Meier estimate is a descriptive, nonparametric estimate of the survival function that takes all observations, both failures and censored cases, into account. It is commonly used to describe the survivorship of a study population and is frequently used to compare two study populations through graphical presentation.
Financial support and sponsorship
Nil.
Conflicts of interest
There are no conflicts of interest.
References   
1.  Kaplan EL, Meier P. Nonparametric estimation from incomplete observations. J Am Stat Assoc 1958;53:457. 
2.  Rich JT, Neely JG, Paniello RC, Voelker CC, Nussenbaum B, Wang EW. A practical guide to understanding Kaplan-Meier curves. Otolaryngol Head Neck Surg 2010;143:331-6. 
3.  Sedgwick P, Joekes K. Kaplan-Meier survival curves: Interpretation and communication of risk. BMJ 2013;347:f7118. 
4.  Goel MK, Khanna P, Kishore J. Understanding survival analysis: Kaplan-Meier estimate. Int J Ayurveda Res 2010;1:274-8. 
5.  
6.  
7.  Altman DG. Practical Statistics for Medical Research. Boca Raton, Florida: Chapman and Hall/CRC; 1999. p. 611. 
8.  Lowry R. Clinical Research Calculators. VassarStats: Website for Statistical Computation. Available from: http://www.vassarstats.net/. [Last accessed on 2016 Nov 28]. 
9.  
[Figure 1], [Figure 2], [Figure 3], [Figure 4]
