CURRICULUM IN CARDIOLOGY: STATISTICAL PAGES
Year: 2016 | Volume: 2 | Issue: 3 | Page: 187-189

The basics of Kaplan–Meier estimate

Aakshi Kalra
Institute of Economic Growth, Delhi University, New Delhi, India


How to cite this article: Kalra A. The basics of Kaplan–Meier estimate. J Pract Cardiovasc Sci 2016;2:187-189

How to cite this URL: Kalra A. The basics of Kaplan–Meier estimate. J Pract Cardiovasc Sci [serial online] 2016 [cited 2021 Sep 16];2:187-189. Available from: https://www.j-pcs.org/text.asp?2016/2/3/187/201381


The success of an intervention in any clinical or community-based study is measured over a period by the number of participants who remain alive or free of adverse events, including death. In practice, however, not all participants remain in the study until its completion; some drop out, become unavailable, or are lost to follow-up. In such cases, the Kaplan–Meier estimate (also called the “product-limit method”) serves as a simple, reliable way to draw conclusions about the survival of the participants. As the name suggests, it was developed by Kaplan and Meier in 1958.[1]

The Kaplan–Meier estimate is nonparametric and is typically used to estimate the survival distribution, that is, to compute the fraction of participants who survive for a specified period after the intervention or treatment. It allows survival to be estimated over time even when participants drop out or are followed up for different lengths of time. Each time a participant experiences the event (for example, death), the estimated proportion surviving decreases. The curve therefore steps down at each event and is flat in between, which gives it the appearance of a staircase. Participants who are censored do not produce a step of their own; they reduce the number at risk at later time points.

Like other statistical methods, this estimate rests on certain assumptions. The equation that gives the Kaplan–Meier estimate is as follows.

\[ S(t_i) = \prod_{j:\, t_j \le t_i} \left( 1 - \frac{d_j}{n_j} \right) \]

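For each time point t_i, S(t_i) is the estimated probability of surviving beyond t_i. The estimate is built from every event time t_j up to and including t_i, where d_j is the number of patients who died at t_j and n_j is the number of patients at risk just before t_j.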
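The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind any calculator mentioned in this article: the function name `kaplan_meier`, the input format (one follow-up time and one 0/1 status per participant), and the five-participant data set are all assumptions made for the example. Participants censored at the same time as a death are counted as at risk at that time, which is the usual convention.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, n_at_risk, deaths, survival)] for each distinct event time.

    times  -- follow-up time of each participant
    events -- 1 if the participant died at that time, 0 if censored
    (hypothetical input format chosen for this sketch)
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)      # everyone who leaves the risk set at time t
    n_at_risk = len(times)
    survival = 1.0
    rows = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - d / n_at_risk
            rows.append((t, n_at_risk, d, survival))
        # Deaths and censored participants both leave the risk set after t.
        n_at_risk -= leaving[t]
    return rows

# Hypothetical data: five participants, times in months.
for row in kaplan_meier([2, 3, 3, 5, 6], [1, 0, 1, 1, 0]):
    print(row)
```

Only times at which a death occurs produce a row, because the product changes only there; censored participants simply shrink the number at risk for later times.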

Concept behind Kaplan–Meier Test

Before going further, it is important to understand the concept of censoring. Participants who are lost to follow-up, who drop out of the study, or who have not died or had the outcome of interest by the time the study ends are treated as censored cases. It is important to remember that the results may be biased if dropout is related to both the outcome and the treatment.

For each interval, the survival probability is calculated as the number of patients surviving divided by the number of patients at risk. Participants who have already been censored are not included in the denominator. Participant data are recorded as dates and times. The probability of surviving to any point is estimated from the cumulative probability of surviving each of the preceding time intervals, that is, it is calculated as the product of the preceding probabilities.[2] Although the probability calculated for any single interval is not very accurate because of the small number of events, the overall probability of surviving to each point is more accurate. The method is based on estimating conditional probabilities at each time point at which an event occurs. Plotting confidence intervals can be useful in visualizing the differences between survival curves. The mathematical computations involved are beyond the scope of this article.

Interpretation of Kaplan–Meier Curves

The length of each horizontal line along the X-axis of serial times represents the survival duration for that interval. The vertical axis represents the estimated probability of survival. The precision of the estimates depends on the number of observations, so survival estimates can be unreliable toward the end of a study, when few subjects remain at risk of having an event.
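To illustrate why precision depends on the number of subjects at risk, the sketch below uses Greenwood's formula, a commonly used variance estimate for the Kaplan–Meier curve. This is an assumption made for illustration: the article does not say which interval method it has in mind, and the calculators it mentions may use a different one. The function name `greenwood_intervals` and its input format are hypothetical.

```python
import math

def greenwood_intervals(risk_table, z=1.96):
    """Approximate confidence intervals for a Kaplan-Meier curve.

    risk_table -- [(n_at_risk, deaths)] at successive event times
                  (hypothetical input format chosen for this sketch)
    Returns [(survival, lower, upper)], with limits clipped to [0, 1].
    """
    survival, var_sum, out = 1.0, 0.0, []
    for n, d in risk_table:
        survival *= 1 - d / n
        if n > d:
            # Greenwood: Var[S(t)] ~ S(t)^2 * sum of d_j / (n_j * (n_j - d_j))
            var_sum += d / (n * (n - d))
        se = survival * math.sqrt(var_sum)
        lower = max(0.0, survival - z * se)
        upper = min(1.0, survival + z * se)
        out.append((survival, lower, upper))
    return out

# The same 10% death rate with 10 versus 100 subjects at risk (made-up numbers).
print(greenwood_intervals([(10, 1)]))
print(greenwood_intervals([(100, 10)]))
```

Both calls give an estimated survival of 0.9, but the interval based on 10 subjects is much wider than the one based on 100, which is the same effect that makes the tail of a survival curve unreliable.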

Nonetheless, it is important to note that the Kaplan–Meier method does not control for covariates and requires categorical predictors. It also cannot accommodate time-dependent variables.[3],[4],[5],[6],[7]

Understanding Through Examples

The method is best illustrated by an example. Consider a hypothetical longitudinal study of 10 end-stage heart failure patients who were followed up for 6 months to identify how many survived for 1 month, 2 months, and so forth. Ideally, data on all 10 patients would be available at the end of the study, but this may not hold true in practice. The estimated survival probabilities may differ at each follow-up depending on the number of participants retained in the study at that point.

As shown in [Figure 1], of the 10 participants, one died 2 months after the start of the study and another 2 dropped out at the end of 4 months. Treating the dropouts as censored cases, the probability of surviving the 1st month is 10/10 = 100%, but the fraction surviving beyond 2 months is 9/10. Similar calculations can be done for each month, and the resulting probabilities are cumulative.

Many free online calculators are available for the Kaplan–Meier estimate. One of them is VassarStats, a website for statistical computation (http://vassarstats.net).[8] To run the example above in VassarStats, go to the “Clinical Research Calculators” section of the website and click on “Kaplan–Meier Survival Probability Estimates.” A prompt regarding the time period then appears on the screen. Fill in the time intervals/endpoints according to the study, followed by the participant data. This provides the calculated survival probabilities with confidence intervals, as shown in [Figure 2].
{Figure 1}{Figure 2}

Source: Table template from VassarStats website.
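The hand calculation for the 10-patient example can also be reproduced with a short script. This is a sketch under stated assumptions: the death at month 2 and the two dropouts at month 4 come from the text, while treating the remaining seven patients as censored at month 6 (the end of the study) is an assumption, and the variable names are illustrative only.

```python
# Each patient is (time in months, status): 1 = died, 0 = censored.
# One death at month 2 and two dropouts at month 4 come from the text;
# censoring the other seven at month 6 (study end) is an assumption.
patients = [(2, 1), (4, 0), (4, 0)] + [(6, 0)] * 7

survival = 1.0
curve = {}
for month in range(1, 7):
    # At risk: everyone still under observation at the start of this month's step.
    at_risk = sum(1 for t, _ in patients if t >= month)
    deaths = sum(1 for t, status in patients if t == month and status == 1)
    # Cumulative survival is the running product of monthly survival fractions.
    survival *= 1 - deaths / at_risk
    curve[month] = survival
    print(f"month {month}: at risk {at_risk}, deaths {deaths}, S = {survival:.2f}")
```

Under these assumptions the curve drops from 1.0 to 0.9 at month 2 and then stays flat. The two dropouts at month 4 do not lower it; they only reduce the number at risk from 9 to 7, so any later death would produce a larger step.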

Another calculator, MedCalc, is available with a free trial version (https://www.medcalc.org/manual/kaplan-meier.php).[9] Consider the same example (arbitrary data) but with two patient groups, 1 and 2. [Figure 3] shows the data of 10 participants distributed across the 2 comparison groups. The data are entered with the patient group in column 1, the survival time in column 2, and the censoring indicator in the last column (0 for censored cases, 1 for those who reached the endpoint). The study period was 6 months. After entering the data, go to survival analysis and select the Kaplan–Meier option under it. A prompt asks for the columns to be assigned to survival time, endpoint, and factor. This produces a survival curve along with other details [Figure 4]. The detailed procedure is given on the MedCalc website.
{Figure 3}{Figure 4}
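The three-column layout described above (group, survival time, endpoint indicator) can be mimicked in plain Python to see what a two-group analysis computes. This is only a sketch: the rows below are made up for illustration and are not the Figure 3 data, and the helper `survival_curve` is a hypothetical name, not part of MedCalc.

```python
from collections import defaultdict

# Hypothetical rows: (group, survival time in months, status).
# Status follows the coding in the text: 1 = reached the endpoint, 0 = censored.
# These values are invented for illustration and are NOT the Figure 3 data.
rows = [
    (1, 2, 1), (1, 4, 0), (1, 5, 1), (1, 6, 0), (1, 6, 0),
    (2, 1, 1), (2, 3, 1), (2, 4, 1), (2, 6, 0), (2, 6, 0),
]

def survival_curve(records):
    """Product-limit estimate as [(time, survival)] at each event time."""
    survival, curve = 1.0, []
    event_times = sorted({time for time, status in records if status == 1})
    for t in event_times:
        at_risk = sum(1 for time, _ in records if time >= t)
        deaths = sum(1 for time, status in records if time == t and status == 1)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve

# Split the rows by the grouping column ("factor"), then estimate each curve.
by_group = defaultdict(list)
for group, time, status in rows:
    by_group[group].append((time, status))

curves = {group: survival_curve(records) for group, records in by_group.items()}
for group, curve in sorted(curves.items()):
    print(group, [(t, round(s, 2)) for t, s in curve])
```

Each group gets its own staircase curve, computed from that group's patients alone; plotting the two curves on the same axes gives the kind of graphical comparison the calculator produces.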

Conclusion

The Kaplan–Meier estimate is a descriptive, nonparametric estimate of the survival function that takes all observations, both failures and censored cases, into consideration. It is commonly used to describe the survivorship of a study population and is frequently used to compare two study populations through graphical presentation.