REVIEWS Year : 2019  Volume : 5  Issue : 3  Page : 136-141 Application of Bayesian Analysis in Medical Diagnosis Vivek Verma^{1}, Ashwani Kumar Mishra^{2}, Rajiv Narang^{3}, ^{1} Department of Neurology, All India Institute of Medical Sciences, New Delhi, India ^{2} National Drug Dependence Treatment Centre, All India Institute of Medical Sciences, New Delhi, India ^{3} Department of Cardiology, All India Institute of Medical Sciences, New Delhi, India In this work, we outline the application of the Bayesian technique for integrating the results of multiple tests while treating any disease. We provide an overview of the fundamental concept of Bayesian analysis in making an inference about a phenomenon using a concordance between available information and prior knowledge. An attempt is made to demonstrate its applicability through core aspects, viz., the nature of probability, parameters, and the inferential procedure used to draw inferences about a population characteristic under both paradigms. We also sketch the underlying steps and assumptions of the Bayes theorem and the analytical techniques that can be used to analyze and interpret medical data while treating any disease.
Introduction

In medical treatment, clinicians and nurses very often have to make complex and critical decisions during the diagnosis of patients. In reality, these decisions are full of uncertainty and unpredictability. However, based on the available information, obtained from various clinical and diagnostic tests and the patient's condition, both clinicians and nurses try to reduce the uncertainty in their clinical decisions and to improve the predictability of the chance of improvement in the patient's condition. Here, the best estimates are obtained using the available information on clinical and diagnostic tests, and the patient's condition acts as the light of evidence while making a decision. For clinicians, it is of particular concern that they are capable of understanding and interpreting the predictive probability of particular outcomes based on the recommended clinical and diagnostic tests. The scenario can be visualized through Example 1, where a researcher is interested in determining the association of smoking as a risk factor for the development of myocardial infarction [Table 1].

Example 1

Myocardial infarction is one of the leading causes of death worldwide from coronary artery disease (CAD). Existing literature has shown that the risk of myocardial infarction is higher among smokers than among nonsmokers. To determine the association of smoking as a risk factor for the development of myocardial infarction, a medical researcher conducted a case–control study on 100 patients, where 50 patients with myocardial infarction within 1 year of onset and 50 controls were recruited from a tertiary care hospital [Table 1]. Suppose p1 is the proportion affected by myocardial infarction in the control group and p2 is the corresponding proportion in the exposed group.
He hypothesized, under the null, that the occurrence of myocardial infarction is independent of smoking habit, against the one-sided alternative that myocardial infarction is higher among smokers than nonsmokers, i.e., H0: p1 = p2 against H1: p2 > p1. In the control group, 0 myocardial infarction cases are found; in the exposed group, there are three. As P = 0.1212, based on the one-sided Fisher's exact test, is greater than the 0.05 significance level, we do not reject the null hypothesis that smoking habit is independent of myocardial infarction. Clinical decisions are based on probability theory because it acts as a key element required to establish a process of reasoning.[1] The challenge for clinicians and nurses lies in estimating the probability underlying the process and its related quantification, by relating the proposition obtained from the available information to the values of probabilities.[2] In medical diagnosis, probability is defined as the "degree of belief" or plausibility that supports a given proposition as the decision. Decisions are made using prior experience and the support of available information in terms of various diagnostic and medical records. The situation can be understood practically using Example 2, an extension of Example 1.

Example 2 (continuation of Example 1)

A cardiologist may claim that three myocardial infarction cases in 50 patients are definitely significant from the cardiological perspective. This belief may be based on information from another study in the same department in the previous year, in which myocardial infarction never showed up in control patients.
If there were 400 historical controls in addition to the 50 concurrent controls, none with myocardial infarction, then the contingency table takes the following form [Table 2]. Now, the one-sided P value = 0.0009, based on the one-sided Fisher's exact test, is less than the 0.05 significance level, so we reject the null hypothesis that smoking habit is independent of myocardial infarction, which makes the statistical and cardiological importance more compatible. Comparing Example 2 with Example 1, we have shown that a decision based on the frequentist principle may change from acceptance to rejection, whereas incorporating existing information along with the newly obtained information helps achieve a more realistic conclusion. In the frequentist paradigm, it is generally inappropriate[3],[4] to pool historical and concurrent information and to quantify one's prior experience and learning in light of the available information. In frequentist statistics, the probability of occurrence of an event is defined in the long run of the experiment (i.e., the experiment is repeated under the same conditions to obtain the outcome).

Example 3

Let us consider that a coin is tossed and our interest lies in estimating the fairness of the coin, as presented in [Table 3]. To check the unbiased behavior of the coin, the experiment is theoretically repeated an infinite number of times but is practically done with a stopping intention. As the probability of getting a head on tossing a fair coin is 0.5, the expected number of heads is 0.5 times the number of tosses. Here, the difference denotes 0.5 × (number of tosses) − (number of heads obtained) [Table 3].
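As a minimal sketch (using SciPy, which the article itself does not reference), the one-sided Fisher's exact tests of Examples 1 and 2 can be reproduced directly. The 2 × 2 tables below use rows (control, exposed) and columns (myocardial infarction, no myocardial infarction):

```python
from scipy.stats import fisher_exact

# Example 1: 50 concurrent controls (0 MI) vs. 50 exposed (3 MI)
_, p1 = fisher_exact([[0, 50], [3, 47]], alternative="less")

# Example 2: 450 controls (50 concurrent + 400 historical, 0 MI) vs. 50 exposed (3 MI)
_, p2 = fisher_exact([[0, 450], [3, 47]], alternative="less")

print(round(p1, 4))  # 0.1212, as in Example 1
print(round(p2, 4))  # 0.0009, as in Example 2
```

With `alternative="less"`, the test asks how probable it is, under independence, to see zero (or fewer) infarctions among the controls, matching the one-sided alternative stated in the text.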
An important thing to note is that, though the difference between the actual number of heads and the expected number of heads (50% of the number of tosses) increases as the number of tosses increases, the proportion of the number of heads to the total number of tosses approaches 0.5 (for a fair coin). However, based on the frequentist principle, it is not possible to estimate or update the unbiasedness of the coin using limited tosses, and theoretically repeating the experiment an infinite number of times is practically impossible. To incorporate prior experience with the available information, statisticians sought alternative procedures that update knowledge based on the available evidence so that decisions become more effective and certain. The Bayesian method was an outcome of such an effort. This procedure was introduced by Thomas Bayes and Laplace in the 18th century and developed by statisticians and philosophers in the 20th century. Bayesian approaches are considered the mathematical manipulation of uncertainty, as they provide a way of reasoning coherently about the world around us in the face of uncertainty,[5] where probabilities quantify the plausibility or truth of a proposition conditional on the available or gathered information.[1] The probability of an event can be interpreted in two ways: first, the direct way (frequentist), where we know the previous circumstances and require the probability of an event, and second, the inverse way (Bayesian), where we know the event that has happened and require the probability that it resulted from a particular set of circumstances under which it might have happened.

Example 4

The probability of a certain medical test being positive is 90% if a patient has the disease. 1% of the population has the disease, and the test records a false positive 5% of the time.
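The Bayesian alternative to the coin experiment can be sketched in a few lines: a Beta prior on the head probability is updated batch by batch, so the estimate improves with every finite set of tosses instead of requiring infinite repetition. The toss counts below are purely hypothetical, chosen only for illustration:

```python
def update_beta(a, b, heads, tails):
    """Conjugate update: a Beta(a, b) prior plus binomial coin-toss data
    yields a Beta(a + heads, b + tails) posterior."""
    return a + heads, b + tails

a, b = 1, 1  # uniform Beta(1, 1) prior: no initial preference about fairness
a, b = update_beta(a, b, heads=7, tails=13)    # first 20 tosses (hypothetical)
a, b = update_beta(a, b, heads=55, tails=45)   # next 100 tosses (hypothetical)

posterior_mean = a / (a + b)  # point estimate of P(head)
print(a, b, round(posterior_mean, 3))
```

Each update uses only the tosses seen so far, so the belief about the coin's fairness is revised after any finite amount of data, which is exactly what the frequentist long-run definition cannot offer.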
If we receive a positive test, what is our probability of having the disease? According to the frequentist way of defining probability, the probability of having the disease is 1%, as either the person has the disease or not. On the other hand, the Bayesian suggests that there is a prior probability of 1% that the person has the disease. This probability should be updated in light of the new data using the Bayes theorem, so we can obtain the probability that a person whose test was positive also has the disease, i.e., P(Disease | Test+), as

P(Disease | Test+) = P(Test+ | Disease) P(Disease) / [P(Test+ | Disease) P(Disease) + P(Test+ | No disease) P(No disease)]
= (0.90 × 0.01) / (0.90 × 0.01 + 0.05 × 0.99) ≈ 0.15.

The result suggests that the probability that the person has the disease, given that the test was found to be positive, is about 15%. The Bayes theorem has been used widely to draw inferences in various clinical trials[3],[6] and in healthcare evaluation.[7] Bayes' theorem has also been used for outcome analyses and assessments such as cardiac arrest,[8] stress testing,[9] estimation of cardiac biventricular volumes,[10] perioperative cardiac risk assessment,[11] percutaneous coronary intervention strategies in cardiogenic shock,[12] and evaluation of the noninferiority of transcatheter aortic valve replacement as compared with surgical valve replacement,[13] and it has applications in stress electrocardiography, thallium scintigraphy, technetium blood-pool scintigraphy, and cardiac fluoroscopy.[14] The present work outlines the methodological aspects of the Bayesian technique and compares it with the classical or frequentist approach. Section 2 presents introductory background on the definition of probability based on the Bayes theorem. The differences between the Bayesian and frequentist approaches are presented in Section 3, which discusses how the Bayesian and frequentist methods define probability, estimation procedures, and performance evaluation.
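The arithmetic of Example 4 can be checked directly; the only inputs are the prevalence, sensitivity, and false-positive rate stated in the example:

```python
prior = 0.01   # P(Disease): 1% of the population has the disease
sens = 0.90    # P(Test+ | Disease): test is positive 90% of the time if diseased
fpr = 0.05     # P(Test+ | No disease): 5% false-positive rate

# Total probability of a positive test, then Bayes theorem
p_test_pos = sens * prior + fpr * (1 - prior)   # P(Test+)
posterior = sens * prior / p_test_pos           # P(Disease | Test+)

print(round(posterior, 4))  # 0.1538, i.e., about 15%
```

Despite the 90% sensitivity, the low 1% prevalence keeps the posterior probability near 15%, which is the point of the example.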
Fundamentals Of Bayesian Methodology

There is no fundamental distinction between unobserved events and the unknown parameter of interest in the Bayesian paradigm; both are defined using the available quantities and the information required to describe the probability of occurrence of that event. Here, the information available before observing the data is called the prior distribution, and the update in this information after observing the data is called the posterior distribution. The distributional assumptions that lead to inferences in the Bayesian procedure are as follows:[15],[16]

1. A specific parametric form is essential to describe the distribution of the observations given the parameter value, which is commonly known as a likelihood function.
2. Since the parameter is assumed to be an unknown quantity rather than being fixed as in classical inference, the prior distribution of the parameter of interest, unconditional on the data, must be given.

The essential steps usually followed to draw inferences about the unknown parameter under Bayesian thinking are:

1. Specify a probability model, which contains some prior information about the unknown parameter.
2. Update the information about the unknown parameter by conditioning the probability model on the observed data. Here, the data values are assumed to be exchangeable, i.e., the model will not change if the data values are reordered. This exchangeability assumption assures that the data generation process, conditional on the unknown model parameter, is the same for every generated data value.
3. Evaluate the fit of the model to the available or observed data and the sensitivity of the conclusions to the assumptions about the model.

Methodologically, Bayesian modeling and Bayesian data analysis are approaches for incorporating prior information and utilizing the observed data to draw inferences about the parameters.
Here, the data are summarized as a likelihood function, which describes the strength of support the observations provide for the various possible values of the parameter. In the Bayesian paradigm, the prior information about the unknown parameter of the statistical model plays a crucial role, and it needs to be identified and expressed in the form of a prior distribution. Generally, prior distributions are classified as either informative priors, where the prior information is genuine and provides the best explanation of its strength and relation to the observed data, or noninformative priors, also known as improper or weak priors, where the credibility of the prior information is lacking and it is defined based on belief before obtaining the new data. The information from the prior and the data are then synthesized to produce the posterior distribution, which expresses the enhancement in knowledge about the parameter after obtaining the data. Bayesian methodology can be formulated mathematically as follows: if D denotes the data and θ some model parameter, then

P(θ | D) = P(D | θ) P(θ) / P(D),     (1)

where P(θ) is the prior probability of θ before observing any information about D, P(D | θ) is the likelihood function, which denotes the probability of observing D conditioned on θ, and P(θ | D) is the posterior probability of θ after observing D.

Methodological Differences between the Frequentist and Bayesian Paradigms

One of the main objectives of medical research is to infer a population phenomenon based on the information available in terms of some data from it. A specific branch of statistics, called inferential statistics, deals with exploring the behavior of, and pattern of changes in, population characteristic(s) using the available or obtained sample, in addition to classical summaries.
In inferential statistics, we expand our horizon beyond classical statistics with various techniques for estimating parameters and assessing their closeness to, and changes with, the population value. There are two different philosophical approaches to inferential statistics, viz., classical or frequentist inference and Bayesian inference. For clarity about the Bayesian approach, we first need to understand the basic differences between Bayesian and frequentist inference. This section addresses the nature of probability, parameters, and inference under the two approaches. In the Bayesian paradigm, the parameter of interest, θ, is assumed to be an unknown random quantity rather than fixed, as in the frequentist methodology. The core difference between the Bayesian and frequentist methodologies lies in how each estimates the parameter of interest, say θ. In the present section, we highlight some important aspects, viz., the nature of probability, parameters, and the inferential procedure used to draw inferences about the population characteristic under both the Bayesian and frequentist paradigms based on the obtained samples. The core differences are as follows:

Interpretation of probability

The frequentist defines probability as the number of times an event (D) is observed when an infinite series of trials is performed under identical situations and conditions, usually symbolized as P(D | θ). However, in the Bayesian paradigm,[2] probability is defined as the degree of belief of the observer before or after the data are observed. The definition of probability as stated in the Bayes theorem of equation (1) is subjective and not the same as the objective (frequentist) definition, "the relative frequency of occurrence of the event in a large number of repeated trials." The frequentist definition raises some interesting epistemological issues regarding probability theory, such as how one can define the probability density function of a parameter on the basis of a single data set.
These questions can easily be addressed using the Bayesian interpretation of probability. The basic difference between the frequentist and Bayesian methodologies is that the frequentist defines probability as the frequency of repeated occurrence of an event (perhaps hypothetical), whereas for Bayesians, probability is the degree of certainty of the occurrence of an event. Broadly speaking, in the frequentist model, parameters are fixed quantities and the data are random; on the other hand, in the Bayesian model, the parameter(s) is a random quantity and the data are considered fixed.

Example 5

To detect CAD (D), the cardiac fluoroscopy (T) test is usually opted for, where the major coronary arteries (left anterior descending, circumflex, and right coronary artery) are reviewed in routine fluoroscopy of the heart to trace radiodense deposits[14] in the heart. Under the Bayesian paradigm, the available information, obtained in terms of sensitivity, P(T+ | D+), specificity, P(T− | D−), and prior information (pretest likelihood) such as the chance of occurrence of CAD, P(D+), is utilized to calculate the posttest likelihood of the occurrence of CAD as

P(D+ | T+) = P(T+ | D+) P(D+) / [P(T+ | D+) P(D+) + (1 − P(T− | D−)) (1 − P(D+))].

The posttest likelihood, or posterior probability, provides more information than the pretest likelihood and the available information alone.

Fixed and variable quantities

In the frequentist methodology, the obtained observations are considered an independent and identically distributed random sample from a population, where the population parameters are unknown but fixed by nature. On the other hand, in Bayesian analysis, the observed data are considered a fixed quantity or realization of the population characteristic. Here, the parameters are unknown but random quantities and are described distributionally.
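The posttest likelihood formula of Example 5 can be wrapped in a small helper. The fluoroscopy sensitivity, specificity, and pretest values in the demonstration below are hypothetical, chosen only for illustration (the article does not report them); the second call re-uses the numbers of Example 4 as a cross-check:

```python
def post_test_probability(sensitivity, specificity, pretest):
    """Posttest likelihood of disease given a positive test (Bayes theorem):
    P(D+|T+) = sens*P(D+) / [sens*P(D+) + (1 - spec)*(1 - P(D+))]."""
    true_pos = sensitivity * pretest
    false_pos = (1 - specificity) * (1 - pretest)
    return true_pos / (true_pos + false_pos)

# Hypothetical values for a fluoroscopy-style CAD test
print(round(post_test_probability(0.85, 0.90, 0.30), 4))

# Example 4's numbers (sens 0.90, spec 0.95, prevalence 0.01) give about 0.15
print(round(post_test_probability(0.90, 0.95, 0.01), 4))
```

Note that a 5% false-positive rate is the same as 95% specificity, which is why the second call reproduces Example 4's result.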
Summary of the estimators

In the frequentist approach, point estimates and their standard errors, along with the 95% confidence interval, describe not only the characteristics of the estimator but also the interval that covers the true parameter value 95% of the time, on average. In Bayesian analysis, inference about the parameter is usually obtained through the posterior means and quantiles. The highest posterior density interval indicates the region of highest posterior probability for the parameter. The Bayesian credible interval provides a probabilistic bounded region for the parameter value, whereas the frequentist confidence interval describes a probability about the bounds given a fixed parameter value.

Example 6

To estimate the incidence of CAD in people without any heart disease at baseline, among (a) snorers with obstructive sleep apnea and (b) snorers without obstructive sleep apnea, a study[17] followed a total of 308 middle-aged individuals (245 males and 63 females) over a period of 7 years. Let us consider that the occurrence of any CAD in each group is a Binomial random variable and that the incidence rate is assumed to follow a Beta(a, b) distribution; for the present study, a = 4 and b = 22. The impact of different prior choices on the posterior distribution (beta-binomial) of the proportion of people who suffered from CAD, for the given data, is presented in [Table 4] and [Figure 1].

Quality checking

In the frequentist paradigm, the comparison between groups is generally carried out by defining the Type I error (the known significance level) and Type II error, the size of the effect, and/or the power of the test procedure.
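Because the Beta prior is conjugate to the Binomial likelihood, the posterior in Example 6 is again a Beta distribution, obtained by adding the observed successes and failures to the prior parameters. The event count `k` below is hypothetical (the actual counts are in [Table 4], which is not reproduced here):

```python
# Beta(4, 22) prior for the CAD incidence rate, as in Example 6
a_prior, b_prior = 4, 22

# Hypothetical binomial data: k CAD events among n followed-up subjects
n, k = 308, 20

# Conjugate update: Beta(a + k, b + n - k)
a_post = a_prior + k
b_post = b_prior + n - k

prior_mean = a_prior / (a_prior + b_prior)
posterior_mean = a_post / (a_post + b_post)
print(round(prior_mean, 4), round(posterior_mean, 4))
```

With these hypothetical counts, the posterior mean is pulled from the prior mean of about 0.154 toward the sample proportion 20/308, showing how the data dominate a modest prior as n grows.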
The final inference is based on the P value.[18] In the case of the Bayesian method,[19] the analytical procedure allows the formal incorporation of prior information. Here, the comparison between groups is generally done by comparing the posterior predictive distributions,[20],[21] the posterior sensitivity is assessed with respect to the prior forms, and the final inferences are made using Bayes' factors.[6],[22],[23] The Bayes' factor (BF01) is the ratio of the posterior odds to the prior odds. The rule of thumb[23] for inference is as follows: if log10(BF01) lies between 0 and 0.5, the evidence against the null hypothesis H0 is poor; if it lies between 0.5 and 1, it is substantial; if it lies between 1 and 2, it is strong; and if it is above 2, it is decisive. To demonstrate the practical significance and applicability of the Bayesian method, we discuss the example given below of the estrogen and progesterone receptor status of tumors assessed in 20 breast cancer patients.

Example 7

Suppose the estrogen and progesterone receptor status of the tumor is assessed in 20 patients with locally advanced breast cancer. Here, we test whether the estrogen and progesterone receptor statuses of the tumor are independent [Table 5]. As P = 0.05402, based on the Chi-square test, is greater than the 0.05 significance level, we do not reject the null hypothesis that the statuses of the progesterone and estrogen receptors are independent among breast cancer patients. Let E be the event that the tumor is estrogen receptor-positive and F be the event that it is progesterone receptor-positive.
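The rule of thumb for log10(BF01) stated above can be encoded directly; applying it to the value 1.68 computed later in Example 7 recovers the "strong" verdict:

```python
def evidence_strength(log10_bf):
    """Interpret the base-10 logarithm of a Bayes factor using the
    rule of thumb quoted in the text."""
    if log10_bf < 0.5:
        return "poor"
    elif log10_bf < 1:
        return "substantial"
    elif log10_bf < 2:
        return "strong"
    return "decisive"

print(evidence_strength(1.68))  # strong
```

Unlike a P value compared against a single 0.05 cutoff, this scale grades the evidence continuously from poor to decisive.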
Then, [INLINE:7] [INLINE:8] [INLINE:9] The conditional probability that a tumor is progesterone receptor-positive, given that it is estrogen receptor-positive, is [INLINE:10]. However, the conditional probability that a tumor is progesterone receptor-positive, given that it is estrogen receptor-negative, is [INLINE:11]. The ratio of the posterior odds to the prior odds in favor of the estrogen receptor event, E, [INSIDE:1], known as the Bayes factor in favor of event E, is [INLINE:12] [INLINE:13]. The value of the logarithm of the Bayes factor (1.68) suggests strong evidence of an impact of the estrogen receptor on the progesterone receptor among patients with breast cancer.

In this introductory article, an attempt is made to describe the functionality of the Bayesian technique, which can be used for analyzing a wide variety of data for which prior information is available, and whose incorporation minimizes uncertainty in making effective clinical decisions. Here, we described how the Bayes' theorem updates the previously available belief and information into a posterior probability density function by conditioning on the available observations. Such is the efficacy and strength of Bayesian techniques, which provide simple rules and a sophisticated analytical procedure for the analysis of medical data. In addition, Bayesian techniques provide rules for scientific reasoning, for testing the underlying hypothesis about the parameter in relation to the observed data, and for quantifying belief in terms of probability.

Acknowledgment

We would like to thank the anonymous referees and Prof. Dr. Sandeep Seth, Department of Cardiology, All India Institute of Medical Sciences, for their constructive comments and suggestions to improve the quality of this manuscript.

Authors' contributions

All authors have made equal and substantial contributions to the work reported in the manuscript.

Financial support and sponsorship

Nil.

Conflicts of interest

There are no conflicts of interest.

References


