Clinical practice benefits from research to inform good decision making. Evidence-based medicine (EBM) helps physicians integrate experience and individual expertise with the best evidence. Various philosophical concepts, including “primum non nocere,” are balanced to achieve this. The tools of EBM, such as number needed to treat, are easy to calculate and to use. Other valuable tools include number needed to harm, attributable risk, and likelihood of being helped or harmed. It is also important to distinguish between relative risk and absolute risk to avoid drawing the wrong conclusions. With the right teaching techniques to grab attention and encourage active participation, real examples can be used to impart practical skills that the clinician can employ in translating research findings into something that helps the individual patient.
Which came first, the doctor or the researcher? Why did the first early hominid give a root extract to another early human? Was it based on previous observations or just a hunch that it would help the ailment? If based on observations, were they made carefully, or were they biased by unnoticed effects of the other substances that the sick person consumed at the same time?
The point is that research and clinical practice have always been siblings and have always needed each other. Sometimes they disagree, and sometimes one believes itself to be nobler than the other, but they are inextricably linked and always will be.
Clinical practice requires research to inform good medical decision making. Casual observations may have been the basis of early medicine, and are still important for individual patient care, but more than observations are needed. Perhaps, for a particular patient, 1 drug works but another does not, although both drugs have been studied in clinical trials and both have been proven more effective than placebo. Clinicians need to be able to understand more about what helps make the best choices for individual patients.
I had the good fortune to work in full-time clinical practice as a psychiatrist for many years before transitioning into the world of clinical research. My interest in research was piqued by observations I made during routine clinical care. I noticed that more patients seemed to have a better response to 1 drug than another, and this started to influence the way I prescribed. When I started to make more systematic observations, my confidence grew, and more of my patients had good responses.
When I first heard of evidence-based medicine (EBM), I scoffed at the idea, and I often proclaimed the paradox, “What is the evidence that evidence-based medicine is better than experience-based medicine?” Looking back, I now realize that I was practicing EBM even as I questioned its validity. However, I was not using the best tools or techniques. I also suspect I did not buy into the overall concept because of the way it was taught to me. The process was thoroughly dull and uninteresting.
Later in my career, I started to teach medical students and other physicians, and found ways to make learning more interesting. I found that the key to knowledge retention is to make learning easy and engaging, and to give opportunities to practice what has been taught. So, the purpose of this article is to give my perspectives about how to teach some key aspects of the process and tools of EBM.
Defining Evidence-Based Medicine: The What, Why, and How
So what is EBM? According to David Sackett, the father of EBM, it is about integrating individual clinical expertise with the best external evidence.
However, that is only part of it. Because clinical trials use idealized patients, the results are not always generalizable to patients in the real world. It takes human clinical expertise to be able to assess and integrate all of the evidence and apply it to the particular patient in the office today.
Why should we use EBM? In short, it is a wise way to make better clinical decisions. This does not mean that we are making poor decisions now. They may be adequate most of the time, but sometimes they may be incorrect because they are based on misinformation. Decisions ought to be informed by the best available data. Hence, EBM's mission is to inform clinical practice with actual patients.
Using evidence in medicine is very much like using evidence in court. The evidence is presented, reviewed, evaluated with a healthy dose of skepticism, and used if it is satisfactory in quality. Multiple pieces of evidence may be used to build up the same case. There may be cross examination by different parties, challenging certain aspects of the evidence. In the end, a judgment is made, and a clinical decision is executed, unique to the circumstances of the particular case at hand.
There are many kinds of evidence. Typically, the quality of medical evidence has been thought of as a hierarchy, with the randomized controlled trial (RCT) on the top of the pyramid. There are several published variations of this pyramid, and sometimes the RCT is in second place with meta-analysis at the apex (Figure 1). However, it is critically important in the true practice of EBM that we not forget to incorporate the evidence of what is already known for the particular patient we are treating at this moment, as well as their preferences. For example, suppose clinical trial evidence suggests the best treatment for patients with disease A, severity B, who have no other disease, are on no other concomitant medications, and are between the ages of 18 and 65 years, is drug X. But what do you do if your patient has disease A, severity C, has already failed an adequate trial of drug X, is on concomitant medications, and is 75 years old? This is precisely the kind of situation where we must integrate individual experience with some kind of extrapolation based on the clinical trial evidence. Furthermore, the “N of 1” study may be of greater and more direct clinical relevance to your patient than the clinical trial.
Fortunately, EBM allows common sense to prevail. Parachutes have been widely accepted as a preventative measure against the possibility of death for people who like to jump out of airplanes. However, this has never been subjected to an RCT.
It is generally accepted that observational data are sufficient to establish the effectiveness of the intervention over the control state (jumping with no parachute), although the authors still suggest, perhaps tongue-in-cheek, that an RCT is needed.
Of course, EBM is not merely the strict application of guidelines to force the same treatment algorithm on all patients, nor is it cookbook medicine.
It does not ignore the history or the preferences of the patient. It is not just an expression of study results as odds ratios with 95% CIs, or number needed to treat (NNT) or number needed to harm (NNH). These are common misconceptions. Such tools of EBM are only used in the service of figuring out the best treatment for the patient.
Medical Philosophy and Evidence-Based Medicine
Where does philosophy enter into this discussion? Like science in general, EBM is driven, on the one hand, by skepticism (being skeptical of the evidence), and on the other by the desire to help others (the moral concept of doing good). As physicians trained in the scientific model, we have to question the data we see and go through the process of ruling out various hypotheses to get closer to the truth. Of course, this also means that we can never be sure of the truth, which in this case is the best treatment. We can only work with probabilities and approximations. We could also say that we are trying to strike a balance between 2 philosophies: utilitarianism (doing the greatest good for the greatest number of people) versus deontology (patient-centered care).
Another relevant philosophy here is “primum non nocere” (Above All, Do No Harm), attributed to Hippocrates. We presume that by not rushing to change practice based on the latest study, and by maintaining current treatment standards, we are avoiding harming the patient. However, the opposite is also possible: by exercising skepticism and waiting for more evidence rather than implementing effective treatments, we can harm our patients.
The tools that EBM uses, which involve comparisons of numbers and statistics, may be intimidating for some psychiatrists. Understanding such concepts is quite different from talking to patients and doing psychotherapy. Fortunately, most of the concepts are simpler than they look. The key to teaching EBM is to keep it simple and make it fun. Like anything else worth learning, there are various levels of depth that can be taught, but it is best to save the details for after the student knows enough to be intrigued and wants to learn more.
We have already established that there is a hierarchy of evidence to consider before making a clinical decision. One of the best forms of evidence is the RCT, which maximizes objectivity and minimizes bias through strict adherence to prespecified conditions. Suppose treatment A was shown to be superior to treatment B in an RCT for the outcome of time to all-cause treatment discontinuation. This is a common outcome in present day psychiatric studies and is essentially a proxy for treatment effectiveness. When all things are considered, including adverse events and efficacy, the longer a patient is willing to continue taking the treatment, the more likely the treatment is to have maximal opportunity to provide the desired benefit. As stated by former United States Surgeon General C. Everett Koop, “Drugs don't work in people who don't take them.”
A basic concept in clinical trials is probability. Because of many uncontrollable variables, such as the participation of different patients, no two studies can be expected to deliver identical results. Thus, we have to deal with probabilities of detecting a real difference between treatments if one exists, and of not finding a difference in the study if there really is no difference. By convention, we usually accept a detected difference as real if a difference at least that large would be expected less than 5% of the time by chance alone when the treatments truly do not differ. This corresponds to a “P value” of less than 0.05, written as (P < 0.05). The smaller the P value, the less plausible it is that chance alone explains the result. However, because the P value addresses only this statistical uncertainty, it does not tell us anything about the clinical importance of the difference. For that, we must consider effect size.
In the same example, let us suppose that treatment A was statistically superior to treatment B, having a P value of < 0.001. That sounds pretty impressive, but what matters more than the statistical significance is the clinical significance, or effect size. It is easy to get a tiny P value by studying large numbers of patients, but that does not guarantee a large effect size or impact on patients. A large effect size is desirable because it means many patients benefitted from that treatment choice compared with the alternatives tested.
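The contrast between statistical and clinical significance can be illustrated with a short Python sketch. The trial counts below are entirely hypothetical and are not taken from any study discussed here. The sketch assumes a pooled two-proportion z-test for the P value, which is one common choice rather than the method of any particular trial, and it expresses effect size as the reciprocal of the difference in event rates.

```python
import math
from fractions import Fraction


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided P value from a pooled two-proportion z-test (an assumed, common choice)."""
    rate_a = events_a / n_a
    rate_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


def nnt_from_counts(control_events, n_control, treated_events, n_treated):
    """Reciprocal of (control event rate - treated event rate), rounded up.

    Assumes the event is undesirable and is less frequent in the treated group.
    """
    risk_difference = Fraction(control_events, n_control) - Fraction(treated_events, n_treated)
    return math.ceil(1 / risk_difference)


# Hypothetical very large trial: all-cause discontinuation in each arm.
control_discontinued, n_control = 51_000, 100_000
treated_discontinued, n_treated = 50_000, 100_000

p_value = two_proportion_p_value(control_discontinued, n_control,
                                 treated_discontinued, n_treated)
nnt = nnt_from_counts(control_discontinued, n_control,
                      treated_discontinued, n_treated)
print(f"P = {p_value:.2g}, NNT = {nnt}")
```

With these made-up counts, the P value is far below 0.001, yet 100 patients would have to receive the treatment for 1 additional patient to benefit: a statistically convincing result with a modest effect size.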
There are a number of different ways to measure effect size, and one of the easier ways to understand is NNT. The NNT calculation answers the question, “How many people do we have to treat with treatment A instead of treatment B in order to expect 1 more case of the specified desired outcome?” Most clinical epidemiologists consider NNT to be the least misleading and most clinically useful measure of treatment effectiveness.
For the NNT measurement, small numbers are good because it takes fewer patients to provide a beneficial difference.
Is Everyone Paying Attention? The Value of Shock
When teaching a new concept, it is first necessary to make sure the audience is paying attention. This is easily accomplished by some catchy photos, cartoons, or headlines. A favorite headline is “Medicine's Secret Stat”, from Time magazine.
If Time thinks NNT is important, then hopefully the audience will, too. The tabloids figured out long ago that we can often catch the reader's attention with words like “secret,” or with a photo such as the one shown in Figure 2.
I find it helpful to start by talking about risk, because risk-related concepts are a strong focus of the media these days, and most of us can relate to them in one way or another. Risk-related images work for the media, and they also work to engage an audience. They are also appropriate in a presentation about EBM, which is all about rationally evaluating benefit and risk. I like to show an image of an irrational risk (Figure 3 and Table) in conjunction with framing a more rational assessment of risk using NNT and NNH.
Table. Framing a rational assessment of risk: in 2001, the number of people who died in the Americas due to various causes.
Next, I outline why the concept is important, explain what the terminology means, and show how NNT and NNH are calculated, using layman's language as much as possible.
I have also found it helpful to use real-world examples that are meaningful to the audience. First, I present the formula and explain the calculation. Then, I hand out copies of a study publication and ask the audience to make their own calculations. For this purpose, it is often helpful to set up the session as a workshop, so students know there will be some expectation of hands-on learning, as opposed to passive listening in a didactic format. (This expectation seems to help the audience pay better attention.) Calculations are available on a separate slide that remains hidden and are only revealed after the audience has had time to work them out on their own.
Here is an actual example of what is meant by NNT, from a publication about antipsychotic treatment of schizophrenia: the NNT for relapse prevention in schizophrenia, for haloperidol decanoate versus placebo is 2. This means it requires only 2 patients taking haloperidol decanoate (the experimental treatment) to prevent one more relapse (over the 48-week study) than would have been expected if placebo (the control treatment) had been used.
Generally in psychiatry, single-digit NNTs tell us that the difference is clinically important, and the smaller the number, the better. In other fields, however, larger NNTs such as 50 may still be clinically important; for example, if the outcome of interest is prevention of death.
Theoretically, the smallest possible NNT is 1, meaning the experimental treatment helps everyone, and the control treatment helps nobody. However, in practice, there are always going to be exceptions and special cases where the treatment does not work, which will give you an NNT slightly >1, which gets rounded up to 2. By convention, NNT and NNH are normally rounded up to the next greater magnitude integer, because you cannot treat a fraction of a patient. If we apply these principles to the example from the previously noted study about parachutes by Smith and Pell,
Effect size also comes into the understanding of estimating harmful outcomes, by using the concept of NNH. This answers the question, “How many people do we have to treat with treatment A instead of treatment B in order to expect 1 more case of the specified harmful outcome?” The NNT and NNH are often presented together because it is hard to conceive of a benefit that does not have any associated risk of harm. What is different, however, is that in the case of NNH, a larger number is better: the larger the number, the more patients would be needed to be treated before an additional harmful outcome is encountered.
How the outcome is defined is important because it determines whether you will describe an NNT or an NNH. For example, hospitalization is typically considered an adverse event in a clinical trial, but avoidance of hospitalization is considered a beneficial outcome. An NNH would be calculated for hospitalization, but an NNT could just as easily be calculated for avoidance of hospitalization.
The NNT and NNH calculations start with attributable risk (AR). The AR is simply the event rate, or frequency of the outcome, in the control group (CER) minus the event rate in the experimental group (EER):

AR = CER − EER
Then, NNT or NNH is the reciprocal of the AR. Whether we are talking about an NNT or an NNH depends on the outcome chosen, as previously mentioned. In perusing the literature, you might come across the terms absolute risk reduction (ARR) and absolute risk increase (ARI). The ARR is another term for AR when calculating NNT, and ARI is another term for AR when calculating NNH:

NNT = 1/ARR
NNH = 1/ARI
When the NNT gives you a negative number for a result, it is actually an NNH, and when the NNH is a negative number, it is actually an NNT. Although normally the nature of the outcome determines whether we are describing an NNT or an NNH, sometimes these values are reported in tables for convenience, for example, when there are several comparators. Occasionally then, a table that is titled NNT may contain a negative number that is actually an NNH, in which the experimental treatment is actually worse than the control treatment.
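The calculation and the sign convention described above can be sketched in a few lines of Python. This is a minimal illustration, not a validated tool: the function names are hypothetical, the relapse rates are invented for the example, and the outcome is assumed to be an undesirable event, so that a positive AR corresponds to an NNT and a negative AR to an NNH. Rounding goes away from zero, following the convention of rounding to the next greater magnitude integer.

```python
import math
from fractions import Fraction


def attributable_risk(cer, eer):
    """AR = control event rate (CER) minus experimental event rate (EER)."""
    return cer - eer


def number_needed(ar):
    """Reciprocal of AR, rounded away from zero to a whole number of patients."""
    if ar == 0:
        return math.inf  # no difference between treatments
    reciprocal = 1 / ar
    return math.ceil(reciprocal) if reciprocal > 0 else math.floor(reciprocal)


def label_for_undesirable_outcome(cer, eer):
    """Report an NNT when the experimental arm has fewer events, otherwise an NNH."""
    n = number_needed(attributable_risk(cer, eer))
    if n == math.inf:
        return "no difference (NNT is infinite)"
    return f"NNT {n}" if n > 0 else f"NNH {-n}"


# Hypothetical relapse rates, expressed as exact fractions to avoid float rounding.
placebo_relapse = Fraction(70, 100)
drug_relapse = Fraction(15, 100)

print(label_for_undesirable_outcome(placebo_relapse, drug_relapse))  # drug is better
print(label_for_undesirable_outcome(drug_relapse, placebo_relapse))  # arms swapped
```

Swapping the two arms produces the same magnitude with the opposite sign, which is why a negative entry in a table of NNTs should be read as an NNH.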
This brings us to the distinction between relative risk and absolute risk. Outcomes are often reported as relative risk reductions or relative risk increases. However, for these purposes, relative risk is of less interest than absolute risk, because it is less relevant to the patient. There may be a large magnitude difference in relative risk, which translates to an almost meaningless difference to patients.
It is most effective to illustrate these principles in the context of practical applications, as in the following example: the risk of developing breast cancer in the next 5 years for women ages 55 to 65 years, who do not take aspirin (ASA), is 20 of 1000 or 2%.
This would be absolute risk. In other words, women of ages 55 to 65, who do not take ASA, have a 98% chance of remaining free of breast cancer over the next 5 years.
However, if these women take daily ASA, the risk declines by 20% to 16 of 1000. This represents a relative risk reduction of 20%, which now results in an absolute risk of 1.6%. This means that women who take ASA daily have a 98.4% chance of being cancer free over 5 years. You could also look at this in terms of overall relative risk, which would be 1.6% divided by 2%, or 0.8. A relative risk of 0.8, which is less than 1, indicates that the risk in the group taking ASA is less than that of the group not taking ASA. Therefore, ASA has a protective effect against developing breast cancer over the next 5 years, which sounds fairly significant.
However, when we look at the ARR, which is calculated by taking the absolute risk of 2% in the non-ASA group minus the absolute risk of 1.6% in the ASA group, we get 0.4%, which does not sound nearly as significant. So women of ages 55 to 65 years who take ASA decrease their absolute risk of developing breast cancer in the next 5 years by only 0.4 percentage points. This is not a big change at all. This small potential benefit would need to be balanced against the small potential for adverse events associated with long-term ASA therapy.
We can translate this into an NNT by using the formula 1/AR, where AR is attributable risk: 1/0.004 = 250. Thus, you would expect to treat 250 women with ASA before you could prevent 1 case of breast cancer that might have occurred over 5 years with no ASA. EBM would suggest that whether this compels the use of ASA is a matter of individual value.
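The arithmetic of the ASA example can be retraced step by step in Python. The sketch below uses only the figures given in the text (20 of 1000 and 16 of 1000); the variable names are illustrative, and exact fractions are used so that the reciprocal comes out as exactly 250 rather than a floating-point approximation.

```python
import math
from fractions import Fraction

# Five-year breast cancer risk from the example in the text.
risk_without_asa = Fraction(20, 1000)  # absolute risk, no ASA (2%)
risk_with_asa = Fraction(16, 1000)     # absolute risk, daily ASA (1.6%)

relative_risk = risk_with_asa / risk_without_asa            # 0.8
relative_risk_reduction = 1 - relative_risk                 # 20%
absolute_risk_reduction = risk_without_asa - risk_with_asa  # 0.4 percentage points
nnt = math.ceil(1 / absolute_risk_reduction)                # 1/0.004

print(f"Relative risk:           {float(relative_risk):.1f}")
print(f"Relative risk reduction: {float(relative_risk_reduction):.0%}")
print(f"Absolute risk reduction: {float(absolute_risk_reduction):.1%}")
print(f"NNT:                     {nnt}")
```

The same pair of event rates yields a 20% relative risk reduction and a 0.4% absolute risk reduction, which is the contrast the example is meant to bring out.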
The NNT and NNH should always be presented with 95% CIs. For example, the NNT to prevent 1 all-cause discontinuation by using clozapine instead of quetiapine, following discontinuation of another drug, is 3 (95% CI 2–6). Loosely speaking, this means we can be 95% confident that the true NNT lies between 2 and 6. Stated more precisely, if we repeated the experiment 100 times, the process by which the CI is created would be expected to capture the true parameter about 95 times. The narrower this CI, the better, because the estimate is more precise. As the CI widens, we have more uncertainty.
The 95% CI of an NNT and of an AR are not quite the same, and this is a bit tricky to explain. It is easier to understand that when the 95% CI of an AR includes 0, it is nonsignificant.
So, if 0 is included in the 95% CI of the AR, it converts to 1/0, or infinity, in the NNT. Thus, a nonsignificant NNT will appear to be discontinuous. We can tell at a glance if an NNT is nonsignificant because the 95% CI will contain both positive and negative numbers. Remember that when there is no treatment effect, the AR is 0, and the NNT is infinite.
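The relationship between the two intervals can be sketched as follows. This Python fragment assumes that the 95% CI of the AR has already been obtained by some method (none is prescribed here) and simply takes reciprocals of its limits; the function name and the numeric limits are hypothetical. The limits are left unrounded, because the text does not state a rounding convention for CI bounds.

```python
def nnt_interval(ar_lower, ar_upper):
    """Convert the CI of an attributable risk into the CI of the NNT.

    Returns None when the AR interval includes 0: the NNT interval then
    passes through infinity, is discontinuous, and the result is nonsignificant.
    """
    if ar_lower > ar_upper:
        raise ValueError("ar_lower must not exceed ar_upper")
    if ar_lower <= 0 <= ar_upper:
        return None
    # Taking reciprocals reverses the order of the limits.
    return (1 / ar_upper, 1 / ar_lower)


# Hypothetical AR intervals, expressed as proportions.
print(nnt_interval(0.25, 0.5))    # significant: NNT between 2 and 4
print(nnt_interval(-0.05, 0.15))  # includes 0: nonsignificant
```

Returning None for the second case is a design choice for this sketch; a report would typically describe such an interval as running from an NNT, through infinity, to an NNH.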
This is a convenient way of figuring out the balance between benefit and risk. Although it does not allow us to compare all potential benefits to all risks simultaneously, it does help us compare the likelihood of encountering a particular benefit with the likelihood of a particular risk. It is especially useful to compare the most desired benefit against the most important adverse event to avoid. We have to bear in mind that these benefits and risks may vary from patient to patient. For example, 1 patient may be more concerned about avoiding insomnia, whereas someone else may be more worried about weight gain, and someone else may be most worried about diabetes. On the benefit side, most people would agree that avoiding hospitalization is a priority, but may not be so thrilled by the prospect of longer treatment continuation over a year.
The likelihood of being helped or harmed (LHH) is calculated by dividing the NNH for the harm of interest by the NNT for the benefit of interest. When the LHH is >1, the benefit is more likely to occur than the harm, all other things being equal. An example of calculating LHH can be done using published data from an NNT/NNH analysis of the use of adjunctive antipsychotics for major depressive disorder.
In this analysis, the NNT for remission by adding quetiapine 150 mg to an antidepressant, versus antidepressant monotherapy, is 11 (95% CI, 6–48). The NNH for discontinuation due to an adverse event, comparing the same 2 treatments, is 16 (95% CI, 10–33). The LHH is 16/11 or 1.5. Therefore, it is slightly more likely to expect the benefit of remission than treatment discontinuation by using adjunctive quetiapine instead of antidepressant monotherapy.
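The quetiapine example reduces to a single division, shown here in Python with the NNT and NNH taken from the text. The function name is illustrative only.

```python
def likelihood_helped_or_harmed(nnh, nnt):
    """LHH = NNH / NNT; values above 1 favor the benefit over the harm."""
    return nnh / nnt


nnt_remission = 11        # adjunctive quetiapine 150 mg vs antidepressant alone
nnh_discontinuation = 16  # discontinuation due to an adverse event

lhh = likelihood_helped_or_harmed(nnh_discontinuation, nnt_remission)
print(f"LHH = {lhh:.1f}")
```

Because the point estimates of NNT and NNH each carry their own CI, an LHH this close to 1 is best read as a rough guide rather than a precise ratio.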
There are some inherent limitations of EBM that must be considered. The tools used are only as good as the quality of the underlying evidence. For example, if a study did not control a variable well, such as fasting status before drawing a blood glucose sample, then the NNH derived from those data will not be valid. Likewise, measurement error will bias NNT calculations.
Also, it is not good to look at only 1 variable in isolation. Evidence should evaluate both risks and benefits, and they should be relevant to the patient being treated, while considering the patient's individual values and preferences. A patient may personally value maintenance of good erectile function over avoidance of weight gain, or may prefer to avoid hospital admission over the potential of improved cognitive function.
Techniques such as NNT evaluations are most valid to calculate from an RCT with identical conditions for all drugs under study. Results of NNT and NNH are only calculable for binary or dichotomous events that are either present or absent, and do not apply to continuous variables such as the value of a blood test. Continuous variables with clinically significant thresholds, such as weight gain ≥7%, can be expressed as an NNT because then they are binary. However, some statisticians argue philosophically against doing this because data points are lost in the conversion to being above or below some threshold.
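The conversion of a continuous variable into a binary one can be sketched in Python. The weight-change values below are invented solely for illustration, the 7% threshold comes from the text, and the helper name is hypothetical. Because weight gain is a harmful outcome, the difference is taken as experimental minus control (the absolute risk increase) so that the NNH is reported as a positive number.

```python
import math
from fractions import Fraction

THRESHOLD_PERCENT = 7.0  # clinically significant weight gain: >= 7% of baseline


def event_rate(percent_changes, threshold=THRESHOLD_PERCENT):
    """Proportion of patients whose weight change meets or exceeds the threshold."""
    events = sum(1 for change in percent_changes if change >= threshold)
    return Fraction(events, len(percent_changes))


# Hypothetical percent weight change from baseline for each patient.
experimental = [8.2, 1.0, 7.0, 3.5, 9.1, 0.4, 7.5, 2.2, 5.0, 6.9]
control = [1.2, 0.0, 7.3, 2.5, -1.0, 3.3, 0.8, 4.1, 2.0, 1.5]

eer = event_rate(experimental)  # 4 of 10 patients
cer = event_rate(control)       # 1 of 10 patients
absolute_risk_increase = eer - cer
nnh = math.ceil(1 / absolute_risk_increase)

print(f"EER = {float(eer):.0%}, CER = {float(cer):.0%}, NNH = {nnh}")
```

Note that the patient at 6.9% and the patient at 0.4% are counted identically as non-events, which is the loss of information that some statisticians object to.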
EBM, and its tools, although often misunderstood, can be used to enlighten us, help us make sense of scientific data, and help physicians make better treatment decisions in collaboration with their patients. It can also better inform policy decisions by payers and health plan administrators. It is possible to teach and learn EBM in ways that are engaging and interesting. Although it comes with limitations, EBM and the tools it uses can help translate research results into better clinical practice and hopefully better outcomes for patients.
Conflicts of Interest
The author is an employee and shareholder of Eli Lilly and Company. There was no sponsor role of Eli Lilly and Company in any of the studies described in this manuscript. The manuscript was conceived and written entirely by the author, and was approved by Eli Lilly and Company.
The author gratefully acknowledges the editing advice of Angela C. Lorio, ELS of i3, part of the Inventiv Health Company, and the statistical advice of Michael Case, Lilly USA.