Introduction
The Hype Cycle (Figure 1) is a branded graphical presentation by the American research, advisory, and information technology firm Gartner. It provides a graphical and conceptual presentation of the 5 key phases of a technology's life cycle.
1Gartner Research Methodologies
Although technological and scientific progress in pharmacovigilance may not map exactly to the Hype Cycle, we believe it is not a terrible fit either. From our perspective, the flurry of activity in the late 20th and early 21st centuries following the World Health Organization Uppsala Monitoring Center's nonproprietary innovation of Bayesian disproportionality analysis of spontaneous reports
2Practical aspects of signal detection in pharmacovigilance.
included inflated claims, overpromotion of statistical computation over clinical judgment, and aggressive promotion of proprietary software variations as objective solutions to that data.
3A decade of data mining and still counting.
Subsequent to identification of gaps in transparency/disclosure in the published pharmacovigilance data-mining literature,
4- Almenoff J.
- Dumouchel W.
- Kindman L.
- Yang X.
- Fram D.
Letter to the editor.
, a major international working group (the Council of International Organizations of Medical Sciences Working Group VIII) admonished readers about commercial and intellectual conflicts of interest in this field.
6Practical aspects of signal detection in pharmacovigilance.
Fortunately, with wider use, this hyperinflationary phase evolved into the more realistic view that proprietary software did not neutralize the limitations of spontaneous reporting system data.
7- Hauben M.
- Reich L.
- Gerrits C.
- Younus M.
Illusions of objectivity and a recommendation for reporting data mining results.
In the era of big data, predictive analytics, and data science, pharmacovigilance is increasingly multidisciplinary. Contributions from computer science, data science, network biology, operations research, chemoinformatics, and biomedical informatics are very welcome, infusing fresh and ingenious insights, datasets, and methods, and even elevating the visual aesthetic of publications with intricate and beautiful visual displays. Efforts to create common data models that seamlessly integrate, interrogate, and analyze multiple claims and electronic health record (EHR) datasets for purposes of safety analysis (eg, Sentinel, Exploring and Understanding Adverse Drug Reactions by Integrative Mining of Clinical Records and Biomedical Knowledge, the Observational Health Data Science and Informatics collaborative) are already informatics tours de force.
8- Bate A.
- Reynolds R.F.
- Caubel P.
The hope, hype and reality of big data for pharmacovigilance.
Exploring and Understanding Adverse Drug Reactions by Integrative Mining of Clinical Records and Biomedical Knowledge goes further by providing a pipeline that integrates EHR data with bioinformatics data on proteins and pathways.
9- Oliviera J.L.
- Lopis P.
- Nunes T.
- et al.
The EU-ADR platform: delivering advanced pharmacovigilance tools.
Another integrative work in progress is the Large-scale Adverse Effects Related to Treatment Evidence Standardization, an open scalable system for linking disparate pharmacovigilance evidence sources with clinical data.
10Large-scale adverse effects related to treatment evidence standardization (LAERTES): an open scalable system for linking pharmacovigilance evidence sources with clinical data.
These are just a few of the recent exciting achievements in pharmacovigilance thanks in large part to data science.
Although training/education in these disciplines entails prerequisite courses that build literacy in core statistical and/or biomedical areas, it does not typically include in-depth and nuanced clinical training or practical experience in the use and interpretation of varied pharmacovigilance data sources necessary for the complex design and analysis of large drug safety datasets. Basic statistical literacy and data availability promote increased deployment of analytics, often without commensurately increased consideration or understanding of clinical/scientific context in making statistical judgements. Consequently, we face a data deluge and a corresponding efflorescence of research whose increased scope and complexity present a greater challenge to peer review and that sometimes demonstrates a dismaying deficit of context, resulting in exorbitant conclusions possibly amplified by hyperbolic media coverage.
11To hype, or not to(o) hype.
An extreme realization of these complications is the viewpoint expressed in the essay entitled “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete” by Anderson
13The end of theory: the data deluge makes the scientific method Obsolete.
: “with enough data, the numbers speak for themselves.” On this view, there is no more need for scientific insights and judgment.
Understandably, an increasingly common admonition in pharmacovigilance is “[fill in the technology] does not replace clinical/scientific judgment.” What does this mean exactly? Unpointed admonitions may paradoxically exacerbate the situation via a false sense of security of having done something. Which judgments are missing and where—signal detection or evaluation? Should robust clinical context be required for publication? Postpublication review/commentary provides a potential control mechanism against premature conclusions and false alarms but may be underutilized.
14- Falavarjani K.G.
- Kashkouli M.B.
- Chams H.
Letter to Editor, a scientific forum for discussion.
, 15Inadequate post-publication review of medical research.
Machines and humans learn by example; thus, our aim is to hone the aforementioned admonition via noteworthy examples from the published literature, hopefully titrating the relative emphasis given to computational wizardry versus clinical and scientific judgment, and thereby foster more temperate language and conclusions. Expressed a little differently, we wish to mitigate drift in the direction of Anderson’s “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete”
13The end of theory: the data deluge makes the scientific method Obsolete.
by discussing results and conclusions from the published literature that illustrate the potential hazards of letting the numbers speak for themselves.
The studies in our small convenience sample are valuable contributions with praiseworthy energy and creativity, but they also illustrate contextual gaps. They were selected because they tackle particularly challenging (ie, drug–drug interactions [DDIs]) or “hot” (eg, user-generated content) topics, for potential snowball effects from redundant publication and sensational media coverage, and/or because they contained elements that we considered emblematic of the aforementioned concerns. Our choices are not overall endorsement or condemnation of the research, which we do admire and of which we could say much more given sufficient space.
We consider through the lens of clinical, clinical pharmacologic, and, more broadly, pharmacovigilance judgment, claims of novelty (ie, heretofore unrecognized findings), validation (ie, confirmation of findings), and invisibility (ie, novelty not from lack of application of legacy methods but actual invisibility to legacy methods). We also make a few very brief statistical excursions.
Pharmacovigilance judgment is a deep and broad situational awareness achieved by extended and continuous hands-on involvement in pharmacovigilance in a public health practice (rather than a purely research) capacity. It entails a comprehensive awareness of the full suite of legacy methods and corresponding implementations, relationships of adverse events (AEs) on a conceptual level to their representation in controlled vocabularies, nosologies and coding systems, and limitations of reference sets, a necessary framework for evaluating the aforementioned claims.
“Your first 10 words are more important than your next 10,000.” Elmer Wheeler ∗Elmer Wheeler (1903–1968), regarded by some as America’s greatest salesman, authored the book Tested Sentences That Sell and was also known as “Mr. Sizzle” for his quote “Don’t sell the steak, sell the sizzle.”
Data science papers often make dramatic claims, sometimes starting with the title, about developing breakthrough approaches uncovering novel safety concerns invisible to legacy methods and datasets. Readers outside of pharmacovigilance could be forgiven for concluding from titles such as “An Integrative Pipeline….,”
16- Lorberbaum T.
- Sampson K.J.
- Woosely R.L.
- et al.
An integrative data science pipeline to identify novel drug interactions that prolong the QT interval.
“Combining Spontaneous Reports and Electronic Health Records….,”
17- Harpaz R.
- Vilar S.
- Dumouchel W.
- et al.
Combing signals from spontaneous reports and electronic health records for detection of adverse drug reactions.
and “Coupling Data Mining and Laboratory Testing….”
18- Lorberbaum T.
- Sampson K.J.
- Chang J.B.
- et al.
Coupling data mining and laboratory experiments to discover drug interactions causing QT prolongation.
that combining multiple information sources is a disruptive paradigm in pharmacovigilance, termed “a novel strategy.”
18- Lorberbaum T.
- Sampson K.J.
- Chang J.B.
- et al.
Coupling data mining and laboratory experiments to discover drug interactions causing QT prolongation.
However, these titles merely recapitulate the established standard information integration framework in pharmacovigilance signal management. There is also usually a disproportionately small generic statement along the lines of “Of course, these results need to be confirmed…”; such statements are often inconsistent with the jubilant title, tone, and/or content of the article. It is important to note, however, that data scientists may be ideally equipped to come to grips with the tantalizing goal of condensing and high-throughputting pharmacovigilance information integration.
Drug–Drug Interactions
Consider reports of machine learning discovering “latent” DDIs resulting in torsade de pointes (TdP)
16- Lorberbaum T.
- Sampson K.J.
- Woosely R.L.
- et al.
An integrative data science pipeline to identify novel drug interactions that prolong the QT interval.
appearing in bioRχiv,
19Lorberbaum T, Sampson KJ, Woosley RL, et al. BioRχ. Data science identifies novel drug interactions that prolong the QT-interval. doi. 10.1101/024745.
at least 2 peer-reviewed publications,
16- Lorberbaum T.
- Sampson K.J.
- Woosely R.L.
- et al.
An integrative data science pipeline to identify novel drug interactions that prolong the QT interval.
, 17- Harpaz R.
- Vilar S.
- Dumouchel W.
- et al.
Combing signals from spontaneous reports and electronic health records for detection of adverse drug reactions.
and additional venues. These additional venues included professional newsletters,
20Identifying drug-drug interactions.
university news bulletins,
21Tatonetti lab connects drug interactions to deadly heart condition.
and the Chicago Tribune (including a “detailed multimedia feature”).
22Big data offers new way to find hidden drug interactions.
, 23Scientists see progress in identifying deadly drug interactions.
The Chicago Tribune described it as a “…..unique collaboration with the Chicago Tribune, which set out to do what had never been done before: search the vast universe of prescription medications to discover which combinations might trigger a potentially fatal heart arrhythmia.”
23Scientists see progress in identifying deadly drug interactions.
However, no Chicago Tribune staff are listed as co-authors or acknowledged in the peer-reviewed publications.
16- Lorberbaum T.
- Sampson K.J.
- Woosely R.L.
- et al.
An integrative data science pipeline to identify novel drug interactions that prolong the QT interval.
, 18- Lorberbaum T.
- Sampson K.J.
- Chang J.B.
- et al.
Coupling data mining and laboratory experiments to discover drug interactions causing QT prolongation.
The study starts with a very smart idea that is particularly attractive in the setting of potential underreporting: apply machine learning to a subset of the US Food and Drug Administration’s Adverse Event Reporting System (FAERS) reports recording one drug to identify AE patterns (“AE fingerprint”) composed of more common epiphenomena discriminating torsadogenic from non-torsadogenic drugs as per 2 reference sets, CredibleMeds and the Veterans Administration critical DDI list; the next step was to seek similar patterns in a subset of FAERS listing 2 drugs per report. Predicted drug pairs are investigated in more robust datasets, including electronic medical records (EMRs) and, for one pair, laboratory testing.
16- Lorberbaum T.
- Sampson K.J.
- Woosely R.L.
- et al.
An integrative data science pipeline to identify novel drug interactions that prolong the QT interval.
, 18- Lorberbaum T.
- Sampson K.J.
- Chang J.B.
- et al.
Coupling data mining and laboratory experiments to discover drug interactions causing QT prolongation.
Drug pairs ultimately highlighted are termed “validated drug–drug interactions” (VDDIs).
16- Lorberbaum T.
- Sampson K.J.
- Woosely R.L.
- et al.
An integrative data science pipeline to identify novel drug interactions that prolong the QT interval.
The bioRχiv publication reports 32 VDDIs of 1310 positive per the FAERS fingerprint, whereas the index peer-reviewed publication
16- Lorberbaum T.
- Sampson K.J.
- Woosely R.L.
- et al.
An integrative data science pipeline to identify novel drug interactions that prolong the QT interval.
reports 8 VDDIs of 889 pairs positive per the FAERS fingerprint.
16- Lorberbaum T.
- Sampson K.J.
- Woosely R.L.
- et al.
An integrative data science pipeline to identify novel drug interactions that prolong the QT interval.
, 19Lorberbaum T, Sampson KJ, Woosley RL, et al. BioRχ. Data science identifies novel drug interactions that prolong the QT-interval. doi. 10.1101/024745.
The authors recommend additional studies to confirm interaction mechanisms, rather than investigating whether each drug pair is, for example, a surrogate for a high-risk population.
Claimed Validation of DDIs Versus Omitted Clinical Context
Highlighted as the largest effect size (ie, the greatest increase in the QTc interval) was lactulose/octreotide coadministration in male subjects.
16- Lorberbaum T.
- Sampson K.J.
- Woosely R.L.
- et al.
An integrative data science pipeline to identify novel drug interactions that prolong the QT interval.
, 19Lorberbaum T, Sampson KJ, Woosley RL, et al. BioRχ. Data science identifies novel drug interactions that prolong the QT-interval. doi. 10.1101/024745.
The following clinical context was provided: Lactulose is “administered to treat constipation” and octreotide is “used to lower growth hormone levels.”
16- Lorberbaum T.
- Sampson K.J.
- Woosely R.L.
- et al.
An integrative data science pipeline to identify novel drug interactions that prolong the QT interval.
Both claims are true. The seemingly disconnected indications amplify the aura of “latency” and unexpectedness.
However, a key clinical context is missing: lactulose is a nonabsorbable disaccharide that reduces colonic ammonia production and absorption, and thus a first-line treatment for hepatic encephalopathy
25- Al Sibae M.R.
- McGuire B.M.
Current trends in the treatment of hepatic encephalopathy.
(its approved labeled indication).
Octreotide is also used to treat bleeding esophageal varices in hepatic cirrhosis/portal hypertension.
27Octreotide in variceal bleeding.
, 28Pharmacological rationale for the use of somatostatin and analogues in portal hypertension.
Through a lens of clinical judgment, proximate coadministration of these drugs may be highly selective for circumscribed clinical scenarios (eg, cirrhosis, with its documented multifactorial link to QT interval prolongation, and TdP, positively modified by male sex).
29- Kim S.M.
- Bennet G.
- Diego A.F.
- et al.
QT prolongation is associated with increased mortality in end stage liver disease.
, 30- Bernardi M.
- Calandra S.
- Colantoni A.
- et al.
QT-interval prolongation in cirrhosis: prevalence, relationship with severity and etiology of disease and possible pathogenic factors.
, 31- Jiménez J.V.
- Carrillo-Pérez D.L.
- Rosado-Canto R.
- et al.
Electrolyte and acid–base disturbances in end-stage liver disease: a physiopathological approach.
Applying propensity scoring to the FAERS data on dimensions of co-medications and indications is not terribly reassuring because diagnostics necessary to assess the adequacy of the propensity scoring such as model fit and covariate balance
32Reporting of covariate selection and balance assessment in propensity score analysis is suboptimal: a systematic review.
are not apparent, which is especially important given the qualitative and quantitative data deficits in FAERS.
33Maciejewski M, Lounkine E, Whitebread S, et al. The powers and perils of post-marketing data analysis: quantification and mitigation of biases in the FDA adverse event reporting system. BioRxiv doi.10.1101/068692 elife doi. 10.7554/25818.
Similarly, their confounding analysis (ANCOVA) applied to the EMRs for medications associated with QT interval prolongation is not remedial.
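One of the diagnostics mentioned above, covariate balance, is often summarized with the standardized mean difference (SMD) of each covariate between exposure groups. The following is a minimal sketch of that calculation, not a reconstruction of the authors' analysis: the function names, the toy data, and the 0.1 cutoff (a commonly cited rule of thumb) are all assumptions made for illustration.

```python
import math
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """Absolute standardized mean difference for one covariate.

    Uses the pooled standard deviation sqrt((s_t^2 + s_c^2) / 2).
    """
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2)
    if pooled_sd == 0:
        # No variation in either group: equal means count as balanced,
        # unequal means cannot be standardized.
        return 0.0 if mean(treated) == mean(control) else math.inf
    return abs(mean(treated) - mean(control)) / pooled_sd


def balance_report(covariates, threshold=0.1):
    """Map each covariate name to (SMD, balanced?).

    `covariates` maps a name to a (treated_values, control_values) pair.
    `threshold` is an assumed rule-of-thumb cutoff, not a fixed standard.
    """
    report = {}
    for name, (treated, control) in covariates.items():
        smd = standardized_mean_difference(treated, control)
        report[name] = (smd, smd < threshold)
    return report


# Hypothetical toy data, for illustration only.
example = {
    "age": ([2, 4, 6], [1, 2, 3]),
    "sex": ([0, 1, 1], [0, 1, 1]),
}
for name, (smd, balanced) in balance_report(example).items():
    print(f"{name}: SMD={smd:.3f} balanced={balanced}")
```

Reporting a table of this kind before and after propensity adjustment is one way readers could judge whether the adjustment achieved its purpose.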
Therefore, this VDDI has at least one fundamental, readily accessible, and assessable alternative clinical explanation, for which the authors' implemented control procedures would be ineffective. This scenario is a reminder that “While it might seem obvious that data, no matter how ‘big,’ cannot perfectly represent life in all its complexity, information technology produces so much information that it is easy to forget just how much is missing.”
34The problem with our data obsession.
None of this disproves the claimed VDDI. Indeed, a clinical lens brings into focus that acquired TdP typically entails convergence of multiple risk factors; for example, hypokalemia, hypothyroidism, bradycardia, and hypoglycemia.
35Drug-induced QT interval prolongation and torsades de pointes: role of the pharmacist in risk assessment, prevention and management.
, 36- Zhang Y.
- Han H.
- Wang J.
- et al.
Impairment of human ether-à-go-go-related gene (HERG) K+ channel function by hypoglycemia and hyperglycemia. Similar phenotypes but different mechanisms.
, 37- Hannudi F.
- Alwash H.
- Shah K.
- et al.
A case of hypoglycemia-induced QT prolongation leading to torsade de pointes and a review of pathophysiological mechanisms.
, 38Thyroid dysfunction in torsades de pointes.
Contrary to “our method allows us to identify adverse reactions that result from combination therapy no one would have suspected otherwise,”
20Identifying drug-drug interactions.
it is not unexpected that co-administration of 2 drugs with a combined safety profile (“fingerprint”), including hypokalemia (lactulose)
and hypothyroidism, hypoglycemia, bradycardia, and/or QT interval prolongation (octreotide),
might boost QT intervals/TdP risk compared with either drug independently. Although reinforcing clinical expectations is not devoid of value, and possibly reassuring in a performance assessment context, discovering VDDIs that are surprising even to those familiar with each drug’s pharmacologic fingerprint would be most noteworthy, but it is not clear that happened in these instances.
Another noteworthy example is a VDDI singled out by authors as the second largest effect size and largest in female subjects: vancomycin and mupirocin.
16- Lorberbaum T.
- Sampson K.J.
- Woosely R.L.
- et al.
An integrative data science pipeline to identify novel drug interactions that prolong the QT interval.
Not discussed is that mupirocin is formulated for topical use, has no detectable skin or mucosal penetration in healthy subjects, and has negligible absorption even when substantial doses are applied under occlusion to 70% to 80% of the skin surface in female patients with inflammatory skin diseases, conditions expected to maximize percutaneous absorption.
40Mupirocin—are we in danger of losing it?.
, 41- Lawrence C.M.
- Mackenzie T.
- Pagano K.
- et al.
Systemic absorption of mupirocin ointment to healthy and dermatologically diseased skin.
This would have provided further perspective on the claimed validity of the proposed DDI, not to mention cogently reinforcing the finding's novelty, should this DDI ultimately be confirmed.
We could make analogous arguments for other VDDIs in these papers. Consequential omitted clinical context increases the risk of technically innovative research resulting in premature and overzealous conclusions about the existence of VDDIs.
Claimed Validation of DDIs Versus Omitted Clinical Pharmacologic Context
A subsequent article
18- Lorberbaum T.
- Sampson K.J.
- Chang J.B.
- et al.
Coupling data mining and laboratory experiments to discover drug interactions causing QT prolongation.
reporting the same data-mining results and additional laboratory test results for ceftriaxone/lansoprazole garnered considerable attention as mitigating previous study limitations. NEJM Journal Watch opined that despite multiple limitations of the data mining, “….the addition of laboratory testing that also showed QT prolongation with the combination of ceftriaxone and lansoprazole, but not with cefuroxime and lansoprazole, lends more credence to the identified QT-DDI signal being real.”
42QT prolongation risk with ceftriaxone and lansoprazole combination.
(The latter statement is inaccurate: the laboratory testing did not measure QT-intervals.)
With the publication’s emphasis on concordance between data-mining findings and laboratory testing, and given the protocol-dependence of hERG assays, limitations in execution and/or reporting of the laboratory studies illustrate the importance of omitted clinical pharmacologic context:
1. Concentrations of ceftriaxone and cefuroxime, ranging from 0.1 to 100 μM, were chosen to “include the range of plasma concentrations usually reached during routine clinical use for the drug … 35–428 μM” for cefuroxime. Tested concentrations therefore did not fully encompass, but rather partly overlapped, expected therapeutic ranges, excluding maximum exposures encountered in clinical practice, particularly for cefuroxime (“negative control”); that is, an adequate safety margin was not established.
2. Ceftriaxone was tested with both lower and higher concentrations of lansoprazole, whereas cefuroxime was tested only with lower concentrations, perhaps due to solubility limits (cefuroxime is less soluble in dimethyl sulfoxide than ceftriaxone), not uncommon with hERG testing.
43- Brimecombe J.C.
- Kirsch G.E.
- Brown A.M.
Test article concentrations in the hERG assay. Losses through the perfusion, solubility and stability.
3. There were no reports of concentration verification, as required under Good Laboratory Practices because of numerous sources of test article loss in hERG assays due to physicochemical properties (eg, the aforementioned solubility limits).
43- Brimecombe J.C.
- Kirsch G.E.
- Brown A.M.
Test article concentrations in the hERG assay. Losses through the perfusion, solubility and stability.
, 44Early identification of hERG liability in drug discovery programs by automated patch clamp.
, 45- Redfern W.S.
- Carlsson L.
- Davis A.S.
- Lynch W.G.
- MacKenzie I.
- Palethorpe S.
- et al.
Relationships between preclinical cardiac electrophysiology, clinical QT interval prolongation and torsade de pointes for a broad range of drugs: evidence for a provisional safety margin in drug development.
4. Experimental details were missing that are routinely included in published hERG testing, such as temperature, cellular system, drug application protocol (eg, double drug application to ensure equilibrium drug effects), current–time plots, and perfusate composition.
46- Ducroq J.
- Printemps R.
- Le Grand M.
Additive effects of ziprasidone and D,L sotalol on the action potential in rabbit Purkinje fibres and on the hERG potassium current.
, 47- Zhang S.
- Zhou Z.
- Gong Q.
- et al.
Mechanism of block and identification of the verapamil binding domain to HERG potassium channels.
, 48Additive effects of combined application of multiple hERG blockers.
, 49Allosteric effects of erythromycin pretreatment on thioridazine block of hERG potassium channels.
5. The computational model is redundant with the hERG testing in the sense that it computes theoretical effects on action potential duration assuming the observed in vitro single channel effect is a faithful representation of the totality of channel effects in intact myocardial cells.
6. Statistical significance testing was performed with an unspecified “test for repeated measures.”
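The concentration gap described in item 1 can be made concrete with simple interval arithmetic. The sketch below uses only the two ranges quoted above for cefuroxime (tested 0.1 to 100 μM, clinical 35 to 428 μM); the function name is hypothetical, and measuring coverage on a linear scale is an assumption, since concentration ranges are often compared on a logarithmic scale.

```python
def range_coverage(tested, clinical):
    """Fraction of the clinical range that falls inside the tested range.

    Both arguments are (low, high) tuples in the same units.
    Coverage is measured on a linear scale (an illustrative assumption).
    """
    low = max(tested[0], clinical[0])
    high = min(tested[1], clinical[1])
    overlap = max(0.0, high - low)
    return overlap / (clinical[1] - clinical[0])


# Ranges quoted in the text for cefuroxime, in micromolar.
tested_um = (0.1, 100.0)
clinical_um = (35.0, 428.0)

coverage = range_coverage(tested_um, clinical_um)
print(f"Fraction of clinical range tested: {coverage:.3f}")
print(f"Clinical maximum exceeds tested maximum: {clinical_um[1] > tested_um[1]}")
```

On these figures the tested concentrations span roughly one sixth of the quoted clinical range and stop well short of its upper end, which is the basis for the concern that an adequate safety margin was not established.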
Omitted Clinical Context: Not Just About False-Positive Findings
Omitted clinical context does not lead only to premature positive conclusions (ie, claims of VDDIs). In another publication,
50- Tatonetti N.
- Fernald G.H.
- Altman R.B.
A novel signal detection algorithm for identifying hidden drug-drug interactions in adverse event reports.
a potential interaction between moxifloxacin and warfarin based on the pair's AE fingerprint for nephrotoxicity was dismissed based on an ANCOVA of EMRs in which “baseline creatine” was included as a covariate (we presume the authors meant “creatinine” and not “creatine” and that they used serum rather than urine measurements).
Warfarin nephropathy is a specific acute kidney injury from over-anticoagulation with hallmark biopsy findings of glomerular hemorrhage and obstructive intratubular red blood cell casts
51- Ghaswalla P.K.
- Haspe S.E.
- Tasine D.
- et al.
Warfarin-antibiotic interactions in older adults.
, 52- Golbin L.
- Vigneau C.
- Touchard G.
- et al.
Warfarin-related nephropathy induced by three different vitamin K antagonists: analysis of 13 biopsy-proven cases.
, 53Anticoagulants and acute kidney injury: clinical and pathology considerations.
, 54- Brodsky S.V.
- Nadasdy T.
- Rovin B.H.
Warfarin-related nephropathy occurs in patients with and without chronic kidney disease and is associated with an increased mortality rate.
; there is no specific International Classification of Diseases, Ninth Revision (ICD-9), code. Therefore, a fundamental question is whether moxifloxacin can interact with warfarin to increase the international normalized ratio leading to warfarin nephropathy, rather than, or in addition to, inducing intrinsic nephrotoxicity. The authors' ANCOVA as presented is not dispositive for various reasons. For example, their conglomerate outcome definition (ICD-9 codes 580–589) includes a broad range of acute and chronic renal diseases and thus may be suboptimally coherent. Further, ICD-9 codes for acute renal failure might be insensitive, and the magnitude of insensitivity may be situation dependent (eg, medical vs surgical settings).
55- Walkar S.S.
- Wald R.
- Chertow G.M.
- et al.
Validity of international classification of diseases, Ninth revision, Clinical Modifications codes for acute renal failure.
, 56- Shaffzin J.K.
- Dodd C.N.
- Nguyen H.
- et al.
Administrative data misclassifies and fails to identify nephrotoxin-associated acute kidney injury in hospitalized children.
, 57- Hougland P.
- Nebeker J.
- Pickard S.
- et al.
Using ICD-9-CM codes in hospital claims data to detect adverse events in patient safety surveillance.
These and other factors could result in bias, including bias to the null depending on the specifics. Clinical recognition of the mechanisms of over-anticoagulation also leads to consideration of whether other “anti-thrombotics” (we presume they meant anticoagulants) and other fluoroquinolones were probative comparators. On a statistical note, the ANCOVA used to dismiss this finding as a “false association” or “invalidated association” is too pivotal to be reported without full details to assess its statistical soundness; for example, potential violations of the statistical test's assumptions, a well-documented concern with this method.
58Misunderstanding analysis of covariance.
, 59- Glass V.G.
- Peckham P.D.
- Sanders J.R.
Consequences of failure to meet assumptions underlying the fixed effects analysis of variance and covariance.
, 60The effects of heterogeneous regression slopes on the robustness of two statistics in the analysis of covariance.
Precision is also desirable when describing risk of individual drugs. First, reference set classifications may involve flawed adjudications
61Evidence of misclassification of drug-event associations classified as gold standard 'negative controls' by the observational medical outcomes partnership (OMOP).
or be in flux, as today's true negatives become tomorrow's true positives. An example is this description of negative control drugs as having “no possible, conditional, established or congenital risk.”
16- Lorberbaum T.
- Sampson K.J.
- Woosely R.L.
- et al.
An integrative data science pipeline to identify novel drug interactions that prolong the QT interval.
This statement can be misinterpreted by readers unfamiliar with CredibleMeds to mean drugs for which risk is excluded. For drugs that are classified, CredibleMeds categorizes them into 1 of 4 categories: drugs with known risk of TdP, drugs with possible risk of TdP, drugs with conditional risk of TdP (ie, can cause TdP under certain conditions), and drugs to be avoided by patients with congenital long QT (the aforementioned drugs plus drugs with Special Risk). There is no category of “no possible risk…”
62- Schwartz P.J.
- Woosley R.L.
Predicting the unpredictable: drug-induced QT prolongation and torsades de pointes.
; in addition, it is important to note that drugs for which there is no available evidence may be unclassified at a given time point.
Another claim related to both the safety of individual drugs and, consequently, the novelty of DDIs identified by using the fingerprint method is: “Importantly, each individual drug had no previously known connection to the AE class of interest,”
16- Lorberbaum T.
- Sampson K.J.
- Woosely R.L.
- et al.
An integrative data science pipeline to identify novel drug interactions that prolong the QT interval.
which included diabetes. Paroxetine–pravastatin was one such pair, but preexisting publications already associated paroxetine with diabetes (possibly mediated by drug-induced weight gain
63- Andersohn F.
- Schade R.
- Suissa S.
- et al.
Long term use of antidepressants for depressive disorders and the risk of diabetes mellitus.
) and similarly with statins.
64- Sattar N.
- Preiss D.
- Murray H.M.
- et al.
Statins and the risk of incident diabetes: a collaborative meta-analysis of randomized statin trials.
Such imprecision may explain definitive media statements such as “Interestingly, the drugs the investigators identified do not cause the condition on their own, but only when taken in specific combinations,”
22Big data offers new way to find hidden drug interactions.
potentially leading to inflated perceptions of safety as well as novelty.
Claims of Inability of Legacy Methods Versus Omitted Pharmacovigilance Judgment
Invisibility claims for VDDIs (eg, that ceftriaxone–lansoprazole “would not have been suspected using current surveillance methods” and would “not have been predicted in FAERS”
16- Lorberbaum T.
- Sampson K.J.
- Woosely R.L.
- et al.
An integrative data science pipeline to identify novel drug interactions that prolong the QT interval.
) and the related claim of superiority of fingerprint over “direct evidence” methods also merit discussion. Invisibility claims should be based on the full inventory of available methods that are variable and unstandardized combinations of methods, specific implementation, and databases, resulting in surprising variability and even nonreplicability.
65Revisiting the reported signal of acute pancreatitis with rasburicase: an object lesson in pharmacovigilance.
, 66Bevacizumab-associated diverticulitis: results of disproportionality analysis.
The fingerprint method outperformed 2 of many possible implementations based on direct evidence. Both are shrinkage-based methods, possibly not the best option for designated medical events, especially with a truncated list of AE terms. Reasonable implementations of legacy methods would include mitigating sparsity with a modified MedDRA query based on the broad TdP Standardized MedDRA Query,
accommodating potential masking, an additional concern for rare events,
68- Maignen F.
- Hauben M.
- Hung E.
- Van Holle L.
- Dogne J.M.
Assessing the extent and impact of the masking effect of disproportionality analyses on two spontaneous reporting systems databases.
, 69- Wisniewski A.F.
- Bate A.
- Bousquet C.
- et al.
Good signal detection practices: evidence from IMI PROTECT.
and using frequentist disproportionality analysis.
70- Noguchi Y.
- Ueno A.
- Otsubo M.
- et al.
A new search method using association rule mining for drug-drug interaction based on spontaneous reporting system.
, 71Safety risk evaluation methodology in detecting the medicine concomitant use risk which might cause critical drug rash.
, 72- Ibrahim H.
- Saad A.
- Abdo A.
- et al.
Mining association patterns of drug-interactions using post marketing FDA's spontaneous reporting data.
In fact, a frequentist three-dimensional disproportionality analysis of FAERS data matching the end data cutoff as used in Tatonetti et al.
16- Lorberbaum T.
- Sampson K.J.
- Woosley R.L.
- et al.
An integrative data science pipeline to identify novel drug interactions that prolong the QT interval.
using commercial off-the-shelf software and filtering drugs assigned definite or possible TdP risk in CredibleMeds returned a relative reporting ratio of 17.6 for the ceftriaxone–lansoprazole pair and the event ventricular tachycardia, versus 1.20 for ceftriaxone and 0.955 for lansoprazole as single drugs (but not independently), consistent with a spontaneous reporting interaction.
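To make this kind of comparison concrete, the following minimal Python sketch computes a relative reporting ratio as the observed report count divided by the count expected if exposure and event were independent. This is one common definition and is an assumption here; the exact formula used by the commercial software is not stated. The function name and all counts are hypothetical and are not FAERS data.

```python
def relative_reporting_ratio(n_exposure_event, n_exposure, n_event, n_total):
    """Observed / expected reports, with expected computed under independence.

    n_exposure_event: reports mentioning the exposure (a drug or a drug pair) and the event
    n_exposure:       reports mentioning the exposure
    n_event:          reports mentioning the event
    n_total:          all reports in the database
    """
    if min(n_exposure, n_event, n_total) <= 0:
        raise ValueError("marginal counts must be positive")
    expected = n_exposure * n_event / n_total
    return n_exposure_event / expected


# Hypothetical counts, chosen only to illustrate the pattern described in the text.
N_TOTAL = 1_000_000   # all reports
N_EVENT = 2_000       # reports mentioning the event of interest

rrr_pair = relative_reporting_ratio(12, 400, N_EVENT, N_TOTAL)     # both drugs co-reported
rrr_a = relative_reporting_ratio(24, 10_000, N_EVENT, N_TOTAL)     # drug A, any co-medication
rrr_b = relative_reporting_ratio(38, 20_000, N_EVENT, N_TOTAL)     # drug B, any co-medication

print(f"pair: {rrr_pair:.2f}, drug A: {rrr_a:.2f}, drug B: {rrr_b:.2f}")
```

A ratio far above 1 for the pair, alongside ratios near 1 for each drug alone, is the reporting pattern that would prompt further clinical review. It is a statistical reporting association only, not evidence of causality.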
User-Generated Electronic Content
The boldly titled article “Postmarket Drug Surveillance Without Trial Costs: Discovery of Adverse Drug Reactions Through Large-Scale Analysis of Web Search Queries”
73- Yom-Tov E.
- Gabrilovich E.
Postmarket drug surveillance without trial costs: discovery of adverse drug reactions through large scale analysis of web search queries.
reports 2 major findings. First, mining Internet search logs (ISLs) detected 20 of 24 adverse drug reactions (ADRs) that supposedly eluded detection by monitoring FAERS (“our method … can assist in identifying ADRs that have so far eluded discovery by existing methods”). Second, unlike ADRs discoverable in FAERS, which “… are readily recognized by patients and medical professionals because of their acuteness and fast onset,” those discoverable only by the search logs include later-onset, less acute, nonserious ADRs.
The claims
73- Yom-Tov E.
- Gabrilovich E.
Postmarket drug surveillance without trial costs: discovery of adverse drug reactions through large scale analysis of web search queries.
rely on 4 steps: (1) identifying AEs that are overrepresented in FAERS versus overrepresented in ISLs; (2) assuming that the time between the first Internet search for the drug and the first Internet search for a drug–event pair is a surrogate for time-to-onset (TTO) of that ADR; (3) comparing the TTOs for AEs overrepresented in FAERS versus those overrepresented in ISLs; and (4) clinical adjudication of these terms, including relatedness, acuteness, and seriousness.
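Step 2 can be sketched in Python as follows. This is a minimal illustration, not the investigators' implementation. It assumes a hypothetical log of (user, timestamp, query) records and treats a "drug–event pair search" as a query containing both terms, matched by naive substring comparison. The function name, the log layout, and the entries are all invented for illustration.

```python
from datetime import datetime


def surrogate_tto_days(log, drug, event):
    """Per user: days from the first query naming the drug to the first query
    naming both the drug and the event. Users with no pair query are omitted."""
    first_drug = {}
    first_pair = {}
    for user, timestamp, query in sorted(log, key=lambda record: record[1]):
        text = query.lower()
        if drug in text:
            first_drug.setdefault(user, timestamp)      # keeps the earliest only
            if event in text:
                first_pair.setdefault(user, timestamp)  # keeps the earliest only
    return {
        user: (first_pair[user] - first_drug[user]).total_seconds() / 86400
        for user in first_pair
    }


# Hypothetical search log; "drugx" is a placeholder, not a real product.
log = [
    ("u1", datetime(2020, 1, 1), "drugx dosage"),
    ("u1", datetime(2020, 1, 15), "drugx muscle cramps"),
    ("u2", datetime(2020, 2, 1), "drugx muscle cramps"),
    ("u3", datetime(2020, 3, 1), "drugx reviews"),
]

tto = surrogate_tto_days(log, "drugx", "muscle cramps")
print(tto)
```

In this toy log, user u2 receives a surrogate TTO of zero because the first drug query is also the first pair query. The surrogate therefore reflects search behavior and need not correspond to when the drug was started or when the event occurred.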
To the first claim,
73- Yom-Tov E.
- Gabrilovich E.
Postmarket drug surveillance without trial costs: discovery of adverse drug reactions through large scale analysis of web search queries.
using commercial off-the-shelf disproportionality analysis software, we discovered that many, if not most, ADRs reportedly underrepresented in FAERS were in fact overrepresented and/or were already in the first product label a median of 10 years before the date of the analyzed data.
74Hauben M, Iannos C. [Unpublished data].
The discordance is largely explainable by the following: (1) the failure to appreciate semantic/lexical variations of a given clinical phenotype (eg, considering muscle cramps vs muscle spasms as distinct disorders); and/or (2) a focus on proprietary versions of a mature product that had generated statistical reporting signals earlier in the product life cycle, before the time intervals analyzed and before widespread reporting with the corresponding generic.
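The first point can be illustrated with a small Python sketch that counts reports before and after mapping lexical variants to a common group. The mapping below is a hypothetical two-entry dictionary written for this example; it is not a MedDRA grouping or a Standardized MedDRA Query, and the report terms are invented.

```python
from collections import Counter

# Hypothetical mapping of verbatim terms to one clinical concept (illustration only).
TERM_GROUPS = {
    "muscle cramps": "muscle spasm/cramp",
    "muscle spasms": "muscle spasm/cramp",
}


def count_by_group(reported_terms, groups):
    """Count reports after mapping each term to its group; unmapped terms keep their own name."""
    return Counter(groups.get(term.lower(), term.lower()) for term in reported_terms)


# Invented report terms for one drug.
reports = ["Muscle cramps", "Muscle spasms", "muscle spasms", "Nausea"]

raw_counts = Counter(term.lower() for term in reports)
grouped_counts = count_by_group(reports, TERM_GROUPS)

print(raw_counts)
print(grouped_counts)
```

Treated as distinct terms, the counts for the phenotype are split across two entries, each of which may fall below a signaling threshold that the combined count would exceed.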
The underlying notion of the second claim
73- Yom-Tov E.
- Gabrilovich E.
Postmarket drug surveillance without trial costs: discovery of adverse drug reactions through large scale analysis of web search queries.
is a good one; in fact, it has already been reported that spontaneous reports and EHRs may usefully supplement each other, the former being more useful for rare events and the latter more effective for common ADRs.
- Pacurariu A.C.
- Straus S.M.
- Trifiro G.
- et al.
Useful interplay between spontaneous ADR reports and electronic healthcare records in signal detection.
The TTOs reported to be significantly higher for ADRs overrepresented in ISLs versus FAERS may have resulted from misapplication of the Wilcoxon signed rank test, which assesses whether the median of a symmetrical distribution of differences is zero and should be applied to naturally paired, original observations. It seems that the investigators used a derivative set of drug-specific averages instead in order to use a paired test (ie, the variability of the original data was discarded, potentially compromising inferences).
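The concern can be sketched in Python with invented numbers. The example below collapses hypothetical per-drug TTO observations into drug-specific averages, forms paired differences from those averages, and computes the Wilcoxon signed-rank sums by hand. It does not reproduce the investigators' analysis; the data, the dictionary layout, and the function name are assumptions made for illustration.

```python
from statistics import mean, stdev


def signed_rank_sums(differences):
    """Return (W+, W-): sums of the ranks of |d| for positive and negative
    differences, dropping zeros and assigning average ranks to ties."""
    ordered = sorted((d for d in differences if d != 0), key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1          # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Invented surrogate TTOs (days) per drug, split by where the AE was overrepresented.
tto = {
    "drug_a": {"faers": [2, 30, 4], "isl": [10, 12, 50]},
    "drug_b": {"faers": [1, 3], "isl": [5, 40, 9]},
    "drug_c": {"faers": [20, 25], "isl": [7, 8]},
}

# Collapsing to drug-specific averages yields one "pair" per drug.
mean_differences = [mean(v["isl"]) - mean(v["faers"]) for v in tto.values()]
w_plus, w_minus = signed_rank_sums(mean_differences)

n_raw = sum(len(v["faers"]) + len(v["isl"]) for v in tto.values())
n_pairs = len(mean_differences)
within_drug_sd = {drug: (stdev(v["faers"]), stdev(v["isl"])) for drug, v in tto.items()}

print(f"{n_raw} raw observations reduced to {n_pairs} paired averages")
print(f"W+ = {w_plus}, W- = {w_minus}")
print(within_drug_sd)
```

The paired test sees only the averaged differences; the within-drug standard deviations printed last never enter the statistic, which is the loss of variability described above.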
The ensuing clinical judgments by the authors
73- Yom-Tov E.
- Gabrilovich E.
Postmarket drug surveillance without trial costs: discovery of adverse drug reactions through large scale analysis of web search queries.
are debatable: asthenia and malaise were judged acute and serious, whereas the related terms tired and weak were judged nonserious and delayed; constipation was judged acute and serious, whereas cramps and wound were not. Apnea, technically defined as temporary cessation of respiration, was judged delayed and not serious. We submit that some events may not even have been adverse events but rather epiphenomena of therapeutic response (eg, weight gain in patients treated with tumor necrosis factor inhibitors for inflammatory bowel disease).