layout | title | permalink |
---|---|---|
page | Clinical Informatics | /Clinical_Informatics/ |
-
BIDS - Shimon Ben Boursi, Brian Finkelman, Bruce J. Giantonio, Kevin Haynes, Anil K. Rustgi, Andrew Rhim, Ronac Mamtani, and Yu-Xiao Yang. A clinical prediction model to assess risk for pancreatic cancer among patients with pre-diabetes. Journal of Clinical Oncology 2018 36:15_suppl, e16226-e16226
Background: A new diagnosis of diabetes mellitus (DM) has been observed within 2-3 years before the clinical presentation of a substantial proportion of sporadic pancreatic ductal adenocarcinoma (PDA) cases. The objective of the current study was to develop and internally validate a clinical prediction model for 3-year PDA risk among individuals with pre-diabetes defined as newly detected impaired fasting glucose (IFG). Methods: The study was conducted using the Health Improvement Network (THIN), a population-representative general practice medical records database from the UK. Eligible patients were aged ≥35 years, had newly diagnosed IFG during follow-up, and had ≥3 years of follow-up after the diagnosis of IFG. The dependent variable was incident PDA diagnosed within 3 years of IFG diagnosis. We evaluated a comprehensive list of PDA risk factors as well as variables related to glucose metabolism and performed multiple imputation for candidate predictors with missing values. We selected predictors using univariable analysis and a backward stepwise approach. A bootstrapping procedure was used for internal validation. Results: We identified 138,232 eligible patients with new-onset IFG. Among this cohort, 245 individuals (0.2%) were diagnosed with PDA within 3 years of IFG diagnosis. The median follow-up from IFG detection to PDA diagnosis was 326 days (IQR 120-588). The full multivariable prediction model included age, BMI, PPIs, total cholesterol, LDL, ALT and alkaline phosphatase. The AUC of the model was 0.71 (95% CI 0.67-0.75), and the p-value for the Hosmer-Lemeshow goodness-of-fit test was > 0.05. Internal validation of the model using a bootstrapping procedure revealed minimal optimism of 0.001 (95% CI -0.009 to 0.01). Conclusions: We developed and internally validated a PDA prediction model based on clinical information routinely available at the initial appearance of IFG. This model exhibited good discrimination and calibration.
The current model may substantially extend our ability to detect subclinical PDAs at an earlier stage when the tumor may potentially be more amenable to curative resection.
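
The bootstrap internal validation described in this abstract can be illustrated with a minimal sketch of an optimism-corrected AUC. This is not the authors' code: the abstract does not say how the bootstrap was implemented, so the sketch assumes the common approach of refitting the model on each resample and comparing its AUC on the resample with its AUC on the original data. The function names and the toy `fit_centroid_score` model (a stand-in for the study's multivariable regression) are hypothetical.

```python
import random


def auc(scores, labels):
    """Probability that a random case outranks a random non-case (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def fit_centroid_score(X, y):
    """Toy stand-in model (assumption): score = x . (mean of cases - mean of non-cases)."""
    pos = [x for x, lab in zip(X, y) if lab == 1]
    neg = [x for x, lab in zip(X, y) if lab == 0]
    w = [
        sum(x[j] for x in pos) / len(pos) - sum(x[j] for x in neg) / len(neg)
        for j in range(len(X[0]))
    ]
    return lambda x: sum(wj * xj for wj, xj in zip(w, x))


def optimism_corrected_auc(X, y, fit, n_boot=200, seed=0):
    """Return (apparent AUC, estimated optimism, corrected AUC)."""
    rng = random.Random(seed)
    n = len(y)
    model = fit(X, y)
    apparent = auc([model(x) for x in X], y)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        yb = [y[i] for i in idx]
        if sum(yb) in (0, n):  # skip resamples that contain only one class
            continue
        Xb = [X[i] for i in idx]
        boot_model = fit(Xb, yb)
        auc_on_boot = auc([boot_model(x) for x in Xb], yb)
        auc_on_orig = auc([boot_model(x) for x in X], y)
        diffs.append(auc_on_boot - auc_on_orig)
    optimism = sum(diffs) / len(diffs)
    return apparent, optimism, apparent - optimism
```

Calling `optimism_corrected_auc(X, y, fit_centroid_score)` on a list of feature rows `X` and 0/1 outcomes `y` returns the three quantities. An optimism near zero, as reported in the abstract, suggests the apparent AUC is not inflated much by overfitting.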
-
BIDS - Golas, S.B., Shibahara, T., Agboola, S. et al. A machine learning model to predict the risk of 30-day readmissions in patients with heart failure: a retrospective analysis of electronic medical records data. BMC Med Inform Decis Mak 18, 44 (2018). https://doi.org/10.1186/s12911-018-0620-z
Heart failure is one of the leading causes of hospitalization in the United States. Advances in big data solutions allow for storage, management, and mining of large volumes of structured and semi-structured data, such as complex healthcare data. Applying these advances to complex healthcare data has led to the development of risk prediction models to help identify patients who would benefit most from disease management programs in an effort to reduce readmissions and healthcare cost, but the results of these efforts have been varied. The primary aim of this study was to develop a 30-day readmission risk prediction model for heart failure patients discharged from a hospital admission. We used longitudinal electronic medical record data of heart failure patients admitted within a large healthcare system. Feature vectors included structured demographic, utilization, and clinical data, as well as selected extracts of unstructured data from clinician-authored notes. The risk prediction model was developed using deep unified networks (DUNs), a new mesh-like network structure of deep learning designed to avoid over-fitting. The model was validated with 10-fold cross-validation, and the results were compared to models based on logistic regression, gradient boosting, and maxout networks. Overall model performance was assessed using the concordance statistic. We also selected a discrimination threshold based on maximum projected cost saving to the Partners Healthcare system. Data from 11,510 patients with 27,334 admissions and 6369 30-day readmissions were used to train the model. After data processing, the final model included 3512 variables. The DUNs model had the best performance after 10-fold cross-validation. AUCs for prediction models were 0.664 ± 0.015, 0.650 ± 0.011, 0.695 ± 0.016 and 0.705 ± 0.015 for logistic regression, gradient boosting, maxout networks, and DUNs, respectively.
The DUNs model had an accuracy of 76.4% at the classification threshold that corresponded with maximum cost saving to the hospital. Deep learning techniques performed better than other traditional techniques in developing this EMR-based prediction model for 30-day readmissions in heart failure patients. Such models can be used to identify heart failure patients with impending hospitalization, enabling care teams to target interventions at their most high-risk patients and improving overall clinical outcomes.
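
Choosing a classification threshold by projected cost saving, as this abstract describes, can be sketched as follows. The abstract gives no cost formula, so everything here is an assumption made for illustration: the saving model (prevented readmissions minus intervention spending), the parameter names `readmit_cost`, `intervention_cost`, and `prevent_rate`, and both function names are hypothetical.

```python
def projected_saving(scores, labels, threshold,
                     readmit_cost, intervention_cost, prevent_rate):
    """Net saving if every patient scored at or above the threshold gets the intervention.

    Assumed cost model: each flagged patient costs `intervention_cost`, and a
    fraction `prevent_rate` of flagged true readmissions is avoided, each
    saving `readmit_cost`.
    """
    flagged = [y for s, y in zip(scores, labels) if s >= threshold]
    true_positives = sum(flagged)
    return true_positives * prevent_rate * readmit_cost - len(flagged) * intervention_cost


def best_threshold(scores, labels, **costs):
    """Try each observed score as a threshold and keep the most cost-saving one."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: projected_saving(scores, labels, t, **costs))
```

For example, `best_threshold(scores, labels, readmit_cost=1000, intervention_cost=100, prevent_rate=0.5)` scans the observed risk scores. The chosen cut-off shifts as the assumed costs change, which is why accuracy at that threshold is reported separately from the AUC.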
-
BIDS - J. Zhang, K. Kowsari, J. H. Harrison, J. M. Lobo and L. E. Barnes, "Patient2Vec: A Personalized Interpretable Deep Representation of the Longitudinal Electronic Health Record," in IEEE Access, vol. 6, pp. 65333-65346, 2018, doi: 10.1109/ACCESS.2018.2875677.
The wide implementation of electronic health record (EHR) systems facilitates the collection of large-scale health data from real clinical settings. Despite the significant increase in adoption of EHR systems, these data remain largely unexplored, but present a rich data source for knowledge discovery from patient health histories in tasks, such as understanding disease correlations and predicting health outcomes. However, the heterogeneity, sparsity, noise, and bias in these data present many complex challenges. This complexity makes it difficult to translate potentially relevant information into machine learning algorithms. In this paper, we propose a computational framework, Patient2Vec, to learn an interpretable deep representation of longitudinal EHR data, which is personalized for each patient. To evaluate this approach, we apply it to the prediction of future hospitalizations using real EHR data and compare its predictive performance with baseline methods. Patient2Vec produces a vector space with meaningful structure, and it achieves an area under curve around 0.799, outperforming baseline methods. In the end, the learned feature importance can be visualized and interpreted at both the individual and population levels to bring clinical insights.
-
BIDS - Rajkomar A, Oren E, Chen K, Dai AM, Hajaj N, Hardt M, Liu PJ, Liu X, Marcus J, Sun M, Sundberg P, Yee H, Zhang K, Zhang Y, Flores G, Duggan GE, Irvine J, Le Q, Litsch K, Mossin A, Tansuwan J, Wang, Wexler J, Wilson J, Ludwig D, Volchenboum SL, Chou K, Pearson M, Madabushi S, Shah NH, Butte AJ, Howell MD, Cui C, Corrado GS, Dean J. Scalable and accurate deep learning with electronic health records. NPJ Digit Med. 2018 May 8;1:18. https://www.ncbi.nlm.nih.gov/pubmed/31304302
Predictive modeling with electronic health record (EHR) data is anticipated to drive personalized medicine and improve healthcare quality. Constructing predictive statistical models typically requires extraction of curated predictor variables from normalized EHR data, a labor-intensive process that discards the vast majority of information in each patient's record. We propose a representation of patients' entire raw EHR records based on the Fast Healthcare Interoperability Resources (FHIR) format. We demonstrate that deep learning methods using this representation are capable of accurately predicting multiple medical events from multiple centers without site-specific data harmonization. We validated our approach using de-identified EHR data from two US academic medical centers with 216,221 adult patients hospitalized for at least 24 h. In the sequential format we propose, this volume of EHR data unrolled into a total of 46,864,534,945 data points, including clinical notes. Deep learning models achieved high accuracy for tasks such as predicting: in-hospital mortality (area under the receiver operator curve [AUROC] across sites 0.93-0.94), 30-day unplanned readmission (AUROC 0.75-0.76), prolonged length of stay (AUROC 0.85-0.86), and all of a patient's final discharge diagnoses (frequency-weighted AUROC 0.90). These models outperformed traditional, clinically-used predictive models in all cases. We believe that this approach can be used to create accurate and scalable predictions for a variety of clinical scenarios. In a case study of a particular prediction, we demonstrate that neural networks can be used to identify relevant information from the patient's chart.
-
BIDS - John W. Park, M.D., Minetta C. Liu, M.D., Douglas Yee, M.D., Christina Yau, Ph.D., Laura J. van ’t Veer, Ph.D., W. Fraser Symmans, M.D., Melissa Paoloni, D.V.M., Jane Perlmutter, Ph.D., Nola M. Hylton, Ph.D., Michael Hogarth, M.D., Angela DeMichele, M.D., Meredith B. Buxton, Ph.D., et al., for the I-SPY 2 Investigators. Adaptive Randomization of Neratinib in Early Breast Cancer. July 7, 2016 N Engl J Med 2016; 375:11-22. DOI: 10.1056/NEJMoa1513750. https://www.nejm.org/doi/full/10.1056/NEJMoa1513750.
The heterogeneity of breast cancer makes identifying effective therapies challenging. The I-SPY 2 trial, a multicenter, adaptive phase 2 trial of neoadjuvant therapy for high-risk clinical stage II or III breast cancer, evaluated multiple new agents added to standard chemotherapy to assess the effects on rates of pathological complete response (i.e., absence of residual cancer in the breast or lymph nodes at the time of surgery). We used adaptive randomization to compare standard neoadjuvant chemotherapy plus the tyrosine kinase inhibitor neratinib with control. Neratinib reached the prespecified efficacy threshold with regard to the HER2-positive, hormone-receptor–negative signature. Neratinib added to standard therapy was highly likely to result in higher rates of pathological complete response than standard chemotherapy with trastuzumab among patients with HER2-positive, hormone-receptor–negative breast cancer.
-
BIDS - Poplin, Ryan, et al. Prediction of cardiovascular risk factors from retinal fundus photographs via deep learning. Nature Biomedical Engineering 2.3 (2018): 158. https://www.ncbi.nlm.nih.gov/pubmed/31015713
Using deep-learning models trained on data from 284,335 patients and validated on two independent datasets of 12,026 and 999 patients, we predicted cardiovascular risk factors not previously thought to be present or quantifiable in retinal images, such as age (mean absolute error within 3.26 years), gender (area under the receiver operating characteristic curve (AUC) = 0.97), smoking status (AUC = 0.71), systolic blood pressure (mean absolute error within 11.23 mmHg) and major adverse cardiac events (AUC = 0.70). We also show that the trained deep-learning models used anatomical features, such as the optic disc or blood vessels, to generate each prediction.
-
BIDS - Biases in electronic health record data due to processes within the healthcare system: retrospective observational study. BMJ 2018; 361 doi: https://doi.org/10.1136/bmj.k1479
Objective: To evaluate on a large scale, across 272 common types of laboratory tests, the impact of healthcare processes on the predictive value of electronic health record (EHR) data. Design: Retrospective observational study. Setting: Two large hospitals in Boston, Massachusetts, with inpatient, emergency, and ambulatory care. Participants: All 669 452 patients treated at the two hospitals over one year between 2005 and 2006. Main outcome measures: The relative predictive accuracy of each laboratory test for three year survival, using the time of the day, day of the week, and ordering frequency of the test, compared to the value of the test result. Results: The presence of a laboratory test order, regardless of any other information about the test result, has a significant association (P<0.001) with the odds of survival in 233 of 272 (86%) tests. Data about the timing of when laboratory tests were ordered were more accurate than the test results in predicting survival in 118 of 174 tests (68%). Conclusions: Healthcare processes must be addressed and accounted for in analysis of observational health data. Without careful consideration to context, EHR data are unsuitable for many research questions. However, if explicitly modeled, the same processes that make EHR data complex can be leveraged to gain insight into patients’ state of health.
-
BIDS - Riccardo Miotto, Li Li, Brian A. Kidd & Joel T. Dudley. Deep Patient: An Unsupervised Representation to Predict the Future of Patients from the Electronic Health Records. Scientific Reports. 2016. 6:26094. https://pubmed.ncbi.nlm.nih.gov/27185194/
Secondary use of electronic health records (EHRs) promises to advance clinical research and better inform clinical decision making. Challenges in summarizing and representing patient data prevent widespread practice of predictive modeling using EHRs. Here we present a novel unsupervised deep feature learning method to derive a general-purpose patient representation from EHR data that facilitates clinical predictive modeling. In particular, a three-layer stack of denoising autoencoders was used to capture hierarchical regularities and dependencies in the aggregated EHRs of about 700,000 patients from the Mount Sinai data warehouse. The result is a representation we name "deep patient". We evaluated this representation as broadly predictive of health states by assessing the probability that patients would develop various diseases. We performed evaluation using 76,214 test patients comprising 78 diseases from diverse clinical domains and temporal windows. Our results significantly outperformed those achieved using representations based on raw EHR data and alternative feature learning strategies. Prediction performance for severe diabetes, schizophrenia, and various cancers was among the best. These findings indicate that deep learning applied to EHRs can derive patient representations that offer improved clinical predictions, and could provide a machine learning framework for augmenting clinical decision systems.
-
BIDS - Trang Pham, Truyen Tran, Dinh Phung, Svetha Venkatesh. Predicting healthcare trajectories from medical records: A deep learning approach. Journal of Biomedical Informatics. 2017. 69:218-229 https://pubmed.ncbi.nlm.nih.gov/28410981/
Personalized predictive medicine necessitates the modeling of patient illness and care processes, which inherently have long-term temporal dependencies. Healthcare observations, stored in electronic medical records, are episodic and irregular in time. We introduce DeepCare, an end-to-end deep dynamic neural network that reads medical records, stores previous illness history, infers current illness states and predicts future medical outcomes. At the data level, DeepCare represents care episodes as vectors and models patient health state trajectories by the memory of historical records. Built on Long Short-Term Memory (LSTM), DeepCare introduces methods to handle irregularly timed events by moderating the forgetting and consolidation of memory. DeepCare also explicitly models medical interventions that change the course of illness and shape future medical risk. Moving up to the health state level, historical and present health states are then aggregated through multiscale temporal pooling, before passing through a neural network that estimates future outcomes. We demonstrate the efficacy of DeepCare for disease progression modeling, intervention recommendation, and future risk prediction. On two important cohorts with heavy social and economic burden - diabetes and mental health - the results show improved prediction accuracy.
-
BIDS - Edward Choi, Andy Schuetz, Walter F Stewart, Jimeng Sun. Using recurrent neural network models for early detection of heart failure onset. Journal of the American Medical Informatics Association. 2017. 24:2 361-370. https://pubmed.ncbi.nlm.nih.gov/27521897/
Objective: We explored whether use of deep learning to model temporal relations among events in electronic health records (EHRs) would improve model performance in predicting initial diagnosis of heart failure (HF) compared to conventional methods that ignore temporality. Materials and Methods: Data were from a health system’s EHR on 3884 incident HF cases and 28 903 controls, identified as primary care patients, between May 16, 2000, and May 23, 2013. Recurrent neural network (RNN) models using gated recurrent units (GRUs) were adapted to detect relations among time-stamped events (eg, disease diagnosis, medication orders, procedure orders) with a 12- to 18-month observation window of cases and controls. Model performance metrics were compared to regularized logistic regression, neural network, support vector machine, and K-nearest neighbor classifier approaches. Results: Using a 12-month observation window, the area under the curve (AUC) for the RNN model was 0.777, compared to AUCs for logistic regression (0.747), multilayer perceptron (MLP) with 1 hidden layer (0.765), support vector machine (SVM) (0.743), and K-nearest neighbor (KNN) (0.730). When using an 18-month observation window, the AUC for the RNN model increased to 0.883 and was significantly higher than the 0.834 AUC for the best of the baseline methods (MLP). Conclusion: Deep learning models adapted to leverage temporal relations appear to improve performance of models for detection of incident heart failure with a short observation window of 12–18 months.
-
BIDS - Segar, M.W., Hall, J.L., Jhund, P.S., Powell-Wiley, T.M., Morris, A.A., Kao, D., Fonarow, G.C., Hernandez, R., Ibrahim, N.E., Rutan, C. and Navar, A.M., 2022. Machine learning–based models incorporating social determinants of health vs traditional models for predicting in-hospital mortality in patients with heart failure. JAMA cardiology, 7(8), pp.844-854. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9260645/
Importance: Traditional models for predicting in-hospital mortality for patients with heart failure (HF) have used logistic regression and do not account for social determinants of health (SDOH). Objective: To develop and validate novel machine learning (ML) models for HF mortality that incorporate SDOH. Design, setting, and participants: This retrospective study used the data from the Get With The Guidelines-Heart Failure (GWTG-HF) registry to identify HF hospitalizations between January 1, 2010, and December 31, 2020. The study included patients with acute decompensated HF who were hospitalized at the GWTG-HF participating centers during the study period. Data analysis was performed January 6, 2021, to April 26, 2022. External validation was performed in the hospitalization cohort from the Atherosclerosis Risk in Communities (ARIC) study between 2005 and 2014. Main outcomes and measures: Random forest-based ML approaches were used to develop race-specific and race-agnostic models for predicting in-hospital mortality. Performance was assessed using C index (discrimination), regression slopes for observed vs predicted mortality rates (calibration), and decision curves for prognostic utility. Results: The training data set included 123 634 hospitalized patients with HF who were enrolled in the GWTG-HF registry (mean [SD] age, 71 [13] years; 58 356 [47.2%] female individuals; 65 278 [52.8%] male individuals). Patients were analyzed in 2 categories: Black (23 453 [19.0%]) and non-Black (2121 [2.1%] Asian; 91 154 [91.0%] White, and 6906 [6.9%] other race and ethnicity). The ML models demonstrated excellent performance in the internal testing subset (n = 82 420) (C statistic, 0.81 for Black patients and 0.82 for non-Black patients) and in the real-world-like cohort with less than 50% missingness on covariates (n = 553 506; C statistic, 0.74 for Black patients and 0.75 for non-Black patients).
In the external validation cohort (ARIC registry; n = 1205 Black patients and 2264 non-Black patients), ML models demonstrated high discrimination and adequate calibration (C statistic, 0.79 and 0.80, respectively). Furthermore, the performance of the ML models was superior to the traditional GWTG-HF risk score model (C index, 0.69 for both race groups) and other rederived logistic regression models using race as a covariate. The performance of the ML models was identical using the race-specific and race-agnostic approaches in the GWTG-HF and external validation cohorts. In the GWTG-HF cohort, the addition of zip code-level SDOH parameters to the ML model with clinical covariates only was associated with better discrimination, prognostic utility (assessed using decision curves), and model reclassification metrics in Black patients (net reclassification improvement, 0.22 [95% CI, 0.14-0.30]; P < .001) but not in non-Black patients. Conclusions and relevance: ML models for HF mortality demonstrated superior performance to the traditional and rederived logistic regression models using race as a covariate. The addition of SDOH parameters improved the prognostic utility of prediction models in Black patients but not non-Black patients in the GWTG-HF registry.
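
The net reclassification improvement (NRI) reported in this abstract can be illustrated with a short sketch. The abstract does not state which NRI variant was used, so the code below assumes the category-free form, in which any increase in predicted risk counts as a move up. The function name and arguments are hypothetical.

```python
def net_reclassification_improvement(old_risk, new_risk, events):
    """Category-free NRI (assumed variant) comparing a new model with an old one.

    NRI = [P(up | event) - P(down | event)] - [P(up | non-event) - P(down | non-event)],
    where "up" means the new model assigns a higher risk than the old one.
    """
    def net_up(pairs):
        up = sum(new > old for old, new in pairs)
        down = sum(new < old for old, new in pairs)
        return (up - down) / len(pairs)

    with_event = [(o, n) for o, n, e in zip(old_risk, new_risk, events) if e == 1]
    without_event = [(o, n) for o, n, e in zip(old_risk, new_risk, events) if e == 0]
    return net_up(with_event) - net_up(without_event)
```

A positive value means the new model (for instance, one with SDOH parameters added) tends to raise predicted risk for patients who died and lower it for those who survived. The value ranges from -2 to 2.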