AI4Pandemics Talk #14: Fahad Ahmed, Wayne State University
Title: Actionable outcomes prediction: Mortality prediction in COVID-19 pandemic and beyond
Abstract:
Background
The next wave of the COVID-19 pandemic is anticipated to be worse than the initial one and will strain healthcare systems even more during the winter months. Our aim was to develop a novel machine learning-based model to predict mortality using the deep learning Neo-V framework. We hypothesized that this novel machine learning approach could be applied to COVID-19 patients to predict mortality with high accuracy.
Methods
We prospectively collected clinical and laboratory data on all adult patients (≥18 years of age) who were admitted to the inpatient setting at Aga Khan University Hospital between February 2020 and September 2020 with a clinical diagnosis of COVID-19 infection. Only patients with an RT-PCR (reverse transcription polymerase chain reaction) proven COVID-19 infection and complete medical records were included in this study. A novel 3-phase machine learning framework was developed to predict mortality in the inpatient setting. Phase 1 was variable selection using univariate and multivariate Cox regression analysis; all variables that failed the regression analysis were excluded from the machine learning phase of the study. Phase 2 involved the creation and selection of new variables. Phase 3, the final phase, applied deep neural networks and other traditional machine learning models such as decision tree models, k-nearest neighbor models, etc. The performance of these models was evaluated using test-set accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and area under the receiver operating characteristic curve. A new external validation dataset was collected from Detroit Medical Center in the United States. Validation was done on this dataset using the same metrics (test-set accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and area under the receiver operating characteristic curve).
Results
After application of the inclusion and exclusion criteria, 1214 patients were selected from a total of 1228 admitted patients. We observed that several clinical and laboratory-based variables were statistically significant in both univariate and multivariate analyses while others were not, with the most significant being septic shock (hazard ratio [HR], 4.30; 95% confidence interval [CI], 2.91–6.37), supportive treatment (HR, 3.51; 95% CI, 2.01–6.14), abnormal international normalized ratio (INR) (HR, 3.24; 95% CI, 2.28–4.63), admission to the intensive care unit (ICU) (HR, 3.24; 95% CI, 2.22–4.74), treatment with invasive ventilation (HR, 3.21; 95% CI, 2.15–4.79), and laboratory lymphocytic derangement (HR, 2.79; 95% CI, 1.6–4.86). Machine learning results showed that our deep neural network (DNN) (Neo-V) model outperformed all conventional machine learning models, with test-set accuracy of 99.53%, sensitivity of 89.87%, and specificity of 95.63%; positive predictive value, 50.00%; negative predictive value, 91.05%; and area under the receiver operating characteristic curve of 88.5. The external validation showed test-set accuracy of 64.5%, sensitivity of 84.3%, specificity of 34.6%, positive predictive value of 66.0%, negative predictive value of 59.4%, true positive = 70, true negative = 36, false positive = 19, and false negative = 13.
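For readers unfamiliar with the evaluation metrics named above, the following is a minimal sketch of their standard definitions in terms of confusion-matrix counts. The function name `classification_metrics` is hypothetical and is not part of the Neo-V framework; the input counts are the external validation counts listed above. With those counts, the sensitivity works out to about 84.3%.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration only; not the authors' code.
    """
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,      # correct predictions / all predictions
        "sensitivity": tp / (tp + fn),      # true positive rate
        "specificity": tn / (tn + fp),      # true negative rate
        "ppv": tp / (tp + fp),              # positive predictive value
        "npv": tn / (tn + fn),              # negative predictive value
    }


# External validation counts as listed in the Results section.
external = classification_metrics(tp=70, tn=36, fp=19, fn=13)

for name, value in external.items():
    print(f"{name}: {value:.1%}")
```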
Conclusion
Our novel Deep-Neo-V model outperformed all other machine learning models during development and showed good accuracy in an external validation. The model is easy to implement, user friendly, and highly accurate.
Keywords: COVID-19, Pandemic, Machine Learning, Deep Neural Network, Mortality, SARS-CoV-2
About AI4PAN: Artificial Intelligence for Pandemics Seminar Series centred at UQ
Welcome to AI4PAN, the Artificial Intelligence for Pandemics group centred at The University of Queensland (UQ). The group's focus is the application of data science, machine learning, statistical learning, applied mathematics, computation, and other "artificial intelligence" techniques for managing pandemics at both the epidemic and clinical levels.