A Continuously Benchmarked and Crowdsourced Challenge for Rapid Development and Evaluation of Models to Predict COVID-19 Diagnosis and Hospitalization

dc.contributor.author Yan, Yao
dc.contributor.author Schaffter, Thomas
dc.contributor.author Bergquist, Timothy
dc.contributor.author Yu, Thomas
dc.contributor.author Prosser, Justin
dc.contributor.author Aydin, Zafer
dc.contributor.author Jabeer, Amhar
dc.contributor.author Brugere, Ivan
dc.contributor.author Gao, Jifan
dc.contributor.author Chen, Guanhua
dc.contributor.author Causey, Jason
dc.contributor.author Yao, Yuxin
dc.contributor.author Bryson, Kevin
dc.contributor.author Long, Dustin R.
dc.contributor.author Jarvik, Jeffrey G.
dc.contributor.author Lee, Christoph I.
dc.contributor.author Wilcox, Adam
dc.contributor.author Guinney, Justin
dc.contributor.author Mooney, Sean
dc.contributor.department AGÜ, Faculty of Engineering, Department of Computer Engineering en_US
dc.contributor.institutionauthor Aydin, Zafer
dc.contributor.institutionauthor Jabeer, Amhar
dc.date.accessioned 2022-03-05T09:19:24Z
dc.date.available 2022-03-05T09:19:24Z
dc.date.issued 2021 en_US
dc.description This study was supported by the Clinical and Translational Science Awards Program National Center for Data to Health funding by the National Center for Advancing Translational Sciences at the National Institutes of Health (grant U24TR002306 [Ms Yan, Drs Schaffter, Bergquist, Guinney, and Mooney and Messrs Yu and Prosser]), the Bill and Melinda Gates Foundation, the Institute for Translational Health Sciences (grant UL1 TR002319 [Drs Bergquist and Mooney and Mr Prosser]), and the National Institutes of Health/National Institute of General Medical Sciences Anesthesiology and Perioperative Medicine Research Training (grant T32 GM086270 [Dr Long]). The CLEAR center was supported by the National Institute of Arthritis and Musculoskeletal and Skin Diseases (grant P30AR072572). en_US
dc.description.abstract IMPORTANCE Machine learning could be used to predict the likelihood of diagnosis and severity of illness. Lack of COVID-19 patient data has hindered the data science community in developing models to aid in the response to the pandemic. OBJECTIVES To describe the rapid development and evaluation of clinical algorithms to predict COVID-19 diagnosis and hospitalization using patient data by citizen scientists, provide an unbiased assessment of model performance, and benchmark model performance on subgroups. DESIGN, SETTING, AND PARTICIPANTS This diagnostic and prognostic study operated a continuous, crowdsourced challenge using a model-to-data approach to securely enable the use of regularly updated COVID-19 patient data from the University of Washington by participants from May 6 to December 23, 2020. A postchallenge analysis was conducted from December 24, 2020, to April 7, 2021, to assess the generalizability of models on the cumulative data set as well as subgroups stratified by age, sex, race, and time of COVID-19 test. By December 23, 2020, this challenge engaged 482 participants from 90 teams and 7 countries. MAIN OUTCOMES AND MEASURES Machine learning algorithms used patient data and output a score that represented the probability of patients receiving a positive COVID-19 test result or being hospitalized within 21 days after receiving a positive COVID-19 test result. Algorithms were evaluated using area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUPRC) scores. Ensemble models aggregating models from the top challenge teams were developed and evaluated. RESULTS In the analysis using the cumulative data set, the best performance for COVID-19 diagnosis prediction was an AUROC of 0.776 (95% CI, 0.775-0.777) and an AUPRC of 0.297, and for hospitalization prediction, an AUROC of 0.796 (95% CI, 0.794-0.798) and an AUPRC of 0.188. Analysis of the top models submitted to the challenge showed consistently better model performance on the female group than the male group. Among all age groups, the best performance was obtained for the 25- to 49-year age group, and the worst performance was obtained for the group aged 17 years or younger. CONCLUSIONS AND RELEVANCE In this diagnostic and prognostic study, models submitted by citizen scientists achieved high performance for the prediction of COVID-19 testing and hospitalization outcomes. Evaluation of challenge models on demographic subgroups and prospective data revealed performance discrepancies, providing insights into the potential bias and limitations in the models. en_US
dc.description.sponsorship Clinical and Translational Science Awards Program National Center for Data to Health - National Center for Advancing Translational Sciences at the National Institutes of Health U24TR002306 Bill & Melinda Gates Foundation Institute for Translational Health Sciences UL1 TR002319 National Institutes of Health/National Institute of General Medical Sciences Anesthesiology and Perioperative Medicine Research Training T32 GM086270 United States Department of Health & Human Services National Institutes of Health (NIH) - USA NIH National Institute of Arthritis & Musculoskeletal & Skin Diseases (NIAMS) P30AR072572 en_US
dc.identifier.issn 2574-3805
dc.identifier.other PubMed ID: 34633425
dc.identifier.uri https://doi.org/10.1001/jamanetworkopen.2021.24946
dc.identifier.uri https://hdl.handle.net/20.500.12573/1243
dc.identifier.volume Volume 4 Issue 10 en_US
dc.language.iso eng en_US
dc.publisher AMER MEDICAL ASSOC, 330 N WABASH AVE, STE 39300, CHICAGO, IL 60611-5885 en_US
dc.relation.isversionof 10.1001/jamanetworkopen.2021.24946 en_US
dc.relation.journal JAMA NETWORK OPEN en_US
dc.relation.publicationcategory Article - International - Editor-Reviewed Journal en_US
dc.rights info:eu-repo/semantics/openAccess en_US
dc.title A Continuously Benchmarked and Crowdsourced Challenge for Rapid Development and Evaluation of Models to Predict COVID-19 Diagnosis and Hospitalization en_US
dc.type article en_US

Files

Original bundle

Name:
A Continuously Benchmarked and Crowdsourced Challenge for Rapid Development and Evaluation of Models to Predict COVID-19 Diagnosis and Hospitalization.pdf
Size:
1.38 MB
Format:
Adobe Portable Document Format
Description:
Article File

License bundle

Name:
license.txt
Size:
1.44 KB
Format:
Item-specific license agreed upon to submission
Description: