A Continuously Benchmarked and Crowdsourced Challenge for Rapid Development and Evaluation of Models to Predict COVID-19 Diagnosis and Hospitalization

dc.contributor.author Yan, Yao
dc.contributor.author Schaffter, Thomas
dc.contributor.author Bergquist, Timothy
dc.contributor.author Yu, Thomas
dc.contributor.author Prosser, Justin
dc.contributor.author Aydin, Zafer
dc.contributor.author Mooney, Sean
dc.date.accessioned 2025-09-25T10:38:24Z
dc.date.available 2025-09-25T10:38:24Z
dc.date.issued 2021
dc.description Bryson, Kevin/0000-0002-1163-6368; Jujjavarapu, Chethan/0000-0003-1604-2312; Gunn, Martin/0000-0001-9879-8660; Yu, Thomas/0000-0002-5841-0198; Yao, Yuxin/0000-0002-7356-5542; Schaffter, Thomas/0000-0002-8242-9462; O'Reilly-Shah, Vikas/0000-0003-0741-0291; Guinney, Justin H/0000-0003-1477-1888; Causey, Jason/0000-0002-3985-2919 en_US
dc.description.abstract IMPORTANCE Machine learning could be used to predict the likelihood of diagnosis and severity of illness. Lack of COVID-19 patient data has hindered the data science community in developing models to aid in the response to the pandemic. OBJECTIVES To describe the rapid development and evaluation of clinical algorithms to predict COVID-19 diagnosis and hospitalization using patient data by citizen scientists, provide an unbiased assessment of model performance, and benchmark model performance on subgroups. DESIGN, SETTING, AND PARTICIPANTS This diagnostic and prognostic study operated a continuous, crowdsourced challenge using a model-to-data approach to securely enable the use of regularly updated COVID-19 patient data from the University of Washington by participants from May 6 to December 23, 2020. A postchallenge analysis was conducted from December 24, 2020, to April 7, 2021, to assess the generalizability of models on the cumulative data set as well as subgroups stratified by age, sex, race, and time of COVID-19 test. By December 23, 2020, this challenge engaged 482 participants from 90 teams and 7 countries. MAIN OUTCOMES AND MEASURES Machine learning algorithms used patient data and output a score that represented the probability of patients receiving a positive COVID-19 test result or being hospitalized within 21 days after receiving a positive COVID-19 test result. Algorithms were evaluated using area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUPRC) scores. Ensemble models aggregating models from the top challenge teams were developed and evaluated. RESULTS In the analysis using the cumulative data set, the best performance for COVID-19 diagnosis prediction was an AUROC of 0.776 (95% CI, 0.775-0.777) and an AUPRC of 0.297, and for hospitalization prediction, an AUROC of 0.796 (95% CI, 0.794-0.798) and an AUPRC of 0.188. Analysis of top models submitted to the challenge showed consistently better model performance on the female group than the male group. Among all age groups, the best performance was obtained for the 25- to 49-year age group, and the worst performance was obtained for the group aged 17 years or younger. CONCLUSIONS AND RELEVANCE In this diagnostic and prognostic study, models submitted by citizen scientists achieved high performance for the prediction of COVID-19 testing and hospitalization outcomes. Evaluation of challenge models on demographic subgroups and prospective data revealed performance discrepancies, providing insights into the potential bias and limitations in the models. en_US
dc.description.sponsorship Clinical and Translational Science Awards Program National Center for Data to Health - National Center for Advancing Translational Sciences at the National Institutes of Health [U24TR002306]; Bill and Melinda Gates Foundation; Institute for Translational Health Sciences [UL1 TR002319]; National Institutes of Health/National Institute of General Medical Sciences Anesthesiology and Perioperative Medicine Research Training [T32 GM086270]; National Institute of Arthritis and Musculoskeletal and Skin Diseases [P30AR072572] Funding Source: NIH RePORTER en_US
dc.description.sponsorship This study was supported by the Clinical and Translational Science Awards Program National Center for Data to Health, funded by the National Center for Advancing Translational Sciences at the National Institutes of Health (grant U24TR002306 [Ms Yan, Drs Schaffter, Bergquist, Guinney, and Mooney and Messrs Yu and Prosser]), the Bill and Melinda Gates Foundation, the Institute for Translational Health Sciences (grant UL1 TR002319 [Drs Bergquist and Mooney and Mr Prosser]), and National Institutes of Health/National Institute of General Medical Sciences Anesthesiology and Perioperative Medicine Research Training (grant T32 GM086270 [Dr Long]). The CLEAR center was supported by the National Institute of Arthritis and Musculoskeletal and Skin Diseases (grant P30AR072572). en_US
dc.identifier.doi 10.1001/jamanetworkopen.2021.24946
dc.identifier.issn 2574-3805
dc.identifier.scopus 2-s2.0-85116899723
dc.identifier.uri https://doi.org/10.1001/jamanetworkopen.2021.24946
dc.identifier.uri https://hdl.handle.net/20.500.12573/3048
dc.language.iso en en_US
dc.publisher Amer Medical Assoc en_US
dc.relation.ispartof JAMA Network Open en_US
dc.rights info:eu-repo/semantics/openAccess en_US
dc.title A Continuously Benchmarked and Crowdsourced Challenge for Rapid Development and Evaluation of Models to Predict COVID-19 Diagnosis and Hospitalization en_US
dc.type Article en_US
dspace.entity.type Publication
gdc.author.id Bryson, Kevin/0000-0002-1163-6368
gdc.author.id Jujjavarapu, Chethan/0000-0003-1604-2312
gdc.author.id Gunn, Martin/0000-0001-9879-8660
gdc.author.id Yu, Thomas/0000-0002-5841-0198
gdc.author.id Yao, Yuxin/0000-0002-7356-5542
gdc.author.id Schaffter, Thomas/0000-0002-8242-9462
gdc.author.id Causey, Jason/0000-0002-3985-2919
gdc.author.scopusid 57219263449
gdc.author.scopusid 26027221900
gdc.author.scopusid 57194517780
gdc.author.scopusid 57192554101
gdc.author.scopusid 57219263140
gdc.author.scopusid 7003852510
gdc.author.scopusid 57225992956
gdc.author.wosid Yao, Yuxin/Jfb-3316-2023
gdc.author.wosid Bryson, Kevin/Aaz-8177-2020
gdc.author.wosid O'Reilly-Shah, Vikas/Ago-0420-2022
gdc.author.wosid Jarvik, Jeffrey/Aal-4292-2021
gdc.bip.impulseclass C4
gdc.bip.influenceclass C5
gdc.bip.popularityclass C4
gdc.coar.access open access
gdc.coar.type text::journal::journal article
gdc.collaboration.industrial true
gdc.description.department Abdullah Gül University en_US
gdc.description.departmenttemp [Yan, Yao; Schaffter, Thomas; Bergquist, Timothy; Yu, Thomas; Guinney, Justin] Sage Bionetworks, Washington, DC USA; [Yan, Yao] Univ Washington, Mol Engn & Sci Inst, Seattle, WA 98109 USA; [Bergquist, Timothy; Wilcox, Adam; Mooney, Sean] Univ Washington, Dept Biomed Informat & Med Educ, 850 Republican St, Seattle, WA 98109 USA; [Prosser, Justin] Univ Washington, Inst Translat Hlth Sci, Seattle, WA 98109 USA; [Aydin, Zafer; Jabeer, Amhar] Abdullah Gul Univ, Fac Engn, Dept Comp Engn, Kayseri, Turkey; [Brugere, Ivan] Univ Illinois, Dept Comp Sci, Chicago, IL USA; [Gao, Jifan; Chen, Guanhua] Univ Wisconsin, Dept Biostat & Med Informat, Madison, WI USA; [Causey, Jason] Arkansas State Univ, Comp Sci Dept, Coll Engn & Comp Sci, Jonesboro, AR USA; [Causey, Jason] Arkansas State Univ, Ctr Boundary Thinking, Arkansas AI Campus, Jonesboro, AR USA; [Yao, Yuxin; Bryson, Kevin] UCL, Dept Comp Sci, London, England; [Long, Dustin R.] Univ Washington, Dept Anesthesiol & Pain Med, Div Crit Care Med, Seattle, WA 98109 USA; [Jarvik, Jeffrey G.] Univ Washington Clin Learning, Evidence & Res Ctr Musculoskeletal Disorders, Seattle, WA USA; [Jarvik, Jeffrey G.; Lee, Christoph, I] Univ Washington, Sch Med, Dept Radiol, Seattle, WA 98109 USA en_US
gdc.description.issue 10 en_US
gdc.description.publicationcategory Article - International Peer-Reviewed Journal - Institutional Faculty Member en_US
gdc.description.scopusquality Q1
gdc.description.startpage e2124946
gdc.description.volume 4 en_US
gdc.description.woscitationindex Science Citation Index Expanded
gdc.description.wosquality Q1
gdc.identifier.openalex W3206603959
gdc.identifier.pmid 34633425
gdc.identifier.wos WOS:000707431100004
gdc.index.type WoS
gdc.index.type Scopus
gdc.index.type PubMed
gdc.oaire.accesstype GOLD
gdc.oaire.diamondjournal false
gdc.oaire.downloads 86
gdc.oaire.impulse 9.0
gdc.oaire.influence 2.9472431E-9
gdc.oaire.isgreen true
gdc.oaire.keywords Adult
gdc.oaire.keywords Aged, 80 and over
gdc.oaire.keywords Male
gdc.oaire.keywords Adolescent
gdc.oaire.keywords Infant, Newborn
gdc.oaire.keywords COVID-19
gdc.oaire.keywords Infant
gdc.oaire.keywords Hospitalization
gdc.oaire.keywords Machine Learning
gdc.oaire.keywords Benchmarking
gdc.oaire.keywords COVID-19 Testing
gdc.oaire.keywords Area Under Curve
gdc.oaire.keywords Child, Preschool
gdc.oaire.keywords Clinical Decision Rules
gdc.oaire.keywords Crowdsourcing
gdc.oaire.keywords Humans
gdc.oaire.keywords Female
gdc.oaire.keywords Child
gdc.oaire.keywords Algorithms
gdc.oaire.keywords Original Investigation
gdc.oaire.keywords Aged
gdc.oaire.popularity 1.0988632E-8
gdc.oaire.publicfunded false
gdc.oaire.sciencefields 0301 basic medicine
gdc.oaire.sciencefields 03 medical and health sciences
gdc.oaire.views 140
gdc.openalex.collaboration International
gdc.openalex.fwci 2.78392516
gdc.openalex.normalizedpercentile 0.89
gdc.openalex.toppercent TOP 10%
gdc.opencitations.count 13
gdc.plumx.crossrefcites 8
gdc.plumx.mendeley 69
gdc.plumx.pubmedcites 10
gdc.plumx.scopuscites 13
gdc.scopus.citedcount 13
gdc.virtual.author Aydın, Zafer
gdc.wos.citedcount 13
relation.isAuthorOfPublication a26c06af-eae3-407c-a21a-128459fa4d2f
relation.isAuthorOfPublication.latestForDiscovery a26c06af-eae3-407c-a21a-128459fa4d2f
relation.isOrgUnitOfPublication 665d3039-05f8-4a25-9a3c-b9550bffecef
relation.isOrgUnitOfPublication 52f507ab-f278-4a1f-824c-44da2a86bd51
relation.isOrgUnitOfPublication ef13a800-4c99-4124-81e0-3e25b33c0c2b
relation.isOrgUnitOfPublication.latestForDiscovery 665d3039-05f8-4a25-9a3c-b9550bffecef
