Comparative Analysis of Machine Learning Approaches for Predicting Respiratory Virus Infection and Symptom Severity

dc.contributor.author Isik, Yunus Emre
dc.contributor.author Aydin, Zafer
dc.date.accessioned 2025-09-25T10:42:54Z
dc.date.available 2025-09-25T10:42:54Z
dc.date.issued 2023
dc.description Isik, Yunus Emre/0000-0001-6176-7545 en_US
dc.description.abstract Respiratory diseases are among the major health problems placing a burden on hospitals. Diagnosing infection and rapidly predicting severity without time-consuming clinical tests could help prevent the spread and progression of the disease, especially in countries where health system capacity is limited. Personalized medicine studies involving statistics and computer technologies could help to address this need. In addition to individual studies, competitions are also held, such as the Dialogue for Reverse Engineering Assessment and Methods (DREAM) challenges, run by a community-driven organization with a mission to advance research in biology, bioinformatics, and biomedicine. One of these competitions was the Respiratory Viral DREAM Challenge, which aimed to develop early predictive biomarkers for respiratory virus infections. These efforts are promising; however, the prediction performance of the computational methods developed for detecting respiratory diseases still has room for improvement. In this study, we focused on improving the performance of predicting the infection and symptom severity of individuals infected with various respiratory viruses using gene expression data collected before and after exposure. The publicly available Gene Expression Omnibus dataset GSE73072, containing samples exposed to four respiratory viruses (H1N1, H3N2, human rhinovirus (HRV), and respiratory syncytial virus (RSV)), was used as input data. Various preprocessing methods and machine learning algorithms were implemented and compared to achieve the best prediction performance.
The experimental results showed that the proposed approaches achieved an area under the precision-recall curve (AUPRC) of 0.9746 for infection (i.e., shedding) prediction (SC-1), an AUPRC of 0.9182 for symptom class prediction (SC-2), and a Pearson correlation of 0.6733 for symptom score prediction (SC-3), outperforming the best leaderboard scores of the Respiratory Viral DREAM Challenge (a 4.48% improvement for SC-1, a 13.68% improvement for SC-2, and a 13.98% improvement for SC-3). Additionally, over-representation analysis (ORA), a statistical method for objectively determining whether certain genes are more prevalent in pre-defined sets such as pathways, was applied to the most significant genes selected by the feature selection methods. The results show that pathways associated with the 'adaptive immune system' and 'immune disease' are strongly linked to pre-infection and symptom development. These findings contribute to our knowledge about predicting respiratory infections and are expected to facilitate future studies that concentrate on predicting not only infections but also the associated symptoms. en_US
dc.identifier.doi 10.7717/peerj.15552
dc.identifier.issn 2167-8359
dc.identifier.scopus 2-s2.0-85165150953
dc.identifier.uri https://doi.org/10.7717/peerj.15552
dc.identifier.uri https://hdl.handle.net/20.500.12573/3485
dc.language.iso en en_US
dc.publisher PeerJ Inc en_US
dc.relation.ispartof PeerJ en_US
dc.rights info:eu-repo/semantics/openAccess en_US
dc.subject Biology and Genetics en_US
dc.subject Feature Evaluation and Selection en_US
dc.subject Machine Learning en_US
dc.subject Pathway Analysis en_US
dc.subject Respiratory Infection Prediction en_US
dc.title Comparative Analysis of Machine Learning Approaches for Predicting Respiratory Virus Infection and Symptom Severity en_US
dc.type Article en_US
dspace.entity.type Publication
gdc.author.id Isik, Yunus Emre/0000-0001-6176-7545
gdc.author.scopusid 57195215625
gdc.author.scopusid 7003852510
gdc.author.wosid Işik, Yunus/Jep-8357-2023
gdc.bip.impulseclass C5
gdc.bip.influenceclass C5
gdc.bip.popularityclass C4
gdc.coar.access open access
gdc.coar.type text::journal::journal article
gdc.collaboration.industrial false
gdc.description.department Abdullah Gül University en_US
gdc.description.departmenttemp [Isik, Yunus Emre] Sivas Cumhuriyet Univ, Dept Management Informat Syst, Sivas, Turkiye; [Aydin, Zafer] Abdullah Gul Univ, Dept Comp Engn, Kayseri, Turkiye en_US
gdc.description.publicationcategory Article - International Peer-Reviewed Journal - Institutional Faculty Member en_US
gdc.description.scopusquality Q3
gdc.description.startpage e15552
gdc.description.volume 11 en_US
gdc.description.woscitationindex Science Citation Index Expanded
gdc.description.wosquality Q2
gdc.identifier.openalex W4382790092
gdc.identifier.pmid 37404475
gdc.identifier.wos WOS:001024629300004
gdc.index.type WoS
gdc.index.type Scopus
gdc.index.type PubMed
gdc.oaire.accesstype GOLD
gdc.oaire.diamondjournal false
gdc.oaire.downloads 56
gdc.oaire.impulse 3.0
gdc.oaire.influence 2.5269091E-9
gdc.oaire.isgreen true
gdc.oaire.keywords Pathway analysis
gdc.oaire.keywords QH301-705.5
gdc.oaire.keywords Bioinformatics
gdc.oaire.keywords Influenza A Virus, H3N2 Subtype
gdc.oaire.keywords R
gdc.oaire.keywords Feature evaluation and selection
gdc.oaire.keywords Machine Learning
gdc.oaire.keywords Influenza A Virus, H1N1 Subtype
gdc.oaire.keywords Virus Diseases
gdc.oaire.keywords Respiratory infection prediction
gdc.oaire.keywords Respiratory Syncytial Virus, Human
gdc.oaire.keywords Biology and genetics
gdc.oaire.keywords Machine learning
gdc.oaire.keywords Medicine
gdc.oaire.keywords Humans
gdc.oaire.keywords Biology (General)
gdc.oaire.popularity 4.2831476E-9
gdc.oaire.publicfunded false
gdc.oaire.views 106
gdc.openalex.collaboration National
gdc.openalex.fwci 1.76527198
gdc.openalex.normalizedpercentile 0.79
gdc.opencitations.count 2
gdc.plumx.mendeley 21
gdc.plumx.newscount 1
gdc.plumx.pubmedcites 1
gdc.plumx.scopuscites 4
gdc.scopus.citedcount 4
gdc.virtual.author Aydın, Zafer
gdc.wos.citedcount 2
relation.isAuthorOfPublication a26c06af-eae3-407c-a21a-128459fa4d2f
relation.isAuthorOfPublication.latestForDiscovery a26c06af-eae3-407c-a21a-128459fa4d2f
relation.isOrgUnitOfPublication 665d3039-05f8-4a25-9a3c-b9550bffecef
relation.isOrgUnitOfPublication 52f507ab-f278-4a1f-824c-44da2a86bd51
relation.isOrgUnitOfPublication ef13a800-4c99-4124-81e0-3e25b33c0c2b
relation.isOrgUnitOfPublication.latestForDiscovery 665d3039-05f8-4a25-9a3c-b9550bffecef
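
Illustrative note on the over-representation analysis (ORA) mentioned in the abstract: ORA is commonly implemented as an upper-tail hypergeometric test, which asks how likely it is to see at least the observed overlap between a selected gene list and a pathway by chance. The sketch below is a minimal version under that assumption. The function name `ora_p_value` and all gene counts are hypothetical and are not taken from the study, and this record does not state which ORA tool or test the authors actually used.

```python
from math import comb


def ora_p_value(n_universe, n_pathway, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_universe: number of genes in the background set
    n_pathway:  number of background genes annotated to the pathway
    n_selected: number of genes picked by feature selection
    n_overlap:  number of selected genes that fall in the pathway
    """
    max_overlap = min(n_pathway, n_selected)
    total = comb(n_universe, n_selected)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts, chosen only to show the call; not values from the study.
p = ora_p_value(n_universe=20000, n_pathway=100, n_selected=50, n_overlap=5)
print(f"ORA p-value: {p:.3e}")
```

In practice, the p-values from many pathways would typically be corrected for multiple testing before any pathway is reported as over-represented.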
