Comparative analysis of machine learning approaches for predicting respiratory virus infection and symptom severity

dc.contributor.author Aydin, Zafer
dc.contributor.author Isik, Yunus Emre
dc.contributor.authorID 0000-0001-7686-6298 en_US
dc.contributor.department AGÜ, Faculty of Engineering, Department of Computer Engineering en_US
dc.contributor.institutionauthor Aydin, Zafer
dc.date.accessioned 2023-08-21T07:04:38Z
dc.date.available 2023-08-21T07:04:38Z
dc.date.issued 2023 en_US
dc.description.abstract Respiratory diseases are among the major health problems that place a burden on hospitals. Diagnosing infection and rapidly predicting severity without time-consuming clinical tests could help prevent the spread and progression of the disease, especially in countries where health systems have limited capacity. Personalized medicine studies involving statistics and computer technologies could help to address this need. In addition to individual studies, competitions are also held, such as the Dialogue for Reverse Engineering Assessment and Methods (DREAM) challenges, run by a community-driven organization with a mission to advance research in biology, bioinformatics, and biomedicine. One of these competitions was the Respiratory Viral DREAM Challenge, which aimed to develop early predictive biomarkers for respiratory virus infections. These efforts are promising; however, the prediction performance of the computational methods developed for detecting respiratory diseases still has room for improvement. In this study, we focused on improving the prediction of infection and symptom severity in individuals infected with various respiratory viruses, using gene expression data collected before and after exposure. The publicly available gene expression dataset GSE73072 from the Gene Expression Omnibus, which contains samples from individuals exposed to four respiratory viruses (H1N1, H3N2, human rhinovirus (HRV), and respiratory syncytial virus (RSV)), was used as input data. Various preprocessing methods and machine learning algorithms were implemented and compared to achieve the best prediction performance.
The experimental results showed that the proposed approaches achieved an area under the precision-recall curve (AUPRC) of 0.9746 for infection (i.e., shedding) prediction (SC-1), an AUPRC of 0.9182 for symptom class prediction (SC-2), and a Pearson correlation of 0.6733 for symptom score prediction (SC-3), outperforming the best leaderboard scores of the Respiratory Viral DREAM Challenge (a 4.48% improvement for SC-1, a 13.68% improvement for SC-2, and a 13.98% improvement for SC-3). Additionally, over-representation analysis (ORA), a statistical method for objectively determining whether certain genes are more prevalent in pre-defined sets such as pathways, was applied to the most significant genes chosen by the feature selection methods. The results show that pathways associated with the ‘adaptive immune system’ and ‘immune disease’ are strongly linked to pre-infection and symptom development. These findings contribute to our knowledge about predicting respiratory infections and are expected to facilitate future studies that concentrate on predicting not only infections but also the associated symptoms. en_US
dc.identifier.endpage 26 en_US
dc.identifier.issn 2167-8359
dc.identifier.issue e15552 en_US
dc.identifier.other WOS:001024629300004
dc.identifier.startpage 1 en_US
dc.identifier.uri http://dx.doi.org/10.7717/peerj.15552
dc.identifier.uri https://hdl.handle.net/20.500.12573/1750
dc.identifier.volume 11 en_US
dc.language.iso eng en_US
dc.publisher PEERJ INC en_US
dc.relation.isversionof 10.7717/peerj.15552 en_US
dc.relation.journal PEERJ en_US
dc.relation.publicationcategory Article - International Peer-Reviewed Journal - Institutional Faculty Member en_US
dc.rights info:eu-repo/semantics/openAccess en_US
dc.subject Biology and genetics en_US
dc.subject Feature evaluation and selection en_US
dc.subject Machine learning en_US
dc.subject Pathway analysis en_US
dc.subject Respiratory infection prediction en_US
dc.title Comparative analysis of machine learning approaches for predicting respiratory virus infection and symptom severity en_US
dc.type article en_US
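
The abstract describes over-representation analysis (ORA) as a test of whether selected genes are more prevalent in a pre-defined set such as a pathway. A minimal sketch of that idea follows, assuming the common one-sided hypergeometric formulation; the record does not state which test or tool the authors used, and the function name `ora_p_value` and all counts in the usage lines are hypothetical illustrations, not values from the paper.

```python
from math import comb


def ora_p_value(universe_size, pathway_size, selected_size, overlap):
    """Upper-tail hypergeometric p-value, P(X >= overlap).

    universe_size: number of genes measured (background set)
    pathway_size:  number of background genes annotated to the pathway
    selected_size: number of genes picked by feature selection
    overlap:       number of selected genes that fall in the pathway
    """
    max_overlap = min(pathway_size, selected_size)
    # Number of ways to draw the selected genes from the background.
    total = comb(universe_size, selected_size)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(pathway_size, k)
        * comb(universe_size - pathway_size, selected_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only (not taken from the study).
p = ora_p_value(universe_size=12000, pathway_size=150, selected_size=100, overlap=8)
print(f"ORA p-value: {p:.3g}")
```

A small p-value would suggest that the pathway is over-represented among the selected genes; in practice, p-values across many pathways are usually corrected for multiple testing.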

Files

Original bundle

Name: peerj-15552.pdf
Size: 4.2 MB
Format: Adobe Portable Document Format
Description: Article File
