Comparative Analysis of Machine Learning Approaches for Predicting Respiratory Virus Infection and Symptom Severity

Date

2023

Publisher

PeerJ Inc

Open Access Color

GOLD

Green Open Access

Yes

OpenAIRE Downloads

56

OpenAIRE Views

106

Publicly Funded

No

Impulse

Average

Influence

Average

Popularity

Top 10%

Abstract

Respiratory diseases are among the major health problems that place a burden on hospitals. Diagnosing infection and rapidly predicting severity without time-consuming clinical tests could help prevent the spread and progression of disease, especially in countries where health systems have limited capacity. Personalized medicine studies that combine statistics and computer technologies could help address this need. Beyond individual studies, community competitions are also held, such as the Dialogue for Reverse Engineering Assessment and Methods (DREAM) challenges, run by a community-driven organization whose mission is to advance research in biology, bioinformatics, and biomedicine. One of these competitions was the Respiratory Viral DREAM Challenge, which aimed to develop early predictive biomarkers for respiratory virus infections. These efforts are promising; however, the prediction performance of the computational methods developed for detecting respiratory diseases still has room for improvement. In this study, we focused on improving the prediction of infection and symptom severity in individuals exposed to various respiratory viruses, using gene expression data collected before and after exposure. The publicly available Gene Expression Omnibus dataset GSE73072, which contains samples from subjects exposed to four respiratory viruses (H1N1, H3N2, human rhinovirus (HRV), and respiratory syncytial virus (RSV)), was used as input data. Various preprocessing methods and machine learning algorithms were implemented and compared to achieve the best prediction performance.
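As a rough illustration of what comparing preprocessing choices can look like, the sketch below scores two preprocessing options with leave-one-out evaluation. It is not the pipeline used in the study: the nearest-centroid classifier, the z-score step, the accuracy score, the toy data, and all function names are assumptions chosen to keep the example dependency-free.

```python
import math
from statistics import mean, pstdev


def identity(train_rows, test_row):
    """Baseline preprocessing: leave expression values untouched."""
    return train_rows, test_row


def zscore(train_rows, test_row):
    """Standardise each gene using statistics from the training rows only."""
    columns = list(zip(*train_rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) or 1.0 for col in columns]  # guard against constant genes

    def scale(row):
        return [(v - m) / s for v, m, s in zip(row, means, stds)]

    return [scale(row) for row in train_rows], scale(test_row)


def nearest_centroid(train_rows, labels, test_row):
    """Predict the label whose class centroid is closest to the test row."""
    centroids = {}
    for label in sorted(set(labels)):
        members = [row for row, lab in zip(train_rows, labels) if lab == label]
        centroids[label] = [mean(col) for col in zip(*members)]
    return min(centroids, key=lambda label: math.dist(centroids[label], test_row))


def leave_one_out_accuracy(rows, labels, preprocess):
    """Hold out each sample once, fit on the rest, and count correct predictions."""
    correct = 0
    for i, (held_out, truth) in enumerate(zip(rows, labels)):
        train_rows = rows[:i] + rows[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        train_scaled, test_scaled = preprocess(train_rows, held_out)
        correct += nearest_centroid(train_scaled, train_labels, test_scaled) == truth
    return correct / len(rows)


# Hypothetical toy data: two "genes" per sample, label 1 = infected.
toy_rows = [[0.10, 500.0], [0.20, 100.0], [0.15, 900.0],
            [0.90, 450.0], [1.00, 150.0], [0.95, 850.0]]
toy_labels = [0, 0, 0, 1, 1, 1]

for name, step in [("raw", identity), ("z-score", zscore)]:
    print(name, leave_one_out_accuracy(toy_rows, toy_labels, step))
```

Fitting the scaling statistics on the training rows only, inside each fold, keeps information from the held-out sample out of the model. The same loop could rank other preprocessing steps or classifiers by swapping the function arguments.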
The experimental results showed that the proposed approaches achieved an area under the precision-recall curve (AUPRC) of 0.9746 for infection (i.e., shedding) prediction (SC-1), an AUPRC of 0.9182 for symptom class prediction (SC-2), and a Pearson correlation of 0.6733 for symptom score prediction (SC-3), outperforming the best leaderboard scores of the Respiratory Viral DREAM Challenge (a 4.48% improvement for SC-1, a 13.68% improvement for SC-2, and a 13.98% improvement for SC-3). Additionally, over-representation analysis (ORA), a statistical method for objectively determining whether certain genes are more prevalent in pre-defined sets such as pathways, was applied to the most significant genes selected by the feature selection methods. The results show that pathways associated with the 'adaptive immune system' and 'immune disease' are strongly linked to pre-infection and symptom development. These findings contribute to our knowledge about predicting respiratory infections and are expected to facilitate future studies that concentrate on predicting not only infections but also the associated symptoms.
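Over-representation analysis is commonly carried out as a one-sided hypergeometric test, and the sketch below shows that calculation. Whether the study used exactly this test is an assumption; the function names, gene identifiers, and the pathway named "immune_toy" are hypothetical and serve only as illustration.

```python
from math import comb


def ora_p_value(n_universe, n_pathway, n_selected, n_overlap):
    """One-sided hypergeometric tail: P(overlap >= n_overlap) under random selection."""
    total = comb(n_universe, n_selected)
    upper = min(n_pathway, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def over_representation(selected_genes, pathways, universe):
    """Return {pathway name: p-value}, restricted to genes in the measured universe."""
    universe = set(universe)
    selected = set(selected_genes) & universe
    results = {}
    for name, genes in pathways.items():
        members = set(genes) & universe
        overlap = len(selected & members)
        results[name] = ora_p_value(len(universe), len(members), len(selected), overlap)
    return results


# Hypothetical toy example: 20 measured genes, one 5-gene pathway, 4 selected genes.
universe = [f"g{i}" for i in range(20)]
pathways = {"immune_toy": ["g0", "g1", "g2", "g3", "g4"]}
selected = ["g0", "g1", "g2", "g10"]

print(over_representation(selected, pathways, universe))
```

When many pathways are tested, the resulting p-values are usually adjusted for multiple testing; that step is omitted here for brevity.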

Description

Isik, Yunus Emre (ORCID: 0000-0001-6176-7545)

Keywords

Biology and Genetics; Feature Evaluation and Selection; Machine Learning; Pathway Analysis; Respiratory Infection Prediction; QH301-705.5; Bioinformatics; Influenza A Virus, H3N2 Subtype; R; Influenza A Virus, H1N1 Subtype; Virus Diseases; Respiratory Syncytial Virus, Human; Medicine; Humans; Biology (General)

WoS Q

Q2

Scopus Q

Q3
OpenCitations Citation Count
2

Source

PeerJ

Volume

11

Start Page

e15552

PlumX Metrics
Citations

Scopus : 4

PubMed : 1

Captures

Mendeley Readers : 21

SCOPUS™ Citations

4

checked on Feb 03, 2026

Web of Science™ Citations

2

checked on Feb 03, 2026

Page Views

2

checked on Feb 03, 2026

OpenAlex FWCI
1.76527198

Sustainable Development Goals

3

GOOD HEALTH AND WELL-BEING