PubMed-Indexed Publications Collection
Permanent URI for this collection: https://hdl.handle.net/20.500.12573/397
3 results
Search Results
Article
Topological Feature Generation for Link Prediction in Biological Networks
(PeerJ Inc, 2023-05-09) Temiz, Mustafa; Bakir-Gungor, Burcu; Sahan, Pinar Guner; Coskun, Mustafa; Güner Şahan, Pınar
Graph or network embedding is a powerful method for extracting missing or potential information from interactions between nodes in biological networks. Graph embedding methods learn representations of nodes and interactions in a graph with low-dimensional vectors, which facilitates research to predict potential interactions in networks. However, most graph embedding methods suffer from high computational costs in the form of high computational complexity of the embedding methods and learning times of the classifier, as well as the high dimensionality of complex biological networks. To address these challenges, in this study, we use the Chopper algorithm as an alternative approach to graph embedding, which accelerates the iterative processes and thus reduces the running time of the iterative algorithms for three different (nervous system, blood, heart) undirected protein-protein interaction (PPI) networks. Due to the high dimensionality of the matrix obtained after the embedding process, the data are transformed into a smaller representation by applying feature regularization techniques. We evaluated the performance of the proposed method by comparing it with state-of-the-art methods. Extensive experiments demonstrate that the proposed approach reduces the learning time of the classifier and performs better in link prediction. We have also shown that the proposed embedding method is faster than state-of-the-art methods on three different PPI datasets.

Article
Citation - WoS: 4; Citation - Scopus: 5
Comparative Analysis of Machine Learning Approaches for Predicting Respiratory Virus Infection and Symptom Severity
(PeerJ Inc, 2023-06-30) Isik, Yunus Emre; Aydin, Zafer
Respiratory diseases are among the major health problems causing a burden on hospitals.
Diagnosing infection and rapidly predicting severity without time-consuming clinical tests could help prevent the spread and progression of the disease, especially in countries where health systems lack capacity. Personalized medicine studies involving statistics and computer technologies could help to address this need. In addition to individual studies, competitions are also held, such as the Dialogue for Reverse Engineering Assessment and Methods (DREAM) challenges, run by a community-driven organization with a mission to advance research in biology, bioinformatics, and biomedicine. One of these competitions was the Respiratory Viral DREAM Challenge, which aimed to develop early predictive biomarkers for respiratory virus infections. These efforts are promising; however, the prediction performance of the computational methods developed for detecting respiratory diseases still has room for improvement. In this study, we focused on improving the performance of predicting the infection and symptom severity of individuals infected with various respiratory viruses using gene expression data collected before and after exposure. The publicly available gene expression dataset in the Gene Expression Omnibus, named GSE73072, containing samples exposed to four respiratory viruses (H1N1, H3N2, human rhinovirus (HRV), and respiratory syncytial virus (RSV)), was used as input data. Various preprocessing methods and machine learning algorithms were implemented and compared to achieve the best prediction performance.
The experimental results showed that the proposed approaches obtained a prediction performance of 0.9746 area under the precision-recall curve (AUPRC) for infection (i.e., shedding) prediction (SC-1), 0.9182 AUPRC for symptom class prediction (SC-2), and 0.6733 Pearson correlation for symptom score prediction (SC-3), outperforming the best leaderboard scores of the Respiratory Viral DREAM Challenge (a 4.48% improvement for SC-1, a 13.68% improvement for SC-2, and a 13.98% improvement for SC-3). Additionally, over-representation analysis (ORA), which is a statistical method for objectively determining whether certain genes are more prevalent in pre-defined sets such as pathways, was applied using the most significant genes selected by feature selection methods. The results show that pathways associated with the 'adaptive immune system' and 'immune disease' are strongly linked to pre-infection and symptom development. These findings contribute to our knowledge about predicting respiratory infections and are expected to facilitate the development of future studies that concentrate on predicting not only infections but also the associated symptoms.

Article
Citation - WoS: 4; Citation - Scopus: 5
CSA-DE-LR: Enhancing Cardiovascular Disease Diagnosis With a Novel Hybrid Machine Learning Approach
(PeerJ Inc, 2024-07-18) Dedeturk, Beyhan Adanur; Dedeturk, Bilge Kagan; Bakir-Gungor, Burcu
Cardiovascular diseases (CVD) are a leading cause of mortality globally, necessitating the development of efficient diagnostic tools. Machine learning (ML) and metaheuristic algorithms have become prevalent in addressing these challenges, providing promising solutions in medical diagnostics. However, traditional ML approaches often fall short in feature selection and optimization, leading to suboptimal performance in complex diagnostic tasks.
To overcome these limitations, this study introduces a new hybrid method called CSA-DE-LR, which combines the clonal selection algorithm (CSA) and differential evolution (DE) with logistic regression. This integration is designed to optimize logistic regression weights efficiently for the accurate classification of CVD. The methodology employs three optimization strategies based on the F1 score, the Matthews correlation coefficient (MCC), and the mean absolute error (MAE). Extensive evaluations on benchmark datasets, namely Cleveland and Statlog, reveal that CSA-DE-LR outperforms state-of-the-art ML methods. In addition, generalization is evaluated using the Breast Cancer Wisconsin Original (WBCO) and Breast Cancer Wisconsin Diagnostic (WBCD) datasets. Notably, the proposed model demonstrates superior efficacy compared to previous research studies in this domain. This study's findings highlight the potential of hybrid machine learning approaches for improving diagnostic accuracy, offering a significant advancement in the fields of medical data analysis and CVD diagnosis.
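
As a rough illustration of the idea in the CSA-DE-LR abstract above, the following sketch uses plain differential evolution (DE/rand/1/bin) to search for logistic regression weights that maximize MCC. It is not the authors' method: the clonal selection step is omitted, and the toy data, the function names (`de_fit`, `make_toy_data`), and all hyperparameters are assumptions made only for this example.

```python
import math
import random


def predict(weights, x):
    """Logistic regression: the last weight is the bias; returns a 0/1 label."""
    z = weights[-1] + sum(w * xi for w, xi in zip(weights, x))
    z = max(-60.0, min(60.0, z))  # clamp to keep math.exp from overflowing
    return 1 if 1.0 / (1.0 + math.exp(-z)) >= 0.5 else 0


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; 0.0 when a marginal count is zero."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def fitness(weights, X, y):
    """Fitness of a weight vector: MCC of its predictions on (X, y)."""
    return mcc(y, [predict(weights, x) for x in X])


def de_fit(X, y, pop_size=20, generations=60, f=0.8, cr=0.9, seed=0):
    """DE/rand/1/bin search over logistic regression weights, maximizing MCC."""
    rng = random.Random(seed)
    dim = len(X[0]) + 1  # one weight per feature plus a bias
    pop = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(pop_size)]
    scores = [fitness(ind, X, y) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = [
                pop[a][k] + f * (pop[b][k] - pop[c][k])
                if (rng.random() < cr or k == j_rand) else pop[i][k]
                for k in range(dim)
            ]
            trial_score = fitness(trial, X, y)
            if trial_score >= scores[i]:  # greedy one-to-one replacement
                pop[i], scores[i] = trial, trial_score
    best = max(range(pop_size), key=lambda i: scores[i])
    return pop[best], scores[best]


def make_toy_data(n=40, seed=1):
    """Hypothetical two-feature data: class 1 is shifted by +2 in both features."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(2.0 * label, 0.5), rng.gauss(2.0 * label, 0.5)])
        y.append(label)
    return X, y


X, y = make_toy_data()
weights, score = de_fit(X, y)
print(f"best MCC on toy data: {score:.3f}")
```

Swapping `mcc` for an F1 or MAE-based function inside `fitness` would give the other two optimization strategies the abstract mentions. MAE would have to be negated, since this loop maximizes.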
