Scopus Indexed Publications Collection
Permanent URI for this collection: https://hdl.handle.net/20.500.12573/395
Search Results
Article
G-S a Prior Biological Knowledge-Based Pattern Detection and Enrichment Framework for Multi-Omics Data Integration
(MDPI, 2025-11-29) Unlu Yazici, Miray; Bakir-Gungor, Burcu; Yousef, Malik
The rapid advancements in high-throughput technologies have led to a dramatic increase in diverse -omics data types, enabling comprehensive analyses, especially for complex diseases like cancer. Despite the development of multi-omics approaches, the challenges of scaling integration to massive, heterogeneous -omics datasets suggest that novel computational tools need to be designed. In this study, we propose an approach for integrating microRNA (miRNA) and messenger RNA (mRNA) expression data, incorporating prior biological knowledge (PBK). This approach scores and ranks groups of miRNAs and their associated genes using cross-validation iterations. The proposed method incorporates a Pattern detection (P) component to identify molecular motifs unique to each biological group. The analysis also supports visualization of the groups, facilitating the identification of co-occurring groups and their characteristic features across iterations. Furthermore, the groups are scored using an over-representation analysis through a new Enrichment (E) component in each iteration. The clusters of the groups based on the Enrichment Scores (ESs) are visualized in a heatmap to obtain novel insights into the collective behavior and dependencies of the groups, aiming to understand the molecular mechanisms of complex diseases. The developed G-S-M-E tool not only provides performance metrics and biological scores at the group level but also offers comprehensive insights into intricate multi-omics interactions.
In summary, our study emphasizes the importance of mathematical and data science methodologies in elucidating intricate multi-omics integration, yielding a formalized approach that deepens our comprehension of complex diseases.

Conference Object
Enhancing Complex Disease Group Scoring with Mirgedinet: A Multi-Algorithm Machine Learning Framework Based on the GSM Approach
(IEEE, 2025-06-25) Qumsiyeh, Emma; Bakir-Gungor, Burcu; Yousef, Malik
Integrating biological prior knowledge for disease gene associations has shown significant promise in discovering new biomarkers with potential translational applications. This work investigates the application of a multi-algorithm machine learning framework based on the Grouping-Scoring-Modeling (G-S-M) approach for improving the prediction of complex diseases. The study identifies the primary gene and miRNA interactions in various complex diseases with the help of miRGediNET, which is a machine-learning-based tool that integrates data from three biological databases. Traditional methods have only focused on independence between features; the G-S-M method focuses on aggregating genes based on biological interactions, pinpointing the scoring of gene groups for a disease, and modeling its predictive capability using advanced machine learning algorithms. In this research paper, seven algorithms, including Support Vector Machine, Decision Tree, and CatBoost, were applied to eight datasets extracted from the GEO database. This framework proved very robust in ranking gene clusters, and thus in predicting critical biomarkers, under 100-fold randomized cross-validation during evaluation.
The results indicate this approach's high potential for refining disease prediction and supporting research for choosing the best algorithm that can provide biological insights and computational advances.

Conference Object
Exploring Microbiome Signatures in Autism Spectrum Disorder via Grouping-Scoring Based Machine Learning
(IEEE, 2025-06-25) Temiz, Mustafa; Ersoz, Nur Sebnem; Yousef, Malik; Bakir-Gungor, Burcu
The rapid increase in omic data production has increased the importance of machine learning (ML) methods to analyze these data. In particular, the use of metagenomic data in the diagnosis, prognosis and treatment of diseases is becoming widespread. Autism Spectrum Disorder (ASD) is a neurodevelopmental disease that occurs in early childhood and continues lifelong. The aim of this study is to increase ML performance, reduce computational costs and achieve successful classification performance using a small number of metagenomic features. In addition, disease prediction is performed; ASD-associated biomarkers are determined using microBiomeGSM on metagenomic data. Classification is performed at three different taxonomic levels (genus, family and order) using the relative abundance values of species. The best performance metric (0.95 AUC) was obtained at the order taxonomic level using an average of 416 features with microBiomeGSM. The identified ASD-related taxonomic species are presented.

Conference Object
TextNetTopics+: Enhancing Text Classification Through Classifier Diversity and Model Ensembling
(Springer International Publishing AG, 2025) Voskergian, Daniel; Bakir-Gungor, Burcu; Yousef, Malik
TextNetTopics is an innovative text classification framework that integrates topic modeling with feature selection to improve model accuracy and interpretability.
Unlike traditional methods that rely on individual words, TextNetTopics selects cohesive topics extracted via Latent Dirichlet Allocation as features for document representation, effectively reducing dimensionality while preserving the semantic structure of the text. This study evaluates the performance of TextNetTopics utilizing multiple machine learning algorithms in the M (Modeling) component, including Random Forest, Support Vector Machine, Gradient Boosting, eXtreme Gradient Boosting, and Logistic Regression. To further enhance classification performance, we introduce TextNetTopics+, an ensemble-based extension that leverages both hard voting and soft voting mechanisms to combine the strengths of multiple classifiers. Comprehensive experiments on the LitCovid and WOS datasets demonstrate that ensemble learning in TextNetTopics+ significantly outperforms individual classifiers in TextNetTopics, confirming its effectiveness in improving model robustness and generalization.

Conference Object
Leveraging MicroRNA-Gene Associations With Mirgedinet: An Intelligent Approach for Enhanced Classification of Breast Cancer Molecular Subtypes
(Springer International Publishing AG, 2025) Qumsiyeh, Emma; Bakir-Gungor, Burcu; Yousef, Malik
Understanding the molecular subtypes of breast cancer is crucial for advancing targeted therapies and precision medicine. For the BRCA molecular subtype prediction problem, this study employs miRGediNET, a machine-learning approach that integrates data from miRTarBase, DisGeNET, and HMDD databases to investigate shared gene associations between microRNA (miRNA) activity and disease mechanisms. Using the BRCA LumAB_Her2Basal dataset, we evaluate miRGediNET's performance against traditional feature selection methods, including CMIM, mRmR, Information Gain (IG), SelectKBest (SKB), Fast Correlation-Based Filter (FCBF), and XGBoost (XGB).
These feature selection techniques were assessed using various classification algorithms including Random Forest (RF), Support Vector Machine (SVM), LogitBoost, Decision Tree, and AdaBoost, all executed with default parameters. The feature selection methods were tested using Monte Carlo Cross-Validation, where performance metrics obtained for each iteration were averaged to ensure robustness. Our findings reveal that miRGediNET outperforms traditional methods in accuracy and Area Under the Curve (AUC), emphasizing its superior capability to identify key genes that bridge miRNA interactions and breast cancer mechanisms. Notably, both miRGediNET and Information Gain (IG) feature selection consistently identified ESR1, a critical biomarker frequently reported in recent research associated with breast cancer prognosis and resistance to endocrine therapies. This integrative approach provides deeper biological insights into miRNA-disease interactions, paving the way for enhanced patient stratification, biomarker discovery, and personalized medicine strategies. The miRGediNET tool, developed on the KNIME platform, offers a practical resource for further exploration in the field of bioinformatics and oncology.
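
The Monte Carlo Cross-Validation procedure mentioned in the abstract above (repeated random train/test splits, with the per-iteration metric averaged) can be illustrated roughly as follows. This is only a minimal sketch, not the miRGediNET workflow: the synthetic data, the nearest-centroid classifier, the 100 iterations, and the 10% test fraction are all assumptions made for illustration, and the function names are hypothetical.

```python
import random
from statistics import mean


def make_toy_data(n_per_class=50, n_features=5, seed=0):
    """Synthetic two-class data (an assumption; not the BRCA dataset)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, 0.0), (1, 2.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


def fit_centroids(X, y):
    """Stand-in classifier: store the per-class mean of every feature."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest (squared distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def monte_carlo_cv(X, y, n_iterations=100, test_fraction=0.1, seed=42):
    """Repeat random splits; return the mean accuracy and per-split scores."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    n_test = max(1, int(len(X) * test_fraction))
    scores = []
    for _ in range(n_iterations):
        rng.shuffle(indices)  # a fresh random split in every iteration
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        model = fit_centroids([X[i] for i in train_idx],
                              [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(correct / n_test)
    return mean(scores), scores


X, y = make_toy_data()
avg_accuracy, per_iteration = monte_carlo_cv(X, y)
print(f"mean accuracy over {len(per_iteration)} random splits: {avg_accuracy:.3f}")
```

Unlike k-fold cross-validation, the test sets drawn here may overlap between iterations; averaging over many random splits is what smooths out the effect of any single split.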
