Scopus İndeksli Yayınlar Koleksiyonu
Permanent URI for this collectionhttps://hdl.handle.net/20.500.12573/395
Browse
Search Results
Article Citation - WoS: 3Citation - Scopus: 3Machine Learning-Based Prediction of Autism Spectrum Disorder and Discovery of Related Metagenomic Biomarkers With Explainable AI(MDPI, 2025-08-21) Temiz, Mustafa; Bakir-Gungor, Burcu; Ersoz, Nur Sebnem; Yousef, MalikBackground: Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder characterized by social communication deficits and repetitive behaviors. Recent studies have suggested that gut microbiota may play a role in the pathophysiology of ASD. This study aims to develop a classification model for ASD diagnosis and to identify ASD-associated biomarkers by analyzing metagenomic data at the taxonomic level. Methods: The performances of five different methods were tested in this study. These methods are (i) SVM-RCE, (ii) RCE-IFE, (iii) microBiomeGSM, (iv) different feature selection methods, and (v) a union method. The last method is based on creating a union feature set consisting of the features with importance scores greater than 0.5, identified using the best-performing feature selection methods. Results: In our 10-fold Monte Carlo cross-validation experiments on ASD-associated metagenomic data, the most effective performance metric (an AUC of 0.99) was obtained using the union feature set (17 features) and the AdaBoost classifier. In other words, we achieve superior machine learning performance with a few features. Additionally, the SHAP method, which is an explainable artificial intelligence method, is applied to the union feature set, and Prevotella sp. 109 is identified as the most important microorganism for ASD development. Conclusions: These findings suggest that the proposed method may be a promising approach for uncovering microbial patterns associated with ASD and may inform future research in this area. This study should be regarded as exploratory, based on preliminary findings and hypothesis generation.Article Integrating Biological Domain Knowledge With Machine Learning for Identifying Colorectal-Cancer Microbial Enzymes in Metagenomic Data(MDPI, 2025-03-08) Bakir-Gungor, Burcu; Ersoz, Nur Sebnem; Yousef, MalikAdvances in metagenomics have revolutionized our ability to elucidate links between the microbiome and human diseases. Colorectal cancer (CRC), a leading cause of cancer-related mortality worldwide, has been associated with dysbiosis of the gut microbiome. This study aims to develop a method for identifying CRC-associated microbial enzymes by incorporating biological domain knowledge into the feature selection process. Conventional feature selection techniques often evaluate features individually and fail to leverage biological knowledge during metagenomic data analysis. To address this gap, we propose the enzyme commission (EC)-nomenclature-based Grouping-Scoring-Modeling (G-S-M) method, which integrates biological domain knowledge into feature grouping and selection. The proposed method was tested on a CRC-associated metagenomic dataset collected from eight different countries. Community-level relative abundance values of enzymes were considered as features and grouped based on their EC categories to provide biologically informed groupings. Our findings in randomized 10-fold cross-validation experiments imply that glycosidases, CoA-transferases, hydro-lyases, oligo-1,6-glucosidase, crotonobetainyl-CoA hydratase, and citrate CoA-transferase enzymes can be associated with CRC development as part of different molecular pathways. These enzymes are mostly synthesized by Eschericia coli, Salmonella enterica, Klebsiella pneumoniae, Staphylococcus aureus, Streptococcus pneumoniae, and Clostridioides dificile. Comparative evaluation experiments showed that the proposed model consistently outperforms traditional feature selection methods paired with various classifiers.
