Machine Learning-Based Prediction of Autism Spectrum Disorder and Discovery of Related Metagenomic Biomarkers With Explainable AI

dc.contributor.author Temiz, Mustafa
dc.contributor.author Bakir-Gungor, Burcu
dc.contributor.author Ersoz, Nur Sebnem
dc.contributor.author Yousef, Malik
dc.date.accessioned 2025-09-25T10:50:32Z
dc.date.available 2025-09-25T10:50:32Z
dc.date.issued 2025
dc.description.abstract Background: Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder characterized by social communication deficits and repetitive behaviors. Recent studies have suggested that gut microbiota may play a role in the pathophysiology of ASD. This study aims to develop a classification model for ASD diagnosis and to identify ASD-associated biomarkers by analyzing metagenomic data at the taxonomic level. Methods: Five methods were compared in this study: (i) SVM-RCE, (ii) RCE-IFE, (iii) microBiomeGSM, (iv) different feature selection methods, and (v) a union method. The union method builds a feature set from the features with importance scores greater than 0.5, as identified by the best-performing feature selection methods. Results: In our 10-fold Monte Carlo cross-validation experiments on ASD-associated metagenomic data, the best performance (an AUC of 0.99) was obtained using the union feature set (17 features) and the AdaBoost classifier. In other words, high classification performance was achieved with only a few features. Additionally, SHAP, an explainable artificial intelligence method, was applied to the union feature set, and Prevotella sp. 109 was identified as the most important microorganism for ASD development. Conclusions: These findings suggest that the proposed method may be a promising approach for uncovering microbial patterns associated with ASD and may inform future research in this area. This study should be regarded as exploratory, based on preliminary findings and hypothesis generation. en_US
dc.description.sponsorship Abdullah Gul University Support Foundation (AGUV); Zefat Academic College; TUBITAK 2211-A BIDEB program en_US
dc.description.sponsorship The work of B.B.G. has also been supported by the Abdullah Gul University Support Foundation (AGUV). B.B.G. would like to express her gratitude to the L'Oreal-UNESCO Young Women Scientist Program. The work of M.Y. has been supported by Zefat Academic College. The work of N.S.E. is supported by the TUBITAK 2211-A BIDEB program. en_US
dc.description.sponsorship Abdullah Gul University Support Foundation; Zefat Academic College; Türkiye Bilimsel ve Teknolojik Araştırma Kurumu, TUBITAK
dc.identifier.doi 10.3390/app15169214
dc.identifier.issn 2076-3417
dc.identifier.scopus 2-s2.0-105014480382
dc.identifier.uri https://doi.org/10.3390/app15169214
dc.identifier.uri https://hdl.handle.net/20.500.12573/4160
dc.language.iso en en_US
dc.publisher MDPI en_US
dc.relation.ispartof Applied Sciences-Basel en_US
dc.rights info:eu-repo/semantics/openAccess en_US
dc.subject Disease Prediction en_US
dc.subject Autism Spectrum Disorder en_US
dc.subject Metagenomics en_US
dc.subject Machine Learning en_US
dc.subject Biomarker Detection en_US
dc.subject Grouping Scoring Modeling (Gsm) Approach en_US
dc.subject Human Gut Microbiome en_US
dc.title Machine Learning-Based Prediction of Autism Spectrum Disorder and Discovery of Related Metagenomic Biomarkers With Explainable AI en_US
dc.type Article en_US
dspace.entity.type Publication
gdc.author.id Bakir-Gungor, Burcu/0000-0002-2272-6270
gdc.author.id Temiz, Mustafa/0000-0002-2839-1424
gdc.author.id Yousef, Malik/0000-0001-8780-6303
gdc.author.scopusid 57219794472
gdc.author.scopusid 25932029800
gdc.author.scopusid 57423006700
gdc.author.scopusid 14029389000
gdc.author.wosid Temiz, Mustafa/Kzu-4768-2024
gdc.bip.impulseclass C5
gdc.bip.influenceclass C5
gdc.bip.popularityclass C5
gdc.coar.access open access
gdc.coar.type text::journal::journal article
gdc.collaboration.industrial false
gdc.description.department Abdullah Gül University en_US
gdc.description.departmenttemp [Temiz, Mustafa] Sivas Cumhuriyet Univ, Fac Econ & Adm Sci, Dept Management Informat Syst, TR-58140 Sivas, Turkiye; [Bakir-Gungor, Burcu] Abdullah Gul Univ, Fac Engn, Dept Comp Engn, TR-38080 Kayseri, Turkiye; [Ersoz, Nur Sebnem] Abdullah Gul Univ, Grad Sch Engn & Sci, Dept Bioengn, TR-38080 Kayseri, Turkiye; [Yousef, Malik] Zefat Acad Coll, Dept Informat Syst, IL-1320611 Safed, Israel; [Yousef, Malik] Zefat Acad Coll, Galilee Digital Hlth Res Ctr GDH, IL-1320611 Safed, Israel en_US
gdc.description.issue 16 en_US
gdc.description.publicationcategory Article - International Refereed Journal - Institutional Faculty Member en_US
gdc.description.scopusquality Q2
gdc.description.startpage 9214
gdc.description.volume 15 en_US
gdc.description.woscitationindex Science Citation Index Expanded
gdc.description.wosquality Q2
gdc.identifier.openalex W4413430256
gdc.identifier.wos WOS:001557248400001
gdc.index.type WoS
gdc.index.type Scopus
gdc.oaire.accesstype GOLD
gdc.oaire.diamondjournal false
gdc.oaire.impulse 1.0
gdc.oaire.influence 2.5437168E-9
gdc.oaire.isgreen false
gdc.oaire.keywords metagenomics
gdc.oaire.keywords human gut microbiome
gdc.oaire.keywords machine learning
gdc.oaire.keywords autism spectrum disorder
gdc.oaire.keywords biomarker detection
gdc.oaire.keywords disease prediction
gdc.oaire.keywords grouping scoring modeling (GSM) approach
gdc.oaire.popularity 3.5061447E-9
gdc.oaire.publicfunded false
gdc.openalex.collaboration International
gdc.openalex.fwci 1.4694
gdc.openalex.normalizedpercentile 0.85
gdc.openalex.toppercent TOP 10%
gdc.opencitations.count 0
gdc.plumx.mendeley 8
gdc.plumx.newscount 1
gdc.plumx.scopuscites 0
gdc.scopus.citedcount 0
gdc.virtual.author Güngör, Burcu
gdc.wos.citedcount 0
relation.isAuthorOfPublication e17be1f8-1c9a-45f2-bf0d-f8b348d2dba0
relation.isAuthorOfPublication.latestForDiscovery e17be1f8-1c9a-45f2-bf0d-f8b348d2dba0
relation.isOrgUnitOfPublication 665d3039-05f8-4a25-9a3c-b9550bffecef
relation.isOrgUnitOfPublication 52f507ab-f278-4a1f-824c-44da2a86bd51
relation.isOrgUnitOfPublication ef13a800-4c99-4124-81e0-3e25b33c0c2b
relation.isOrgUnitOfPublication.latestForDiscovery 665d3039-05f8-4a25-9a3c-b9550bffecef
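
The union method and the Monte Carlo cross-validation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy importance scores, the taxon labels, and the 10% test fraction are all assumptions made for illustration, and "10-fold Monte Carlo cross-validation" is read here as ten repeated random train/test splits.

```python
import random


def union_feature_set(importance_by_method, threshold=0.5):
    """Return the sorted union of features whose importance score exceeds
    `threshold` in at least one feature selection method."""
    selected = set()
    for scores in importance_by_method.values():
        selected.update(name for name, score in scores.items() if score > threshold)
    return sorted(selected)


def monte_carlo_splits(n_samples, n_iterations=10, test_fraction=0.1, seed=0):
    """Yield (train_indices, test_indices) from repeated random splits.

    The test fraction is an assumption; the abstract does not state it.
    """
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_fraction))
    for _ in range(n_iterations):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        yield sorted(indices[n_test:]), sorted(indices[:n_test])


# Hypothetical importance scores from two feature selection methods.
importance_by_method = {
    "method_a": {"taxon_1": 0.9, "taxon_2": 0.4, "taxon_3": 0.6},
    "method_b": {"taxon_1": 0.7, "taxon_2": 0.55, "taxon_4": 0.2},
}

features = union_feature_set(importance_by_method)
splits = list(monte_carlo_splits(n_samples=20))
```

In a full pipeline, each split would be used to train a classifier (AdaBoost in the study) on the union features and to compute the AUC on the held-out samples, with the scores averaged over all iterations.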
