Discovering Potential Taxonomic Biomarkers of Type 2 Diabetes From Human Gut Microbiota via Different Feature Selection Methods

dc.contributor.author Bakir-Gungor, Burcu
dc.contributor.author Bulut, Osman
dc.contributor.author Jabeer, Amhar
dc.contributor.author Nalbantoglu, O. Ufuk
dc.contributor.author Yousef, Malik
dc.date.accessioned 2025-09-25T10:44:49Z
dc.date.available 2025-09-25T10:44:49Z
dc.date.issued 2021
dc.description.abstract Human gut microbiota is a complex community of organisms including trillions of bacteria. While these microorganisms are considered essential regulators of our immune system, some of them can cause disease. In recent years, next-generation sequencing technologies have accelerated the characterization of the human gut microbiota, and machine learning techniques have become popular for analyzing disease-associated metagenomics datasets. Type 2 diabetes (T2D) is a chronic disease that affects millions of people around the world. Since early diagnosis of T2D is important for effective treatment, there is a pressing need for a classification technique that can accelerate T2D diagnosis. In this study, using T2D-associated metagenomics data, we aim to develop a classification model to facilitate T2D diagnosis and to discover T2D-associated biomarkers. The sequencing data of T2D patients and healthy individuals were taken from a metagenome-wide association study and categorized into disease states. The sequencing reads were assigned to taxa, and the identified species were used to train and test our model. To deal with the high dimensionality of the features, we applied robust feature selection algorithms such as Conditional Mutual Information Maximization, Maximum Relevance and Minimum Redundancy, Correlation Based Feature Selection, and the select K best approach. To test the performance of classification based on the features selected by the different methods, we used a random forest classifier with 100-fold Monte Carlo cross-validation. In our experiments, we observed that 15 commonly selected features have a considerable effect in terms of minimizing the microbiota used for the diagnosis of T2D and thus reducing the time and cost. When we performed biological validation of these identified species, we found that some of them are known to be related to T2D development mechanisms, and we identified additional species as potential biomarkers. Additionally, we attempted to find subgroups of T2D patients using k-means clustering. In summary, this study utilizes several supervised and unsupervised machine learning algorithms to increase the diagnostic accuracy of T2D, investigates potential biomarkers of T2D, and identifies which subset of the microbiota is more informative than other taxa by applying state-of-the-art feature selection methods. en_US
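The evaluation loop described in the abstract (feature selection, then classification under Monte Carlo cross-validation, then counting which features are selected most often) can be illustrated with a minimal standard-library sketch. This is not the authors' pipeline. Everything below is an assumption made for illustration: the toy data, the function names, a mean-difference score standing in for the feature selection methods named in the abstract, and a nearest-centroid rule standing in for the random forest classifier.

```python
import random
from collections import Counter

def monte_carlo_splits(n_samples, n_splits, test_fraction, seed=0):
    """Yield (train_idx, test_idx) pairs from repeated random shuffles."""
    rng = random.Random(seed)
    n_test = max(1, int(round(n_samples * test_fraction)))
    for _ in range(n_splits):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        yield idx[n_test:], idx[:n_test]

def class_means(X, y, rows, n_features):
    """Per-class mean of every feature over the given rows (labels 0/1)."""
    sums = {0: [0.0] * n_features, 1: [0.0] * n_features}
    counts = {0: 0, 1: 0}
    for i in rows:
        counts[y[i]] += 1
        for j in range(n_features):
            sums[y[i]][j] += X[i][j]
    return {c: [s / max(counts[c], 1) for s in sums[c]] for c in sums}

def select_k_best(means, k):
    """Hypothetical stand-in score: absolute difference of class means."""
    n = len(means[0])
    scores = [abs(means[0][j] - means[1][j]) for j in range(n)]
    return sorted(range(n), key=lambda j: scores[j], reverse=True)[:k]

def predict(x, means, features):
    """Nearest-centroid rule on the selected features (not a random forest)."""
    dist = {c: sum((x[j] - means[c][j]) ** 2 for j in features) for c in means}
    return min(dist, key=dist.get)

def evaluate(X, y, k=2, n_splits=100, test_fraction=0.2, seed=0):
    """Select features on each training split, score on the held-out split."""
    n_features = len(X[0])
    selected_counts = Counter()
    accuracies = []
    for train, test in monte_carlo_splits(len(X), n_splits, test_fraction, seed):
        means = class_means(X, y, train, n_features)
        feats = select_k_best(means, k)
        selected_counts.update(feats)  # track how often each feature is chosen
        correct = sum(predict(X[i], means, feats) == y[i] for i in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies), selected_counts

def make_toy_data(n_per_class=30, n_features=6, seed=1):
    """Synthetic data: only features 0 and 1 differ between the two classes."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            row[0] += 5.0 * label
            row[1] += 5.0 * label
            X.append(row)
            y.append(label)
    return X, y

X, y = make_toy_data()
mean_accuracy, selected_counts = evaluate(X, y, k=2, n_splits=100)
print(mean_accuracy, selected_counts.most_common(2))
```

Selecting features inside each training split, rather than once on the full dataset, keeps the held-out samples from influencing which features are chosen. The per-feature selection counts give one simple way to read off "commonly selected" features across the repeated splits.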
dc.description.sponsorship Abdullah Gul University Support Foundation (AGUV); Zefat Academic College en_US
dc.description.sponsorship The work of BB-G has been supported by the Abdullah Gul University Support Foundation (AGUV). The work of MY has been supported by the Zefat Academic College. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. en_US
dc.identifier.doi 10.3389/fmicb.2021.628426
dc.identifier.issn 1664-302X
dc.identifier.scopus 2-s2.0-85114729153
dc.identifier.uri https://doi.org/10.3389/fmicb.2021.628426
dc.identifier.uri https://hdl.handle.net/20.500.12573/3634
dc.language.iso en en_US
dc.publisher Frontiers Media S.A. en_US
dc.relation.ispartof Frontiers in Microbiology en_US
dc.rights info:eu-repo/semantics/openAccess en_US
dc.subject Feature Selection en_US
dc.subject Metagenomic Analysis en_US
dc.subject Classification en_US
dc.subject Machine Learning en_US
dc.subject Type 2 Diabetes en_US
dc.subject Human Gut Microbiome en_US
dc.title Discovering Potential Taxonomic Biomarkers of Type 2 Diabetes From Human Gut Microbiota via Different Feature Selection Methods en_US
dc.type Article en_US
dspace.entity.type Publication
gdc.author.scopusid 25932029800
gdc.author.scopusid 57255576400
gdc.author.scopusid 57221663697
gdc.author.scopusid 36117887000
gdc.author.scopusid 14029389000
gdc.author.wosid Nalbantoglu, Ufuk/Aaa-8033-2022
gdc.bip.impulseclass C4
gdc.bip.influenceclass C5
gdc.bip.popularityclass C4
gdc.coar.access open access
gdc.coar.type text::journal::journal article
gdc.collaboration.industrial false
gdc.description.department Abdullah Gül University en_US
gdc.description.departmenttemp [Bakir-Gungor, Burcu; Bulut, Osman; Jabeer, Amhar] Abdullah Gul Univ, Fac Engn, Dept Comp Engn, Kayseri, Turkey; [Nalbantoglu, O. Ufuk] Erciyes Univ, Dept Comp Engn, Genome & Stem Cell Ctr, Kayseri, Turkey; [Yousef, Malik] Zefat Acad Coll, Dept Informat Syst, Safed, Israel; [Yousef, Malik] Zefat Acad Coll, Galilee Digital Hlth Res Ctr, Safed, Israel en_US
gdc.description.publicationcategory Article - International Refereed Journal - Institutional Faculty Member en_US
gdc.description.scopusquality Q1
gdc.description.volume 12 en_US
gdc.description.woscitationindex Science Citation Index Expanded
gdc.description.wosquality Q1
gdc.identifier.openalex W3195148424
gdc.identifier.pmid 34512559
gdc.identifier.wos WOS:000698805300001
gdc.index.type WoS
gdc.index.type Scopus
gdc.index.type PubMed
gdc.oaire.accesstype GOLD
gdc.oaire.diamondjournal false
gdc.oaire.downloads 82
gdc.oaire.impulse 17.0
gdc.oaire.influence 3.0715608E-9
gdc.oaire.isgreen true
gdc.oaire.keywords Microbiology (medical)
gdc.oaire.keywords human gut microbiome
gdc.oaire.keywords feature selection
gdc.oaire.keywords machine learning
gdc.oaire.keywords classification
gdc.oaire.keywords metagenomic analysis
gdc.oaire.keywords type 2 diabetes
gdc.oaire.keywords Microbiology
gdc.oaire.keywords QR1-502
gdc.oaire.popularity 2.0704606E-8
gdc.oaire.publicfunded false
gdc.oaire.sciencefields 0301 basic medicine
gdc.oaire.sciencefields 03 medical and health sciences
gdc.oaire.views 189
gdc.openalex.collaboration International
gdc.openalex.fwci 1.9306
gdc.openalex.normalizedpercentile 0.87
gdc.opencitations.count 21
gdc.plumx.mendeley 57
gdc.plumx.pubmedcites 16
gdc.plumx.scopuscites 28
gdc.scopus.citedcount 28
gdc.virtual.author Güngör, Burcu
gdc.wos.citedcount 23
relation.isAuthorOfPublication e17be1f8-1c9a-45f2-bf0d-f8b348d2dba0
relation.isAuthorOfPublication.latestForDiscovery e17be1f8-1c9a-45f2-bf0d-f8b348d2dba0
relation.isOrgUnitOfPublication 665d3039-05f8-4a25-9a3c-b9550bffecef
relation.isOrgUnitOfPublication 52f507ab-f278-4a1f-824c-44da2a86bd51
relation.isOrgUnitOfPublication ef13a800-4c99-4124-81e0-3e25b33c0c2b
relation.isOrgUnitOfPublication.latestForDiscovery 665d3039-05f8-4a25-9a3c-b9550bffecef

Files

Original bundle

Name:
Discovering Potential Taxonomic Biomarkers of Type 2 Diabetes From Human Gut Microbiota via Different Feature Selection Methods.pdf
Size:
2.13 MB
Format:
Adobe Portable Document Format
Description:
Article File
