Discovering Potential Taxonomic Biomarkers of Type 2 Diabetes From Human Gut Microbiota via Different Feature Selection Methods

dc.contributor.author Bakir-Gungor, Burcu
dc.contributor.author Bulut, Osman
dc.contributor.author Jabeer, Amhar
dc.contributor.author Nalbantoglu, O. Ufuk
dc.contributor.author Yousef, Malik
dc.contributor.department AGÜ, Faculty of Engineering, Department of Computer Engineering en_US
dc.contributor.institutionauthor Bakir-Gungor, Burcu
dc.contributor.institutionauthor Bulut, Osman
dc.contributor.institutionauthor Jabeer, Amhar
dc.date.accessioned 2022-02-17T06:45:43Z
dc.date.available 2022-02-17T06:45:43Z
dc.date.issued 2021 en_US
dc.description The work of BB-G has been supported by the Abdullah Gul University Support Foundation (AGUV). The work of MY has been supported by the Zefat Academic College. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. en_US
dc.description.abstract The human gut microbiota is a complex community of organisms that includes trillions of bacteria. While these microorganisms are considered essential regulators of our immune system, some of them can cause disease. In recent years, next-generation sequencing technologies have accelerated the study of the human gut microbiota, and machine learning techniques have become popular for analyzing disease-associated metagenomics datasets. Type 2 diabetes (T2D) is a chronic disease that affects millions of people around the world. Since early diagnosis of T2D is important for effective treatment, there is a pressing need for a classification technique that can accelerate T2D diagnosis. In this study, using T2D-associated metagenomics data, we aim to develop a classification model to facilitate T2D diagnosis and to discover T2D-associated biomarkers. The sequencing data of T2D patients and healthy individuals were taken from a metagenome-wide association study and categorized by disease state. The sequencing reads were assigned to taxa, and the identified species were used to train and test our model. To deal with the high dimensionality of the features, we applied robust feature selection algorithms such as Conditional Mutual Information Maximization, Maximum Relevance and Minimum Redundancy, Correlation Based Feature Selection, and the select K best approach. To test the performance of classification based on the features selected by the different methods, we used a random forest classifier with 100-fold Monte Carlo cross-validation. In our experiments, we observed that 15 commonly selected features have a considerable effect in terms of minimizing the microbiota used for the diagnosis of T2D and thus reducing the time and cost.
When we performed biological validation of these identified species, we found that some of them are known to be related to T2D development mechanisms, and we identified additional species as potential biomarkers. Additionally, we attempted to find subgroups of T2D patients using k-means clustering. In summary, this study utilizes several supervised and unsupervised machine learning algorithms to increase the diagnostic accuracy of T2D, investigates potential biomarkers of T2D, and identifies which subset of the microbiota is more informative than other taxa by applying state-of-the-art feature selection methods. en_US
dc.description.sponsorship Abdullah Gul University en_US
dc.identifier.issn 1664-302X
dc.identifier.other PubMed ID 34512559
dc.identifier.uri https://doi.org/10.3389/fmicb.2021.628426
dc.identifier.uri https://hdl.handle.net/20.500.12573/1155
dc.identifier.volume Volume 12 en_US
dc.language.iso eng en_US
dc.publisher FRONTIERS MEDIA SA, AVENUE DU TRIBUNAL FEDERAL 34, LAUSANNE CH-1015, SWITZERLAND en_US
dc.relation.isversionof 10.3389/fmicb.2021.628426 en_US
dc.relation.journal FRONTIERS IN MICROBIOLOGY en_US
dc.relation.publicationcategory Article - International - Editor-Reviewed Journal en_US
dc.rights info:eu-repo/semantics/openAccess en_US
dc.subject feature selection en_US
dc.subject metagenomic analysis en_US
dc.subject classification en_US
dc.subject machine learning en_US
dc.subject type 2 diabetes en_US
dc.subject human gut microbiome en_US
dc.title Discovering Potential Taxonomic Biomarkers of Type 2 Diabetes From Human Gut Microbiota via Different Feature Selection Methods en_US
dc.type article en_US
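
The evaluation scheme described in the abstract (feature selection followed by classification under 100-fold Monte Carlo cross-validation, with attention to which features are selected most often) can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption made for illustration and not the authors' implementation: the data are synthetic, the function names (`make_toy_data`, `monte_carlo_splits`, `select_k_best`, `evaluate`) are hypothetical, the feature score is a simple difference of class means standing in for a select-K-best scoring function, and a nearest-centroid rule stands in for the random forest classifier named in the abstract.

```python
import random
from statistics import mean


def make_toy_data(n_samples=40, n_features=6, seed=1):
    """Synthetic stand-in for a species-abundance table (not real data).

    Features 0 and 1 are shifted for the positive class; the rest are noise.
    """
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # balanced classes: 0 = control, 1 = case
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 5.0 * label
        row[1] += 5.0 * label
        X.append(row)
        y.append(label)
    return X, y


def monte_carlo_splits(n_samples, n_splits=100, test_fraction=0.2, seed=0):
    """Yield (train_idx, test_idx) from repeated, independent random splits."""
    rng = random.Random(seed)
    n_test = max(1, int(round(n_samples * test_fraction)))
    indices = list(range(n_samples))
    for _ in range(n_splits):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        yield shuffled[n_test:], shuffled[:n_test]


def select_k_best(X, y, k):
    """Keep the k features with the largest absolute class-mean difference.

    Assumption: this simple score replaces whatever scoring function a real
    select-K-best step would use.
    """
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(mean(pos) - mean(neg)))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def nearest_centroid_predict(X_train, y_train, X_test, features):
    """Stand-in classifier (NOT a random forest): assign the closest centroid."""
    centroids = {}
    for label in (0, 1):
        rows = [row for row, lab in zip(X_train, y_train) if lab == label]
        centroids[label] = [mean(r[j] for r in rows) for j in features]
    predictions = []
    for row in X_test:
        dist = {
            label: sum((row[j] - c) ** 2 for j, c in zip(features, centroid))
            for label, centroid in centroids.items()
        }
        predictions.append(min(dist, key=dist.get))
    return predictions


def evaluate(X, y, k=2, n_splits=100, seed=0):
    """Return mean test accuracy and how often each feature was selected."""
    accuracies = []
    selection_counts = [0] * len(X[0])
    for train_idx, test_idx in monte_carlo_splits(len(X), n_splits, seed=seed):
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        X_te = [X[i] for i in test_idx]
        y_te = [y[i] for i in test_idx]
        # Select features on the training part only to avoid leakage.
        features = select_k_best(X_tr, y_tr, k)
        for j in features:
            selection_counts[j] += 1
        preds = nearest_centroid_predict(X_tr, y_tr, X_te, features)
        accuracies.append(mean(1 if p == t else 0 for p, t in zip(preds, y_te)))
    return mean(accuracies), selection_counts


if __name__ == "__main__":
    X, y = make_toy_data()
    accuracy, counts = evaluate(X, y, k=2, n_splits=100)
    print("mean accuracy:", accuracy)
    print("selection counts per feature:", counts)
```

Counting how often each feature survives selection across the random splits is one simple way to surface "commonly selected" features; in practice the stand-in scorer and classifier would be replaced by the methods the abstract names.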

Files

Original bundle

Name:
Discovering Potential Taxonomic Biomarkers of Type 2 Diabetes From Human Gut Microbiota via Different Feature Selection Methods.pdf
Size:
2.13 MB
Format:
Adobe Portable Document Format
Description:
Article File

License bundle

Name:
license.txt
Size:
1.44 KB
Format:
Item-specific license agreed upon to submission