Metagenomic Data Analysis With Machine Learning to Discover Colorectal Cancer-Associated Enzymes
| dc.contributor.author | Ersoz, Nur Sebnem | |
| dc.contributor.author | Kuzudisli, Cihan | |
| dc.contributor.author | Yousef, Malik | |
| dc.contributor.author | Bakir-Gungor, Burcu | |
| dc.date.accessioned | 2025-09-25T10:50:40Z | |
| dc.date.available | 2025-09-25T10:50:40Z | |
| dc.date.issued | 2024 | |
| dc.description.abstract | The human gut microbiome comprises over 10 trillion microbes and plays important roles in maintaining metabolism and body homeostasis and in shaping immune function. Metagenomics, which studies genomic data from clinical and environmental samples, is crucial for understanding the interplay between the host and the gut microbiome. Functional profiling of metagenomes helps identify alterations in microbial functions, particularly in enzyme-encoding genes. Colorectal cancer (CRC) is one of the leading causes of cancer-related deaths. In this study, we aimed to find CRC-associated enzymes by analyzing metagenomic data with different machine learning methods. A total of 1262 samples, comprising CRC and control groups from different countries, were used. The dataset was obtained by functionally profiling metagenomic data and estimating community-level enzyme commission (EC) abundance values. To analyze this dataset, RCE-IFE and SVM-RCE, two group-based feature selection methods, were compared with six individual feature selection methods. Monte Carlo cross-validation with 10 repetitions was used in our experiments. RCE-IFE, Extreme Gradient Boosting, and Select K Best achieved similarly high performance, the best among the compared methods. Notably, in addition to its high performance, the group-based feature selection method RCE-IFE grouped enzymes into clusters, unlike TFS, and then identified biologically relevant CRC-associated enzymes. | en_US |
| dc.description.sponsorship | Berdan Civata B.C.; et al.; Figes; Koluman; Loodos; Tarsus University | |
| dc.identifier.doi | 10.1109/SIU61531.2024.10601144 | |
| dc.identifier.isbn | 9798350388978 | |
| dc.identifier.isbn | 9798350388961 | |
| dc.identifier.issn | 2165-0608 | |
| dc.identifier.scopus | 2-s2.0-85200856780 | |
| dc.identifier.uri | https://doi.org/10.1109/SIU61531.2024.10601144 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.12573/4187 | |
| dc.language.iso | en | en_US |
| dc.publisher | IEEE | en_US |
| dc.relation.ispartof | 32nd IEEE Signal Processing and Communications Applications Conference (SIU) -- MAY 15-18, 2024 -- Tarsus Univ Campus, Mersin, TURKEY | en_US |
| dc.relation.ispartofseries | Signal Processing and Communications Applications Conference | |
| dc.rights | info:eu-repo/semantics/closedAccess | en_US |
| dc.subject | Colorectal Cancer Diagnosis | en_US |
| dc.subject | Metagenomics Data Analysis | en_US |
| dc.subject | Community-Level Enzyme Commission (EC) Abundance Values | en_US |
| dc.subject | Machine Learning | en_US |
| dc.subject | Grouping Based Feature Selection | en_US |
| dc.title | Metagenomic Data Analysis With Machine Learning to Discover Colorectal Cancer-Associated Enzymes | en_US |
| dc.type | Conference Object | en_US |
| dspace.entity.type | Publication | |
| gdc.author.scopusid | 57423006700 | |
| gdc.author.scopusid | 57219838821 | |
| gdc.author.scopusid | 14029389000 | |
| gdc.author.scopusid | 25932029800 | |
| gdc.bip.impulseclass | C5 | |
| gdc.bip.influenceclass | C5 | |
| gdc.bip.popularityclass | C5 | |
| gdc.coar.access | metadata only access | |
| gdc.coar.type | text::conference output | |
| gdc.collaboration.industrial | false | |
| gdc.description.department | Abdullah Gül University | en_US |
| gdc.description.departmenttemp | [Ersoz, Nur Sebnem] Abdullah Gul Univ, Fac Life & Nat Sci, Dept Bioengn, Kayseri, Turkiye; [Kuzudisli, Cihan] Hasan Kalyoncu Univ, Fac Engn, Dept Comp Engn, Gaziantep, Turkiye; [Yousef, Malik] Zefat Acad Coll, Galilee Digital Hlth Res Ctr, Dept Informat Syst, Safed, Israel; [Bakir-Gungor, Burcu] Abdullah Gul Univ, Fac Engn, Dept Comp Engn, Kayseri, Turkiye | en_US |
| gdc.description.endpage | 4 | |
| gdc.description.publicationcategory | Conference Item - International - Institutional Faculty Member | en_US |
| gdc.description.scopusquality | N/A | |
| gdc.description.startpage | 1 | |
| gdc.description.woscitationindex | Conference Proceedings Citation Index - Science | |
| gdc.description.wosquality | N/A | |
| gdc.identifier.openalex | W4400908949 | |
| gdc.identifier.wos | WOS:001297894700330 | |
| gdc.index.type | WoS | |
| gdc.index.type | Scopus | |
| gdc.oaire.diamondjournal | false | |
| gdc.oaire.impulse | 0.0 | |
| gdc.oaire.influence | 2.4895952E-9 | |
| gdc.oaire.isgreen | false | |
| gdc.oaire.keywords | machine learning | |
| gdc.oaire.keywords | metagenomics data analysis | |
| gdc.oaire.keywords | grouping based feature selection | |
| gdc.oaire.keywords | colorectal cancer diagnosis | |
| gdc.oaire.keywords | community-level enzyme commission (EC) abundance values | |
| gdc.oaire.popularity | 2.3737945E-9 | |
| gdc.oaire.publicfunded | false | |
| gdc.oaire.sciencefields | 0301 basic medicine | |
| gdc.oaire.sciencefields | 0303 health sciences | |
| gdc.oaire.sciencefields | 03 medical and health sciences | |
| gdc.openalex.collaboration | International | |
| gdc.openalex.fwci | 0.0 | |
| gdc.openalex.normalizedpercentile | 0.11 | |
| gdc.opencitations.count | 0 | |
| gdc.plumx.mendeley | 1 | |
| gdc.plumx.scopuscites | 0 | |
| gdc.scopus.citedcount | 0 | |
| gdc.virtual.author | Güngör, Burcu | |
| gdc.wos.citedcount | 0 | |
| relation.isAuthorOfPublication | e17be1f8-1c9a-45f2-bf0d-f8b348d2dba0 | |
| relation.isAuthorOfPublication.latestForDiscovery | e17be1f8-1c9a-45f2-bf0d-f8b348d2dba0 | |
| relation.isOrgUnitOfPublication | 665d3039-05f8-4a25-9a3c-b9550bffecef | |
| relation.isOrgUnitOfPublication | 52f507ab-f278-4a1f-824c-44da2a86bd51 | |
| relation.isOrgUnitOfPublication | ef13a800-4c99-4124-81e0-3e25b33c0c2b | |
| relation.isOrgUnitOfPublication.latestForDiscovery | 665d3039-05f8-4a25-9a3c-b9550bffecef |
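
The abstract describes an evaluation protocol in which feature selection is assessed with 10 repetitions of Monte Carlo cross-validation. The following is a minimal sketch of that general protocol only, not the authors' pipeline: the data are synthetic stand-ins for an EC-abundance table, the mean-difference filter and the nearest-centroid classifier are simple substitutes chosen for illustration (they are not RCE-IFE, SVM-RCE, or Select K Best), and the 30% test fraction, the value of `k`, the accuracy metric, and all function names are assumptions that do not come from the record.

```python
import random
import statistics


def make_synthetic(n_samples=120, n_features=30, n_informative=3, seed=0):
    """Synthetic stand-in for an EC-abundance table (NOT the study's data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # hypothetical labels: 0 = control, 1 = CRC
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in range(n_informative):
            row[j] += 2.0 * label  # shift the first few features in class 1
        X.append(row)
        y.append(label)
    return X, y


def top_k_features(X, y, k):
    """Rank features by absolute difference of class means (a simple filter)."""
    scores = []
    for j in range(len(X[0])):
        m0 = statistics.fmean([r[j] for r, c in zip(X, y) if c == 0])
        m1 = statistics.fmean([r[j] for r, c in zip(X, y) if c == 1])
        scores.append((abs(m1 - m0), j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]


def nearest_centroid_predict(X_train, y_train, X_test, feats):
    """Assign each test row to the class with the closest centroid."""
    centroids = {}
    for c in (0, 1):
        rows = [r for r, lab in zip(X_train, y_train) if lab == c]
        centroids[c] = [statistics.fmean([r[j] for r in rows]) for j in feats]
    preds = []
    for r in X_test:
        dist = {
            c: sum((r[j] - m) ** 2 for j, m in zip(feats, cen))
            for c, cen in centroids.items()
        }
        preds.append(min(dist, key=dist.get))
    return preds


def monte_carlo_cv(X, y, n_repeats=10, test_fraction=0.3, k=5, seed=42):
    """Repeat a random train/test split n_repeats times; return accuracies."""
    rng = random.Random(seed)
    n = len(X)
    n_test = int(round(n * test_fraction))
    accuracies = []
    for _ in range(n_repeats):
        idx = list(range(n))
        rng.shuffle(idx)  # a fresh random split on every repetition
        test_idx, train_idx = idx[:n_test], idx[n_test:]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        X_te = [X[i] for i in test_idx]
        y_te = [y[i] for i in test_idx]
        # Select features on the training split only, to avoid leakage.
        feats = top_k_features(X_tr, y_tr, k)
        preds = nearest_centroid_predict(X_tr, y_tr, X_te, feats)
        accuracies.append(sum(p == t for p, t in zip(preds, y_te)) / n_test)
    return accuracies


if __name__ == "__main__":
    X, y = make_synthetic()
    accs = monte_carlo_cv(X, y)
    print("mean accuracy over", len(accs), "splits:", statistics.fmean(accs))
```

Selecting features inside each repetition, on the training split alone, keeps the held-out samples from influencing which features are chosen; averaging over the repetitions then gives a less split-dependent performance estimate than a single train/test partition.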
