Prediction of Colorectal Cancer Based on Taxonomic Levels of Microorganisms and Discovery of Taxonomic Biomarkers Using the Grouping-Scoring (G-S-M) Approach

dc.contributor.author Bakir-Güngör, Burcu
dc.contributor.author Temiz, Mustafa
dc.contributor.author Canakcimaksutoglu, Beyza
dc.contributor.author Yousef, Malik
dc.date.accessioned 2025-09-25T10:55:25Z
dc.date.available 2025-09-25T10:55:25Z
dc.date.issued 2025
dc.description.abstract Colorectal cancer (CRC) is one of the most prevalent forms of cancer globally. The human gut microbiome plays an important role in the development of CRC and can serve as a biomarker for early detection and treatment. This study focuses on identifying potential taxonomic biomarkers of CRC using a grouping-based feature selection method. It also investigates the effect of incorporating biological domain knowledge into the feature selection process when identifying CRC-associated microorganisms. Conventional feature selection techniques often fail to leverage existing biological knowledge during metagenomic data analysis. To address this gap, we propose a taxonomy-based Grouping-Scoring-Modeling (G-S-M) method that integrates biological domain knowledge into feature grouping and selection. Using metagenomic data related to CRC, classification is performed at three taxonomic levels (genus, family, and order). The MetaPhlAn tool is employed to determine the relative abundance of species in each sample. Comparative performance analyses involve six feature selection methods and four classification algorithms. In experiments on two CRC-associated metagenomic datasets, the highest performance, an AUC of 0.90, is observed at the genus level. At this level, 7 of the top 10 groups (Parvimonas, Peptostreptococcus, Fusobacterium, Gemella, Streptococcus, Porphyromonas, and Solobacterium) were identified in both datasets. Moreover, the microorganisms identified at the genus, family, and order levels are thoroughly discussed with reference to the CRC-related metagenomic literature. This study not only contributes to our understanding of CRC development, but also highlights the applicability of the taxonomy-based G-S-M method to other diseases. © 2025 Elsevier B.V., All rights reserved. en_US
dc.identifier.doi 10.1016/j.compbiomed.2025.109813
dc.identifier.issn 1879-0534
dc.identifier.issn 0010-4825
dc.identifier.scopus 2-s2.0-85217085175
dc.identifier.uri https://doi.org/10.1016/j.compbiomed.2025.109813
dc.identifier.uri https://hdl.handle.net/20.500.12573/4453
dc.language.iso en en_US
dc.publisher Elsevier Ltd en_US
dc.relation.ispartof Computers in Biology and Medicine en_US
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.subject Colorectal Cancer en_US
dc.subject Human Gut Microbiome en_US
dc.subject Machine Learning en_US
dc.subject Microbiomegsm en_US
dc.subject Taxonomic Biomarkers en_US
dc.subject Biomarkers, Tumor en_US
dc.subject Lung Cancer en_US
dc.subject Features Selection en_US
dc.subject Human Guts en_US
dc.subject Machine-Learning en_US
dc.subject Metagenomics en_US
dc.subject Microbiome en_US
dc.subject Scoring Models en_US
dc.subject Taxonomic Biomarker en_US
dc.subject Article en_US
dc.subject Desulfovibrionales en_US
dc.subject DNA Damage en_US
dc.subject Feature Selection en_US
dc.subject Fusobacteriaceae en_US
dc.subject Fusobacterium en_US
dc.subject Gemella en_US
dc.subject Human en_US
dc.subject Intestine Flora en_US
dc.subject Lachnospiraceae en_US
dc.subject Lactobacillales en_US
dc.subject Microbial Community en_US
dc.subject Nonhuman en_US
dc.subject Parvimonas en_US
dc.subject Peptostreptococcaceae en_US
dc.subject Peptostreptococcus en_US
dc.subject Performance Indicator en_US
dc.subject Porphyromonas en_US
dc.subject Prediction en_US
dc.subject Solobacterium en_US
dc.subject Streptococcaceae en_US
dc.subject Streptococcus en_US
dc.subject Streptococcus Equinus en_US
dc.subject Taxonomy en_US
dc.subject Algorithm en_US
dc.subject Bacterium en_US
dc.subject Classification en_US
dc.subject Colorectal Tumor en_US
dc.subject Genetics en_US
dc.subject Microbiology en_US
dc.subject Procedures en_US
dc.subject Tumor Marker en_US
dc.subject Algorithms en_US
dc.subject Bacteria en_US
dc.subject Colorectal Neoplasms en_US
dc.subject Gastrointestinal Microbiome en_US
dc.subject Humans en_US
dc.title Prediction of Colorectal Cancer Based on Taxonomic Levels of Microorganisms and Discovery of Taxonomic Biomarkers Using the Grouping-Scoring (G-S-M) Approach en_US
dc.type Article en_US
dspace.entity.type Publication
gdc.author.scopusid 25932029800
gdc.author.scopusid 57219794472
gdc.author.scopusid 59545047900
gdc.author.scopusid 14029389000
gdc.bip.impulseclass C5
gdc.bip.influenceclass C5
gdc.bip.popularityclass C5
gdc.coar.access metadata only access
gdc.coar.type text::journal::journal article
gdc.collaboration.industrial false
gdc.description.department Abdullah Gül University en_US
gdc.description.departmenttemp [Bakir-Güngör] Burcu, Department of Computer Engineering, Abdullah Gül Üniversitesi, Kayseri, Turkey; [Temiz] Mustafa, Department of Electrical & Computer Engineering, Abdullah Gül Üniversitesi, Kayseri, Turkey; [Canakcimaksutoglu] Beyza, Department of Bioengineering, Abdullah Gül Üniversitesi, Kayseri, Turkey; [Yousef] Malik, Department of Information Systems, Zefat Academic College, Safad, Israel en_US
gdc.description.publicationcategory Article - International Refereed Journal - Institutional Faculty Member en_US
gdc.description.scopusquality Q1
gdc.description.volume 187 en_US
gdc.description.wosquality Q1
gdc.identifier.openalex W4407282348
gdc.identifier.pmid 39929003
gdc.index.type Scopus
gdc.index.type PubMed
gdc.oaire.diamondjournal false
gdc.oaire.impulse 1.0
gdc.oaire.influence 2.5131481E-9
gdc.oaire.isgreen false
gdc.oaire.keywords Bacteria
gdc.oaire.keywords Biomarkers, Tumor
gdc.oaire.keywords Humans
gdc.oaire.keywords Metagenomics
gdc.oaire.keywords Colorectal Neoplasms
gdc.oaire.keywords Algorithms
gdc.oaire.keywords Gastrointestinal Microbiome
gdc.oaire.popularity 3.4952221E-9
gdc.oaire.publicfunded false
gdc.openalex.collaboration International
gdc.openalex.fwci 0.8536
gdc.openalex.normalizedpercentile 0.7
gdc.openalex.toppercent TOP 10%
gdc.opencitations.count 0
gdc.plumx.mendeley 7
gdc.plumx.scopuscites 2
gdc.scopus.citedcount 2
gdc.virtual.author Güngör, Burcu
relation.isAuthorOfPublication e17be1f8-1c9a-45f2-bf0d-f8b348d2dba0
relation.isAuthorOfPublication.latestForDiscovery e17be1f8-1c9a-45f2-bf0d-f8b348d2dba0
relation.isOrgUnitOfPublication 665d3039-05f8-4a25-9a3c-b9550bffecef
relation.isOrgUnitOfPublication 52f507ab-f278-4a1f-824c-44da2a86bd51
relation.isOrgUnitOfPublication ef13a800-4c99-4124-81e0-3e25b33c0c2b
relation.isOrgUnitOfPublication.latestForDiscovery 665d3039-05f8-4a25-9a3c-b9550bffecef
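
The abstract describes grouping species-level relative abundances by taxonomic level and then scoring the groups. The sketch below is one way to picture the grouping and scoring steps at the genus level; it is not the authors' implementation. Several things here are assumptions made for illustration: the function names (`group_by_genus`, `auc`, `score_groups`), the convention that a species name starts with its genus followed by an underscore, the toy abundance values, and above all the scoring rule. The G-S-M method scores groups with machine learning models, whereas this sketch swaps in a much simpler stand-in, the AUC of each group's summed abundance.

```python
from collections import defaultdict


def group_by_genus(species_names):
    """Map genus -> member species (assumes names look like 'Genus_species')."""
    groups = defaultdict(list)
    for name in species_names:
        groups[name.split("_")[0]].append(name)
    return dict(groups)


def auc(scores, labels):
    """AUC via pairwise comparison of positives and negatives; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def score_groups(samples, labels, groups):
    """Rank genera by the AUC of their summed relative abundance.

    Simplified stand-in for group scoring, not the scoring used in the paper.
    """
    scored = {}
    for genus, members in groups.items():
        totals = [sum(sample.get(sp, 0.0) for sp in members) for sample in samples]
        scored[genus] = auc(totals, labels)
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)


# Toy data, invented for illustration: one dict of species abundances per sample.
samples = [
    {"Fusobacterium_nucleatum": 2.0, "Fusobacterium_varium": 0.5, "Bacteroides_fragilis": 10.0},
    {"Fusobacterium_nucleatum": 1.5, "Bacteroides_fragilis": 12.0},
    {"Fusobacterium_nucleatum": 0.1, "Bacteroides_fragilis": 11.0},
    {"Bacteroides_fragilis": 9.0},
]
labels = [1, 1, 0, 0]  # 1 = CRC, 0 = control (toy labels)

groups = group_by_genus({sp for sample in samples for sp in sample})
ranking = score_groups(samples, labels, groups)
for genus, score in ranking:
    print(genus, score)
```

The same grouping idea would carry over to the family and order levels by replacing `group_by_genus` with a lookup from species to the desired rank; the top-ranked groups would then feed a final classifier in the modeling step.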
