Prediction of Colorectal Cancer Based on Taxonomic Levels of Microorganisms and Discovery of Taxonomic Biomarkers Using the Grouping-Scoring-Modeling (G-S-M) Approach
Date
2025
Publisher
Elsevier Ltd
Green Open Access
No
Publicly Funded
No
Abstract
Colorectal cancer (CRC) is one of the most prevalent cancers worldwide. The human gut microbiome plays an important role in the development of CRC and can serve as a biomarker for early detection and treatment. This study focuses on identifying potential taxonomic biomarkers of CRC using a grouping-based feature selection method, and investigates the effect of incorporating biological domain knowledge into the feature selection process when identifying CRC-associated microorganisms. Conventional feature selection techniques often fail to leverage existing biological knowledge during metagenomic data analysis. To address this gap, we propose a taxonomy-based Grouping-Scoring-Modeling (G-S-M) method that integrates biological domain knowledge into feature grouping and selection. Using CRC-related metagenomic data, classification is performed at three taxonomic levels (genus, family, and order). The MetaPhlAn tool is employed to determine the relative abundance of species in each sample. Comparative performance analyses involve six feature selection methods and four classification algorithms. In experiments on two CRC-associated metagenomic datasets, the highest performance, an AUC of 0.90, is observed at the genus level. At this level, 7 of the top 10 groups (Parvimonas, Peptostreptococcus, Fusobacterium, Gemella, Streptococcus, Porphyromonas, and Solobacterium) were identified in both datasets. Moreover, the microorganisms identified at the genus, family, and order levels are discussed in detail with reference to the CRC-related metagenomic literature. This study not only contributes to our understanding of CRC development but also highlights the applicability of the taxonomy-based G-S-M method to other diseases. © 2025 Elsevier B.V., All rights reserved.
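The grouping and scoring idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how groups are scored, so the score below (the absolute difference between classes in mean summed abundance) is a simple stand-in. The function names, the shortened lineage strings, and the abundance values are all hypothetical and chosen only for illustration; the lineage strings follow the MetaPhlAn convention of `|`-separated ranks with prefixes such as `o__`, `f__`, `g__`, and `s__`.

```python
from collections import defaultdict
from statistics import mean

# Rank prefixes as used in MetaPhlAn-style lineage strings.
RANK_PREFIX = {"order": "o__", "family": "f__", "genus": "g__"}


def taxon_at_rank(lineage, rank):
    """Return the taxon name at `rank` from a lineage string, or None if absent."""
    prefix = RANK_PREFIX[rank]
    for part in lineage.split("|"):
        if part.startswith(prefix):
            return part[len(prefix):]
    return None


def group_features(lineages, rank):
    """Grouping step: map each taxon at `rank` to the species columns it contains."""
    groups = defaultdict(list)
    for idx, lineage in enumerate(lineages):
        taxon = taxon_at_rank(lineage, rank)
        if taxon is not None:
            groups[taxon].append(idx)
    return dict(groups)


def score_group(samples, labels, columns):
    """Scoring step (stand-in, NOT the paper's score): absolute difference
    between cases (1) and controls (0) in mean summed group abundance."""
    totals = [sum(row[c] for c in columns) for row in samples]
    cases = [t for t, y in zip(totals, labels) if y == 1]
    controls = [t for t, y in zip(totals, labels) if y == 0]
    return abs(mean(cases) - mean(controls))


def rank_groups(samples, labels, lineages, rank="genus", top_k=10):
    """Score every group at `rank` and return the top_k (name, score) pairs."""
    groups = group_features(lineages, rank)
    scored = [(name, score_group(samples, labels, cols))
              for name, cols in groups.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy data: shortened lineages and made-up relative abundances.
lineages = [
    "o__Fusobacteriales|f__Fusobacteriaceae|g__Fusobacterium|s__Fusobacterium_nucleatum",
    "o__Fusobacteriales|f__Fusobacteriaceae|g__Fusobacterium|s__Fusobacterium_mortiferum",
    "o__Lactobacillales|f__Streptococcaceae|g__Streptococcus|s__Streptococcus_equinus",
    "o__Lactobacillales|f__Streptococcaceae|g__Streptococcus|s__Streptococcus_salivarius",
]
samples = [
    [4.0, 1.0, 2.0, 3.0],  # case
    [6.0, 1.0, 1.0, 3.0],  # case
    [1.0, 0.0, 2.0, 3.0],  # control
    [1.0, 0.0, 1.0, 4.0],  # control
]
labels = [1, 1, 0, 0]

for level in ("genus", "family", "order"):
    print(level, rank_groups(samples, labels, lineages, rank=level))
```

On this toy input the Fusobacterium group ranks above the Streptococcus group at the genus level. In a full pipeline the top-ranked groups would then feed the modeling step, where a classifier is trained on the species belonging to the selected groups; that step is omitted here.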
Keywords
Colorectal Cancer, Human Gut Microbiome, Machine Learning, Microbiomegsm, Taxonomic Biomarkers, Biomarkers, Tumor, Lung Cancer, Features Selection, Human Guts, Machine-Learning, Metagenomics, Microbiome, Scoring Models, Taxonomic Biomarker, Article, Desulfovibrionales, DNA Damage, Feature Selection, Fusobacteriaceae, Fusobacterium, Gemella, Human, Intestine Flora, Lachnospiraceae, Lactobacillales, Microbial Community, Nonhuman, Parvimonas, Peptostreptococcaceae, Peptostreptococcus, Performance Indicator, Porphyromonas, Prediction, Solobacterium, Streptococcaceae, Streptococcus, Streptococcus Equinus, Taxonomy, Algorithm, Bacterium, Classification, Colorectal Tumor, Genetics, Microbiology, Procedures, Tumor Marker, Algorithms, Bacteria, Colorectal Neoplasms, Gastrointestinal Microbiome, Humans
WoS Q
Q1
Scopus Q
Q1

OpenCitations Citation Count
N/A
Source
Computers in Biology and Medicine
Volume
187
PlumX Metrics
Citations
Scopus : 2
Captures
Mendeley Readers : 7