Prediction of Colorectal Cancer Based on Taxonomic Levels of Microorganisms and Discovery of Taxonomic Biomarkers Using the Grouping-Scoring-Modeling (G-S-M) Approach

Date

2025

Publisher

Elsevier Ltd

Green Open Access

No

Publicly Funded

No
Impulse
Average
Influence
Average
Popularity
Average

Abstract

Colorectal cancer (CRC) is one of the most prevalent forms of cancer globally. The human gut microbiome plays an important role in the development of CRC and serves as a biomarker for early detection and treatment. This research focuses on identifying potential taxonomic biomarkers of CRC using a grouping-based feature selection method. It also investigates the effect of incorporating biological domain knowledge into the feature selection process when identifying CRC-associated microorganisms. Conventional feature selection techniques often fail to leverage existing biological knowledge during metagenomic data analysis. To address this gap, we propose a taxonomy-based Grouping-Scoring-Modeling (G-S-M) method that integrates biological domain knowledge into feature grouping and selection. In this study, using metagenomic data related to CRC, classification is performed at three taxonomic levels (genus, family, and order). The MetaPhlAn tool is employed to determine the relative abundance values of species in each sample. Comparative performance analyses involve six feature selection methods and four classification algorithms. When evaluated on two CRC-associated metagenomic datasets, the highest performance, an AUC of 0.90, is observed at the genus level. At this level, 7 of the top 10 groups (Parvimonas, Peptostreptococcus, Fusobacterium, Gemella, Streptococcus, Porphyromonas, and Solobacterium) were identified in both datasets. Moreover, the microorganisms identified at the genus, family, and order levels are discussed in detail with reference to the CRC-related metagenomic literature. This study not only contributes to our understanding of CRC development but also highlights the applicability of the taxonomy-based G-S-M method to other diseases. © 2025 Elsevier B.V., All rights reserved.
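
To make the grouping and scoring idea in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes species names of the form `Genus_species` (a simplification of MetaPhlAn lineage strings), uses made-up toy abundances, and swaps in a simple rank-based AUC on summed group abundance as the group score; the paper's actual scoring and modeling steps are not specified in this record. All function and variable names are hypothetical.

```python
from collections import defaultdict


def group_by_genus(species_names):
    """Map each genus to the column indices of its species.

    Assumption: the text before the first underscore is the genus.
    """
    groups = defaultdict(list)
    for idx, name in enumerate(species_names):
        groups[name.split("_", 1)[0]].append(idx)
    return dict(groups)


def auc(scores, labels):
    """Rank-based AUC: chance that a case outscores a control (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def score_groups(abundance, labels, groups):
    """Score each genus by the AUC of its summed relative abundance.

    The score is direction-agnostic, so enrichment and depletion in
    cases both count as signal. This is a stand-in scoring rule.
    """
    scored = {}
    for genus, cols in groups.items():
        summed = [sum(row[c] for c in cols) for row in abundance]
        a = auc(summed, labels)
        scored[genus] = max(a, 1.0 - a)
    return scored


def top_groups(scored, k):
    """Return the k best-scoring genera (ties broken alphabetically)."""
    return sorted(scored, key=lambda g: (-scored[g], g))[:k]


# Toy data: rows are samples, columns are species (values are invented).
species = [
    "Fusobacterium_nucleatum",
    "Fusobacterium_periodonticum",
    "Parvimonas_micra",
    "Bacteroides_vulgatus",
]
abundance = [
    [0.30, 0.10, 0.20, 0.40],
    [0.25, 0.05, 0.10, 0.60],
    [0.02, 0.01, 0.12, 0.85],
    [0.00, 0.03, 0.02, 0.95],
]
labels = [1, 1, 0, 0]  # 1 = CRC case, 0 = control (toy labels)

groups = group_by_genus(species)
scored = score_groups(abundance, labels, groups)
selected = top_groups(scored, k=2)
print(selected)
```

In a fuller pipeline, the species columns belonging to the selected groups would then be passed to a classifier for the modeling step, and the same grouping function could be adapted to family or order by parsing a different rank from the lineage.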

Keywords

Colorectal Cancer, Human Gut Microbiome, Machine Learning, Microbiomegsm, Taxonomic Biomarkers, Biomarkers, Tumor, Lung Cancer, Features Selection, Human Guts, Machine-Learning, Metagenomics, Microbiome, Scoring Models, Taxonomic Biomarker, Article, Desulfovibrionales, DNA Damage, Feature Selection, Fusobacteriaceae, Fusobacterium, Gemella, Human, Intestine Flora, Lachnospiraceae, Lactobacillales, Microbial Community, Nonhuman, Parvimonas, Peptostreptococcaceae, Peptostreptococcus, Performance Indicator, Porphyromonas, Prediction, Solobacterium, Streptococcaceae, Streptococcus, Streptococcus Equinus, Taxonomy, Algorithm, Bacterium, Classification, Colorectal Tumor, Genetics, Microbiology, Procedures, Tumor Marker, Algorithms, Bacteria, Colorectal Neoplasms, Gastrointestinal Microbiome, Humans

WoS Q

Q1

Scopus Q

Q1

Source

Computers in Biology and Medicine

Volume

187

PlumX Metrics
Citations

Scopus : 2

Captures

Mendeley Readers : 7

OpenAlex FWCI
0.8536