CCPred: Global and Population-Specific Colorectal Cancer Prediction and Metagenomic Biomarker Identification at Different Molecular Levels Using Machine Learning Techniques


Date

2024

Publisher

Elsevier Ltd

Green Open Access

No

Publicly Funded

No
Impulse
Top 10%
Influence
Average
Popularity
Top 10%

Abstract

Colorectal cancer (CRC) is the third most common cancer globally and the second leading cause of cancer-related deaths. Recent research highlights the pivotal role of the gut microbiota in CRC development and progression. Understanding the complex interplay between disease development and metagenomic data is essential for CRC diagnosis and treatment. Current computational models employ machine learning to identify metagenomic biomarkers associated with CRC, yet their accuracy could be improved by taking a more holistic view of the underlying biology. This study evaluates CRC-associated metagenomic data at the species, enzyme, and pathway levels through global and population-specific analyses. These analyses use relative abundance values from human gut microbiome sequencing data to build robust classification models for disease prediction and biomarker identification. For global CRC prediction and biomarker identification, the features identified by the SelectKBest (SKB), Information Gain (IG), and Extreme Gradient Boosting (XGBoost) methods are combined. The population-based analysis includes within-population, leave-one-dataset-out (LODO), and cross-population approaches. Four classification algorithms are employed for CRC classification. Globally, Random Forest achieved an AUC of 0.83 for species data, 0.78 for enzyme data, and 0.76 for pathway data. On the global scale, potential taxonomic biomarkers include Ruthenibacterium lactatiformans; enzyme biomarkers include RNA 2′,3′-cyclic 3′-phosphodiesterase; and pathway biomarkers include the pyruvate fermentation to acetone pathway. This study underscores the potential of machine learning models trained on metagenomic data for improved disease prediction and biomarker discovery. The proposed model and associated files are available at https://github.com/TemizMus/CCPRED. © 2024 Elsevier B.V., All rights reserved.
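
The feature-combination and LODO steps mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the CCPRED repository for that) and rests on several assumptions: the data are synthetic, mutual information stands in for Information Gain, scikit-learn's GradientBoostingClassifier stands in for XGBoost so that the example needs only scikit-learn and NumPy, the selected features are merged by a simple union, and the names `select_features` and `lodo_auc` and the value k=10 are hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneGroupOut


def top_k(scores, k):
    """Return the indices of the k highest-scoring features as a set."""
    return set(np.argsort(scores)[-k:].tolist())


def select_features(X, y, k=10, seed=0):
    """Union of the top-k features from three rankers (assumed merge rule)."""
    skb = SelectKBest(f_classif, k=k).fit(X, y)
    skb_idx = set(np.flatnonzero(skb.get_support()).tolist())
    # Mutual information is used here as the information-gain estimate.
    ig_idx = top_k(mutual_info_classif(X, y, random_state=seed), k)
    # Stand-in for XGBoost: gradient-boosted trees from scikit-learn.
    gb = GradientBoostingClassifier(random_state=seed).fit(X, y)
    gb_idx = top_k(gb.feature_importances_, k)
    return sorted(skb_idx | ig_idx | gb_idx)


def lodo_auc(X, y, cohorts, k=10, seed=0):
    """Leave-one-dataset-out: hold out each cohort once and report its AUC."""
    aucs = {}
    for train, test in LeaveOneGroupOut().split(X, y, groups=cohorts):
        # Select features on the training cohorts only to avoid leakage.
        cols = select_features(X[train], y[train], k=k, seed=seed)
        clf = RandomForestClassifier(n_estimators=200, random_state=seed)
        clf.fit(X[train][:, cols], y[train])
        prob = clf.predict_proba(X[test][:, cols])[:, 1]
        aucs[str(cohorts[test][0])] = roc_auc_score(y[test], prob)
    return aucs


# Synthetic stand-in for a samples-by-taxa relative abundance table.
rng = np.random.default_rng(0)
n, p = 240, 40
y = rng.integers(0, 2, size=n)                   # 0 = control, 1 = CRC
counts = rng.gamma(shape=1.0, scale=1.0, size=(n, p))
counts[:, :5] += 1.5 * y[:, None]                # five features enriched in cases
X = counts / counts.sum(axis=1, keepdims=True)   # rows sum to 1
cohorts = np.repeat(np.array(["A", "B", "C", "D"]), n // 4)

aucs = lodo_auc(X, y, cohorts)
for name, auc in aucs.items():
    print(f"held-out cohort {name}: AUC = {auc:.2f}")
```

Selecting features inside each LODO split, rather than once on the pooled data, is a design choice of this sketch that keeps the held-out cohort out of the selection step; the abstract does not state how the study handled this.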

Keywords

Biomarkers, Colorectal Cancer, Enzyme, Machine Learning, Metagenomic, Microbiome, Pathway, Species, Phosphodiesterase, Pyruvic Acid, Biomarkers, Tumor, Adversarial Machine Learning, Lung Cancer, Biomarker Identification, Cancer Prediction, Machine Learning Techniques, Metagenomics, Molecular Levels, Plant Diseases, RNA 2',3' Cyclic 3' Phosphodiesterase, Tumor Marker, Unclassified Drug, Adult, Aged, Anaerobic Bacterium, Area Under The Curve, Article, Classification Algorithm, Computer Model, Controlled Study, Decision Tree, Diagnostic Accuracy, Feature Selection Algorithm, Female, Human, Intestine Flora, Leave One Out Cross Validation, Major Clinical Study, Male, Monte Carlo Cross Validation, Population Research, Prediction, Random Forest, Ruthenibacterium Lactatiformans, Sensitivity and Specificity, Colorectal Tumor, Genetics, Metabolism, Metagenome, Microbiology, Procedures, Software, Colorectal Neoplasms, Gastrointestinal Microbiome, Humans

WoS Q

Q1

Scopus Q

Q1
OpenCitations Citation Count
3

Source

Computers in Biology and Medicine

Volume

182

PlumX Metrics
Citations

Scopus : 4

Captures

Mendeley Readers : 14

SCOPUS™ Citations

4

checked on Mar 06, 2026

Page Views

1

checked on Mar 06, 2026

Downloads

9

checked on Mar 06, 2026

OpenAlex FWCI
1.2503

Sustainable Development Goals

3

GOOD HEALTH AND WELL-BEING