PriPath: Identifying Dysregulated Pathways From Differential Gene Expression via Grouping, Scoring, and Modeling With an Embedded Feature Selection Approach

Date

2023

Publisher

BMC

Open Access Color

GOLD

Green Open Access

Yes

Publicly Funded

No

Abstract

Background: Cell homeostasis relies on the concerted actions of genes, and dysregulated genes can lead to disease. In living organisms, genes and their products do not act alone but within networks. Subsets of these networks can be viewed as modules that provide specific functionality to an organism. The Kyoto Encyclopedia of Genes and Genomes (KEGG) systematically analyzes gene functions, proteins, and molecules and combines them into pathways. Measurements of gene expression (e.g., RNA-seq data) can be mapped to KEGG pathways to determine which modules are affected or dysregulated in a disease. However, genes acting in multiple pathways and other inherent issues complicate such analyses. Many current approaches rely on gene expression data alone and make little use of the knowledge already stored in KEGG pathways when detecting dysregulated pathways. New methods that consider more precompiled information are required for a more holistic association between gene expression and diseases.

Results: PriPath is a novel approach that applies the generic process of grouping and scoring, followed by modeling, to the analysis of gene expression with KEGG pathways. In PriPath, KEGG pathways serve as the grouping function within a machine learning algorithm that selects the most significant KEGG pathways. A machine learning model is trained on those groups to differentiate between disease and control samples. We tested PriPath on 13 gene expression datasets of various cancers and other diseases. The approach successfully assigned biologically and clinically relevant KEGG terms to the samples based on the differentially expressed genes. We compared the performance of PriPath against other, comparable tools. For each dataset, we manually checked the top results of PriPath against the literature and found that most predictions are supported by previous experimental research.

Conclusions: PriPath can thus aid in determining dysregulated pathways, which is applicable to medical diagnostics. In the future, we aim to advance this approach so that it can perform patient stratification based on gene expression and identify druggable targets, thereby covering two aspects of precision medicine.
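The grouping, scoring, and modeling idea described in the abstract can be illustrated with a minimal, standard-library sketch. This is not the PriPath implementation: the abstract does not name the classifier or the scoring rule, so the leave-one-out nearest-centroid scorer below is a stand-in chosen for brevity, and the gene names, pathway names, and expression values are hypothetical toy data.

```python
from statistics import mean

def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]

def predict_nearest_centroid(train_X, train_y, x):
    """Assign x to the class whose centroid is closest (squared Euclidean)."""
    cents = {
        label: centroid([r for r, y in zip(train_X, train_y) if y == label])
        for label in set(train_y)
    }
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    # sorted() makes tie-breaking deterministic
    return min(sorted(cents), key=lambda label: dist(cents[label], x))

def loo_accuracy(X, y):
    """Leave-one-out accuracy of the nearest-centroid classifier."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += predict_nearest_centroid(train_X, train_y, X[i]) == y[i]
    return hits / len(X)

def score_groups(expr, labels, groups):
    """Score each gene group (pathway) by how well its genes separate classes.

    expr:   dict mapping gene -> list of expression values, one per sample
    labels: list of class labels, one per sample
    groups: dict mapping group name -> list of member genes
    """
    scores = {}
    for name, genes in groups.items():
        present = [g for g in genes if g in expr]
        if not present:
            continue  # no measured gene falls into this group
        X = [[expr[g][s] for g in present] for s in range(len(labels))]
        scores[name] = loo_accuracy(X, labels)
    return scores

def select_top_groups(scores, k):
    """Return the k best-scoring group names (ties broken alphabetically)."""
    return sorted(scores, key=lambda n: (-scores[n], n))[:k]

# Hypothetical toy data: 3 disease samples followed by 3 control samples.
labels = ["disease"] * 3 + ["control"] * 3
expr = {
    "GENE_A": [5.0, 5.2, 4.9, 1.0, 1.1, 0.9],  # clearly differential
    "GENE_B": [3.0, 3.1, 2.9, 0.5, 0.6, 0.4],  # clearly differential
    "GENE_C": [2.0, 1.0, 2.0, 1.0, 2.0, 1.0],  # weak signal
    "GENE_D": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],  # constant
}
groups = {  # hypothetical stand-ins for KEGG pathways
    "pathway_1": ["GENE_A", "GENE_B"],
    "pathway_2": ["GENE_C", "GENE_D"],
}

scores = score_groups(expr, labels, groups)
top = select_top_groups(scores, k=1)
print(scores, top)
```

In this sketch the groups play the role of the grouping function, the per-group accuracy is the score, and a final model would then be trained on the genes of the selected groups only.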

Description

Allmer, Jens/0000-0002-2164-7335; Yousef, Malik/0000-0001-8780-6303

Keywords

Feature Selection, Feature Scoring, Feature Grouping, Biological Knowledge Integration, KEGG Pathway, Classification, Gene Expression, Enrichment Analysis, Machine Learning, Bioinformatics, Data Science, Data Mining, Genomics, QH301-705.5, Computer applications to medicine. Medical informatics, R858-859.7, Neoplasms, Humans, Biology (General), Genome, Gene Expression Profiling, Computational Biology, Algorithms, Research Article

Fields of Science

0301 basic medicine, 03 medical and health sciences, 0303 health sciences

WoS Q

Q1

Scopus Q

Q2

Source

BMC Bioinformatics

Volume

24

Issue

1

Sustainable Development Goals

3: Good Health and Well-Being

9: Industry, Innovation and Infrastructure

12: Responsible Consumption and Production