The Effect of Different Classifiers on Recursive Cluster Elimination in the Analysis of Transcriptomic Data
Date
2023
Publisher
Institute of Electrical and Electronics Engineers Inc.
Abstract
Gene expression data with limited sample size and
a large number of genes are frequently encountered in genetic
studies. In such high-dimensional data, identification of genes
that distinguish between disease states is a challenging task.
Feature selection (FS) is a useful approach in dealing with high
dimensionality. Support Vector Machines Recursive Cluster
Elimination (SVM-RCE) is a technique for FS in high-dimensional data. The SVM-RCE approach has been utilized
for identification of clusters of genes whose expression levels
correlate with pathological state. A key step in SVM-RCE is the
use of an SVM classifier to assign an area under the curve (AUC)
score to each gene cluster based on its ability to predict class
labels. In this study, we investigate the use of alternative
classifiers in the cluster-scoring step. Specifically, we compare
Support Vector Machines, Random Forest, XGBoost, Naive
Bayes, and linear logistic regression. In addition to AUC score
performance evaluation, the algorithms are compared in terms
of the number of selected genes at different levels of clustering
and in terms of the running time.
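
The cluster-scoring step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid scorer is only a stand-in for the classifiers compared in the study, and the names `auc`, `centroid_scorer`, `score_cluster`, and `eliminate`, the interleaved fold scheme, the `drop_fraction` parameter, and the toy data are all assumptions made for illustration.

```python
from statistics import mean

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def centroid_scorer(train_X, train_y, test_X):
    """Stand-in classifier (assumption): higher score = closer to class 1."""
    def centroid(label):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        return [mean(col) for col in zip(*rows)]
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    c0, c1 = centroid(0), centroid(1)
    return [dist2(x, c0) - dist2(x, c1) for x in test_X]

def score_cluster(X, y, genes, scorer, folds=3):
    """Mean held-out AUC using only the genes (columns) of one cluster."""
    sub = [[row[g] for g in genes] for row in X]
    fold_aucs = []
    for k in range(folds):
        test = [i for i in range(len(y)) if i % folds == k]
        train = [i for i in range(len(y)) if i % folds != k]
        scores = scorer([sub[i] for i in train], [y[i] for i in train],
                        [sub[i] for i in test])
        fold_aucs.append(auc(scores, [y[i] for i in test]))
    return mean(fold_aucs)

def eliminate(X, y, clusters, scorer, drop_fraction=0.5):
    """One elimination round: keep only the highest-scoring clusters."""
    ranked = sorted(clusters, reverse=True,
                    key=lambda c: score_cluster(X, y, clusters[c], scorer))
    keep = max(1, int(len(ranked) * (1 - drop_fraction)))
    return {c: clusters[c] for c in ranked[:keep]}

# Toy data (assumption): genes 0-1 track the class label, genes 2-3 do not.
y = [0, 0, 0, 1, 1, 1] * 2
X = [[5 * y[i] + (i % 3) * 0.1, 5 * y[i] - (i % 2) * 0.1,
      (i * 7) % 5, (i * 3) % 4] for i in range(len(y))]
clusters = {"noise": [2, 3], "informative": [0, 1]}

kept = eliminate(X, y, clusters, centroid_scorer)
print(list(kept))  # ['informative']
```

Any function with the same `(train_X, train_y, test_X)` signature could be passed as `scorer`, which is the point at which an SVM, Random Forest, or other classifier would be swapped in for comparison.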
Keywords
Recursive Cluster Elimination, Feature Selection, Clustering, Gene Expression Data Analysis
Start Page
1
End Page
5