Effect of Recursive Cluster Elimination with Different Clustering Algorithms Applied to Gene Expression Data

Abstract

Feature selection (FS) is an effective tool for dealing with high dimensionality and reducing computational cost. Support Vector Machines with Recursive Cluster Elimination (SVM-RCE) is one of several algorithms developed for FS in high-dimensional data. SVM-RCE includes a clustering step, which in the original algorithm is k-means. Using various performance metrics, three alternative clustering algorithms are evaluated in this step: k-medoids, Hierarchical Clustering (HC), and Gaussian Mixture Model (GMM). Comparisons are carried out on five publicly available gene expression datasets. The results show that SVM-RCE with k-means achieves higher classification performance than the other tested algorithms, although HC performs similarly to k-means. Overall, our findings favor k-means. This study can contribute to the development of SVM-RCE variants that reduce the number of selected genes and increase prediction performance.
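To make the role of the clustering step concrete, the following is a minimal sketch of a recursive cluster elimination loop in which the clustering routine is a pluggable argument. It is not the authors' implementation. All function and parameter names (`recursive_cluster_elimination`, `cluster_fn`, `score_fn`, `drop_frac`) are hypothetical, the k-means is a toy version with deterministic initialization, and the cluster score is a leave-one-out nearest-centroid accuracy used here as a simple stand-in for the SVM cross-validation score that SVM-RCE would use.

```python
from statistics import mean


def sqdist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_genes(profiles, k, iters=20):
    """Toy k-means over gene profiles (one list of sample values per gene).
    Centers start at evenly spaced profiles (an assumption made so the
    example is deterministic). Returns one cluster label per gene."""
    n = len(profiles)
    centers = [list(profiles[i * n // k]) for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sqdist(p, centers[c]))
                  for p in profiles]
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # leave an empty cluster's center unchanged
                centers[c] = [mean(col) for col in zip(*members)]
    return labels


def loo_centroid_accuracy(X, y, genes):
    """Stand-in for an SVM cross-validation score: leave-one-out
    nearest-class-centroid accuracy using only the given genes."""
    hits = 0
    for i, row in enumerate(X):
        dists = {}
        for cls in set(y):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == cls]
            if rows:
                centroid = [mean(r[g] for r in rows) for g in genes]
                dists[cls] = sqdist([row[g] for g in genes], centroid)
        hits += min(dists, key=dists.get) == y[i]
    return hits / len(X)


def recursive_cluster_elimination(X, y, cluster_fn, score_fn,
                                  n_clusters=4, drop_frac=0.5,
                                  min_clusters=1):
    """Cluster the surviving genes, score each cluster, drop the
    lowest-scoring fraction of clusters, and repeat with fewer clusters.
    X is samples x genes; returns the indices of the surviving genes."""
    genes = list(range(len(X[0])))
    while True:
        k = min(n_clusters, len(genes))
        profiles = [[row[g] for row in X] for g in genes]
        clusters = {}
        for g, lab in zip(genes, cluster_fn(profiles, k)):
            clusters.setdefault(lab, []).append(g)
        ranked = sorted(clusters.values(),
                        key=lambda c: score_fn(X, y, c), reverse=True)
        if len(ranked) <= min_clusters:
            return sorted(genes)
        keep = max(min_clusters, int(len(ranked) * (1 - drop_frac)))
        genes = [g for cluster in ranked[:keep] for g in cluster]
        n_clusters = keep


# Toy data (made up for illustration): genes 0 and 1 track the class
# label, genes 2 and 3 do not.
X = [[0, 0, 2, 2], [0, 1, 2, 3], [0, 0, 2, 2],
     [5, 5, 2, 3], [5, 4, 2, 2], [5, 5, 2, 3]]
y = [0, 0, 0, 1, 1, 1]
selected = recursive_cluster_elimination(
    X, y, kmeans_genes, loo_centroid_accuracy, n_clusters=2)
print(selected)
```

Under this design, comparing k-medoids, HC, or GMM would amount to passing a different `cluster_fn` with the same signature while leaving the scoring and elimination loop unchanged.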

Keywords

Recursive Cluster Elimination, Feature Selection, Clustering, Gene Expression Data Analysis

Start Page

1

End Page

4