Scopus İndeksli Yayınlar Koleksiyonu
Permanent URI for this collectionhttps://hdl.handle.net/20.500.12573/395
Browse
6 results
Search Results
Article Citation - Scopus: 4RCE-IFE: Recursive Cluster Elimination with Intra-Cluster Feature Elimination(PeerJ Inc., 2025-02-07) Kuzudisli, Cihan; Bakir-Gungor, Burcu; Qaqish, Bahjat; Yousef, MalikConference Object The Effect of Different Classifiers on Recursive Cluster Elimination in the Analysis of Transcriptomic Data(Institute of Electrical and Electronics Engineers Inc., 2023-10-11) Bulut, Nurten; Bakir-Güngör, Burcu; Qaqish, Bahjat F.; Yousef, MalikGene expression data with limited sample size and a large number of genes are frequently encountered in genetic studies. In such high-dimensional data, identification of genes that distinguish between disease states is a challenging task. Feature selection (FS) is a useful approach in dealing with high dimensionality. Support Vector Machines Recursive Cluster Elimination (SVM-RCE) is a technique for FS in high-dimensional data. The SVM-RCE approach has been utilized for identification of clusters of genes whose expression levels correlate with pathological state. A key step in SVM-RCE is the use of an SVM classifier to assign an area under the curve (AUC) score to each gene cluster based on its ability to predict class labels. In this study, we investigate the use of alternative classifiers in the cluster-scoring step. Specifically, we compare Support Vector Machines, Random Forest, XgBoost, Naive Bayes, and linear logistic regression. In addition to AUC score performance evaluation, the algorithms are compared in terms of the number of selected genes at different levels of clustering and in terms of the running time. © 2023 Elsevier B.V., All rights reserved.Article Citation - Scopus: 25Recursive Cluster Elimination Based Rank Function (SVM-RCE-R) Implemented in KNIME(F1000 Research Ltd, 2021-01-05) Yousef, Malik; Bakir-Güngör, Burcu; Jabeer, Amhar; Göy, Gökhan; Qureshi, Rehman A.; C Showe, Louise; C. Showe, LouiseIn our earlier study, we proposed a novel feature selection approach, Recursive Cluster Elimination with Support Vector Machines (SVM-RCE) and implemented this approach in Matlab. Interest in this approach has grown over time and several researchers have incorporated SVM-RCE into their studies, resulting in a substantial number of scientific publications. This increased interest encouraged us to reconsider how feature selection, particularly in biological datasets, can benefit from considering the relationships of those genes in the selection process, this led to our development of SVM-RCE-R. SVM-RCE-R, further enhances the capabilities of SVM-RCE by the addition of a novel user specified ranking function. This ranking function enables the user to stipulate the weights of the accuracy, sensitivity, specificity, f-measure, area under the curve and the precision in the ranking function This flexibility allows the user to select for greater sensitivity or greater specificity as needed for a specific project. The usefulness of SVM-RCE-R is further supported by development of the maTE tool which uses a similar approach to identify MicroRNA (miRNA) targets. We have also now implemented the SVM-RCE-R algorithm in Knime in order to make it easier to applyThe use of SVM-RCE-R in Knime is simple and intuitive and allows researchers to immediately begin their analysis without having to consult an information technology specialist. The input for the Knime implemented tool is an EXCEL file (or text or CSV) with a simple structure and the output is also an EXCEL file. The Knime version also incorporates new features not available in SVM-RCE. The results show that the inclusion of the ranking function has a significant impact on the performance of SVM-RCE-R. Some of the clusters that achieve high scores for a specified ranking can also have high scores in other metrics. © 2021 Elsevier B.V., All rights reserved.Article Citation - WoS: 2Citation - Scopus: 4RCE-IFE: Recursive Cluster Elimination With Intra-Cluster Feature Elimination(PeerJ Inc, 2025-02-07) Kuzudisli, Cihan; Bakir-Gungor, Burcu; Qaqish, Bahjat; Yousef, MalikThe computational and interpretational difficulties caused by the ever-increasing dimensionality of biological data generated by new technologies pose a significant challenge. Feature selection (FS) methods aim to reduce the dimension, and feature grouping has emerged as a foundation for FS techniques that seek to detect strong correlations among features and identify irrelevant features. In this work, we propose the Recursive Cluster Elimination with Intra-Cluster Feature Elimination (RCE-IFE) method that utilizes feature grouping and iterates grouping and elimination steps in a supervised context. We assess dimensionality reduction and discriminatory capabilities of RCE-IFE on various high-dimensional datasets from different biological domains. For a set of gene expression, MicroRNA (miRNA) expression, and methylation datasets, the performance of RCE-IFE is comparatively evaluated with RCE-IFE-SVM (the SVM-adapted version of RCE-IFE) and SVM-RCE. On average, RCE-IFE attains an area under the curve (AUC) of 0.85 among tested expression datasets with the fewest features and the shortest running time, while RCE-IFE-SVM (the SVM-adapted version of RCE-IFE) and SVM-RCE achieve similar AUCs of 0.84 and 0.83, respectively. RCE-IFE and SVM-RCE yield AUCs of 0.79 and 0.68, respectively when averaged over seven different metagenomics datasets, with RCE-IFE significantly reducing feature subsets. Furthermore, RCE-IFE surpasses several state-of-the-art FS methods, such as Minimum Redundancy Maximum Relevance (MRMR), Fast Correlation-Based Filter (FCBF), Information Gain (IG), Conditional Mutual Information Maximization (CMIM), SelectKBest (SKB), and eXtreme Gradient Boosting (XGBoost), obtaining an average AUC of 0.76 on five gene expression datasets. Compared with a similar tool, Multi-stage, RCE-IFE gives a similar average accuracy rate of 89.27% using fewer features on four cancer-related datasets. The comparability of RCE-IFE is also verified with other biological domain knowledge-based Grouping-Scoring-Modeling (G-S-M) tools, including mirGediNET, 3Mint, and miRcorrNet. Additionally, the biological relevance of the selected features by RCE-IFE is evaluated. The proposed method also exhibits high consistency in terms of the selected features across multiple runs. Our experimental findings imply that RCE-IFE provides robust classifier performance and significantly reduces feature size while maintaining feature relevance and consistency.Conference Object Enhancing Gene Expression Data Analysis Through SVM-Based Recursive Cluster Elimination and Weighted Center Approaches(Avestia Publishing, 2024-08) Yousef, Malik; Bulut, Nurten; Gungor, Burcu Bakir; Qaqish, Bahjat F.The complexity and high dimensionality of gene expression data pose significant challenges for effective feature selection and accurate classification in bioinformatics. This study introduces two novel algorithms, Support Vector Machine-Recursive Cluster Elimination (SVM-RCE) and its advanced version, SVM-RCE with Center Weights (SVM-RCE-CW), designed to optimize feature selection by leveraging clustering techniques and machine learning models. Both algorithms aim to reduce the feature space, thereby enhancing the interpretability and performance of classification models. We present a comprehensive comparison of these methods against traditional feature selection techniques, demonstrating their efficacy in achieving significant dimensionality reduction while maintaining or improving classification accuracy in several gene expression datasets. © 2024 Elsevier B.V., All rights reserved.Conference Object Citation - Scopus: 2Effect of Recursive Cluster Elimination With Different Clustering Algorithms Applied to Gene Expression Data(Institute of Electrical and Electronics Engineers Inc., 2023-10-11) Kuzudisli, Cihan; Bakir-Güngör, Burcu; Qaqish, Bahjat F.; Yousef, MalikFeature selection (FS) is an effective tool in dealing with high dimensionality and reducing computational cost. Support Vector Machines-Recursive Cluster Elimination (SVM-RCE) is one of several algorithms that have been developed for FS in high dimensional data. SVM-RCE involves a clustering step which originally is k-means. Using various performance metrics, three alternative algorithms are evaluated in this context; k-medoids, Hierarchical Clustering (HC), and Gaussian Mixture Model (GMM). Comparisons will be carried out on five publicly available gene expression datasets. The results show that k-means in SVM-RCE obtains higher performance than other tested algorithms in terms of classification performance. Additionally, HC shows a similar performance to k-means. Our findings show superiority of using k-means. This study can contribute to the development of SVM-RCE with different variations, leading to decrease in the number of selected genes, and an increase in prediction performance. © 2023 Elsevier B.V., All rights reserved.
