RCE-IFE: Recursive Cluster Elimination With Intra-Cluster Feature Elimination
| dc.contributor.author | Kuzudisli, Cihan | |
| dc.contributor.author | Bakir-Gungor, Burcu | |
| dc.contributor.author | Qaqish, Bahjat | |
| dc.contributor.author | Yousef, Malik | |
| dc.date.accessioned | 2025-09-25T10:55:58Z | |
| dc.date.available | 2025-09-25T10:55:58Z | |
| dc.date.issued | 2025 | |
| dc.description | Yousef, Malik/0000-0001-8780-6303 | en_US |
| dc.description.abstract | The computational and interpretational difficulties caused by the ever-increasing dimensionality of biological data generated by new technologies pose a significant challenge. Feature selection (FS) methods aim to reduce the dimension, and feature grouping has emerged as a foundation for FS techniques that seek to detect strong correlations among features and identify irrelevant features. In this work, we propose the Recursive Cluster Elimination with Intra-Cluster Feature Elimination (RCE-IFE) method, which utilizes feature grouping and iterates grouping and elimination steps in a supervised context. We assess the dimensionality reduction and discriminatory capabilities of RCE-IFE on various high-dimensional datasets from different biological domains. For a set of gene expression, microRNA (miRNA) expression, and methylation datasets, the performance of RCE-IFE is comparatively evaluated with RCE-IFE-SVM (the SVM-adapted version of RCE-IFE) and SVM-RCE. On average, RCE-IFE attains an area under the curve (AUC) of 0.85 among the tested expression datasets with the fewest features and the shortest running time, while RCE-IFE-SVM and SVM-RCE achieve similar AUCs of 0.84 and 0.83, respectively. RCE-IFE and SVM-RCE yield AUCs of 0.79 and 0.68, respectively, when averaged over seven different metagenomics datasets, with RCE-IFE significantly reducing the size of the feature subsets. Furthermore, RCE-IFE surpasses several state-of-the-art FS methods, such as Minimum Redundancy Maximum Relevance (MRMR), Fast Correlation-Based Filter (FCBF), Information Gain (IG), Conditional Mutual Information Maximization (CMIM), SelectKBest (SKB), and eXtreme Gradient Boosting (XGBoost), obtaining an average AUC of 0.76 on five gene expression datasets. Compared with a similar tool, Multi-stage, RCE-IFE gives a similar average accuracy rate of 89.27% using fewer features on four cancer-related datasets. The comparability of RCE-IFE is also verified with other biological domain knowledge-based Grouping-Scoring-Modeling (G-S-M) tools, including mirGediNET, 3Mint, and miRcorrNet. Additionally, the biological relevance of the features selected by RCE-IFE is evaluated. The proposed method also exhibits high consistency in terms of the selected features across multiple runs. Our experimental findings imply that RCE-IFE provides robust classifier performance and significantly reduces feature size while maintaining feature relevance and consistency. | en_US |
| dc.description.sponsorship | Zefat Academic College; Abdullah Gul University Support Foundation (AGUV) | en_US |
| dc.description.sponsorship | This work has been supported by the Zefat Academic College. Burcu Bakir-Gungor's work has been supported by the Abdullah Gul University Support Foundation (AGUV). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. | en_US |
| dc.identifier.doi | 10.7717/peerj-cs.2528 | |
| dc.identifier.issn | 2376-5992 | |
| dc.identifier.scopus | 2-s2.0-85219220331 | |
| dc.identifier.uri | https://doi.org/10.7717/peerj-cs.2528 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.12573/4519 | |
| dc.language.iso | en | en_US |
| dc.publisher | PeerJ Inc | en_US |
| dc.relation.ispartof | PeerJ Computer Science | en_US |
| dc.rights | info:eu-repo/semantics/openAccess | en_US |
| dc.subject | Feature Grouping | en_US |
| dc.subject | Feature Selection | en_US |
| dc.subject | Recursive Cluster Elimination | en_US |
| dc.subject | Intra-Cluster Feature Elimination | en_US |
| dc.subject | Disease | en_US |
| dc.title | RCE-IFE: Recursive Cluster Elimination With Intra-Cluster Feature Elimination | en_US |
| dc.type | Article | en_US |
| dspace.entity.type | Publication | |
| gdc.author.id | Yousef, Malik/0000-0001-8780-6303 | |
| gdc.author.id | Bakir-Gungor, Burcu/0000-0002-2272-6270 | |
| gdc.author.scopusid | 57219838821 | |
| gdc.author.scopusid | 25932029800 | |
| gdc.author.scopusid | 6603889265 | |
| gdc.author.scopusid | 14029389000 | |
| gdc.bip.impulseclass | C5 | |
| gdc.bip.influenceclass | C5 | |
| gdc.bip.popularityclass | C4 | |
| gdc.coar.access | open access | |
| gdc.coar.type | text::journal::journal article | |
| gdc.collaboration.industrial | false | |
| gdc.description.department | Abdullah Gül University | en_US |
| gdc.description.departmenttemp | [Kuzudisli, Cihan] Hasan Kalyoncu Univ, Fac Engn, Dept Comp Engn, Gaziantep, Turkiye; [Kuzudisli, Cihan] Abdullah Gul Univ, Dept Elect & Comp Engn, Kayseri, Turkiye; [Bakir-Gungor, Burcu] Abdullah Gul Univ, Fac Engn, Dept Comp Engn, Kayseri, Turkiye; [Qaqish, Bahjat] Univ North Carolina Chapel Hill, Dept Biostat, Chapel Hill, NC USA; [Yousef, Malik] Zefat Acad Coll, Dept Informat Syst, Safed, Israel; [Yousef, Malik] Zefat Acad Coll, Galilee Digital Hlth Res Ctr, Safed, Israel | en_US |
| gdc.description.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | en_US |
| gdc.description.scopusquality | Q1 | |
| gdc.description.startpage | e2528 | |
| gdc.description.volume | 11 | en_US |
| gdc.description.woscitationindex | Science Citation Index Expanded | |
| gdc.description.wosquality | Q2 | |
| gdc.identifier.openalex | W4407261119 | |
| gdc.identifier.pmid | 40062294 | |
| gdc.identifier.wos | WOS:001479755000001 | |
| gdc.index.type | WoS | |
| gdc.index.type | Scopus | |
| gdc.index.type | PubMed | |
| gdc.oaire.accesstype | GOLD | |
| gdc.oaire.diamondjournal | false | |
| gdc.oaire.impulse | 4.0 | |
| gdc.oaire.influence | 2.6831875E-9 | |
| gdc.oaire.isgreen | true | |
| gdc.oaire.keywords | Bioinformatics | |
| gdc.oaire.keywords | Electronic computers. Computer science | |
| gdc.oaire.keywords | Feature grouping | |
| gdc.oaire.keywords | Feature selection | |
| gdc.oaire.keywords | Recursive cluster elimination | |
| gdc.oaire.keywords | Disease | |
| gdc.oaire.keywords | QA75.5-76.95 | |
| gdc.oaire.keywords | Intra-cluster feature elimination | |
| gdc.oaire.popularity | 5.39098E-9 | |
| gdc.oaire.publicfunded | false | |
| gdc.openalex.collaboration | International | |
| gdc.openalex.fwci | 1.7031 | |
| gdc.openalex.normalizedpercentile | 0.81 | |
| gdc.openalex.toppercent | TOP 10% | |
| gdc.opencitations.count | 0 | |
| gdc.plumx.mendeley | 5 | |
| gdc.plumx.newscount | 1 | |
| gdc.plumx.scopuscites | 2 | |
| gdc.scopus.citedcount | 2 | |
| gdc.virtual.author | Güngör, Burcu | |
| gdc.wos.citedcount | 1 | |
| relation.isAuthorOfPublication | e17be1f8-1c9a-45f2-bf0d-f8b348d2dba0 | |
| relation.isAuthorOfPublication.latestForDiscovery | e17be1f8-1c9a-45f2-bf0d-f8b348d2dba0 | |
| relation.isOrgUnitOfPublication | 665d3039-05f8-4a25-9a3c-b9550bffecef | |
| relation.isOrgUnitOfPublication | 52f507ab-f278-4a1f-824c-44da2a86bd51 | |
| relation.isOrgUnitOfPublication | ef13a800-4c99-4124-81e0-3e25b33c0c2b | |
| relation.isOrgUnitOfPublication.latestForDiscovery | 665d3039-05f8-4a25-9a3c-b9550bffecef |
