RCE-IFE: Recursive Cluster Elimination With Intra-Cluster Feature Elimination

dc.contributor.author Kuzudisli, Cihan
dc.contributor.author Bakir-Gungor, Burcu
dc.contributor.author Qaqish, Bahjat
dc.contributor.author Yousef, Malik
dc.date.accessioned 2025-09-25T10:55:58Z
dc.date.available 2025-09-25T10:55:58Z
dc.date.issued 2025
dc.description Yousef, Malik/0000-0001-8780-6303 en_US
dc.description.abstract The computational and interpretational difficulties caused by the ever-increasing dimensionality of biological data generated by new technologies pose a significant challenge. Feature selection (FS) methods aim to reduce the dimension, and feature grouping has emerged as a foundation for FS techniques that seek to detect strong correlations among features and identify irrelevant features. In this work, we propose the Recursive Cluster Elimination with Intra-Cluster Feature Elimination (RCE-IFE) method, which utilizes feature grouping and iterates grouping and elimination steps in a supervised context. We assess the dimensionality reduction and discriminatory capabilities of RCE-IFE on various high-dimensional datasets from different biological domains. For a set of gene expression, microRNA (miRNA) expression, and methylation datasets, the performance of RCE-IFE is comparatively evaluated with RCE-IFE-SVM (the SVM-adapted version of RCE-IFE) and SVM-RCE. On average, RCE-IFE attains an area under the curve (AUC) of 0.85 among the tested expression datasets with the fewest features and the shortest running time, while RCE-IFE-SVM and SVM-RCE achieve similar AUCs of 0.84 and 0.83, respectively. RCE-IFE and SVM-RCE yield AUCs of 0.79 and 0.68, respectively, when averaged over seven different metagenomics datasets, with RCE-IFE significantly reducing feature subsets. Furthermore, RCE-IFE surpasses several state-of-the-art FS methods, such as Minimum Redundancy Maximum Relevance (MRMR), Fast Correlation-Based Filter (FCBF), Information Gain (IG), Conditional Mutual Information Maximization (CMIM), SelectKBest (SKB), and eXtreme Gradient Boosting (XGBoost), obtaining an average AUC of 0.76 on five gene expression datasets. Compared with a similar tool, Multi-stage, RCE-IFE gives a similar average accuracy rate of 89.27% using fewer features on four cancer-related datasets.
The comparability of RCE-IFE is also verified with other biological domain knowledge-based Grouping-Scoring-Modeling (G-S-M) tools, including mirGediNET, 3Mint, and miRcorrNet. Additionally, the biological relevance of the features selected by RCE-IFE is evaluated. The proposed method also exhibits high consistency in terms of the selected features across multiple runs. Our experimental findings imply that RCE-IFE provides robust classifier performance and significantly reduces feature size while maintaining feature relevance and consistency. en_US
dc.description.sponsorship Zefat Academic College; Abdullah Gul University Support Foundation (AGUV) en_US
dc.description.sponsorship This work has been supported by the Zefat Academic College. Burcu Bakir-Gungor's work has been supported by the Abdullah Gul University Support Foundation (AGUV). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. en_US
dc.identifier.doi 10.7717/peerj-cs.2528
dc.identifier.issn 2376-5992
dc.identifier.scopus 2-s2.0-85219220331
dc.identifier.uri https://doi.org/10.7717/peerj-cs.2528
dc.identifier.uri https://hdl.handle.net/20.500.12573/4519
dc.language.iso en en_US
dc.publisher PeerJ Inc en_US
dc.relation.ispartof PeerJ Computer Science en_US
dc.rights info:eu-repo/semantics/openAccess en_US
dc.subject Feature Grouping en_US
dc.subject Feature Selection en_US
dc.subject Recursive Cluster Elimination en_US
dc.subject Intra-Cluster Feature Elimination en_US
dc.subject Disease en_US
dc.title RCE-IFE: Recursive Cluster Elimination With Intra-Cluster Feature Elimination en_US
dc.type Article en_US
dspace.entity.type Publication
gdc.author.id Yousef, Malik/0000-0001-8780-6303
gdc.author.id Bakir-Gungor, Burcu/0000-0002-2272-6270
gdc.author.scopusid 57219838821
gdc.author.scopusid 25932029800
gdc.author.scopusid 6603889265
gdc.author.scopusid 14029389000
gdc.bip.impulseclass C5
gdc.bip.influenceclass C5
gdc.bip.popularityclass C4
gdc.coar.access open access
gdc.coar.type text::journal::journal article
gdc.collaboration.industrial false
gdc.description.department Abdullah Gül University en_US
gdc.description.departmenttemp [Kuzudisli, Cihan] Hasan Kalyoncu Univ, Fac Engn, Dept Comp Engn, Gaziantep, Turkiye; [Kuzudisli, Cihan] Abdullah Gul Univ, Dept Elect & Comp Engn, Kayseri, Turkiye; [Bakir-Gungor, Burcu] Abdullah Gul Univ, Fac Engn, Dept Comp Engn, Kayseri, Turkiye; [Qaqish, Bahjat] Univ North Carolina Chapel Hill, Dept Biostat, Chapel Hill, NC USA; [Yousef, Malik] Zefat Acad Coll, Dept Informat Syst, Safed, Israel; [Yousef, Malik] Zefat Acad Coll, Galilee Digital Hlth Res Ctr, Safed, Israel en_US
gdc.description.publicationcategory Article - International Refereed Journal - Institutional Faculty Member en_US
gdc.description.scopusquality Q1
gdc.description.startpage e2528
gdc.description.volume 11 en_US
gdc.description.woscitationindex Science Citation Index Expanded
gdc.description.wosquality Q2
gdc.identifier.openalex W4407261119
gdc.identifier.pmid 40062294
gdc.identifier.wos WOS:001479755000001
gdc.index.type WoS
gdc.index.type Scopus
gdc.index.type PubMed
gdc.oaire.accesstype GOLD
gdc.oaire.diamondjournal false
gdc.oaire.impulse 4.0
gdc.oaire.influence 2.6831875E-9
gdc.oaire.isgreen true
gdc.oaire.keywords Bioinformatics
gdc.oaire.keywords Electronic computers. Computer science
gdc.oaire.keywords Feature grouping
gdc.oaire.keywords Feature selection
gdc.oaire.keywords Recursive cluster elimination
gdc.oaire.keywords Disease
gdc.oaire.keywords QA75.5-76.95
gdc.oaire.keywords Intra-cluster feature elimination
gdc.oaire.popularity 5.39098E-9
gdc.oaire.publicfunded false
gdc.openalex.collaboration International
gdc.openalex.fwci 1.7031
gdc.openalex.normalizedpercentile 0.81
gdc.openalex.toppercent TOP 10%
gdc.opencitations.count 0
gdc.plumx.mendeley 5
gdc.plumx.newscount 1
gdc.plumx.scopuscites 2
gdc.scopus.citedcount 2
gdc.virtual.author Güngör, Burcu
gdc.wos.citedcount 1
relation.isAuthorOfPublication e17be1f8-1c9a-45f2-bf0d-f8b348d2dba0
relation.isAuthorOfPublication.latestForDiscovery e17be1f8-1c9a-45f2-bf0d-f8b348d2dba0
relation.isOrgUnitOfPublication 665d3039-05f8-4a25-9a3c-b9550bffecef
relation.isOrgUnitOfPublication 52f507ab-f278-4a1f-824c-44da2a86bd51
relation.isOrgUnitOfPublication ef13a800-4c99-4124-81e0-3e25b33c0c2b
relation.isOrgUnitOfPublication.latestForDiscovery 665d3039-05f8-4a25-9a3c-b9550bffecef
