Application of Biological Domain Knowledge Based Feature Selection on Gene Expression Data

dc.contributor.author Yousef, Malik
dc.contributor.author Kumar, Abhishek
dc.contributor.author Bakir-Gungor, Burcu
dc.date.accessioned 2025-09-25T10:41:02Z
dc.date.available 2025-09-25T10:41:02Z
dc.date.issued 2021
dc.description Yousef, Malik/0000-0001-8780-6303; Kumar, Abhishek/0000-0003-4172-4059; en_US
dc.description.abstract In the last two decades, there have been massive advancements in high-throughput technologies, which have resulted in the exponential growth of public repositories of gene expression datasets for various phenotypes. Biomarkers can be identified by comparing gene expression levels under different conditions, such as disease vs. control, treated vs. not treated, drug A vs. drug B, etc. This task corresponds to a well-studied problem in machine learning, namely feature selection. In biological data analysis, most computational feature selection methodologies were adopted from other fields, without considering the nature of biological data. Thus, integrative approaches that utilize biological knowledge while performing feature selection are necessary for this kind of data. The main idea behind the integrative gene selection process is to generate a ranked list of genes considering both the statistical metrics that are applied to the gene expression data and the biological background information that is provided as external datasets. One of the main goals of this review is to explore the existing methods that integrate different types of information in order to improve the identification of the biomolecular signatures of diseases and the discovery of new potential targets for treatment. These integrative approaches are expected to aid the prediction, diagnosis, and treatment of diseases, as well as to shed light on disease state dynamics and the mechanisms of disease onset and progression. The integration of various types of biological information will necessitate the development of novel techniques for integration and data analysis. Another aim of this review is to encourage the bioinformatics community to develop new approaches for searching and determining significant groups/clusters of features based on one or more biological grouping functions. en_US
dc.description.sponsorship Ramalingaswami Re-entry Faculty Fellowship [BT/RLF/Re-entry/38/2017]; Abdullah Gul University Support Foundation (AGUV); Zefat Academic College en_US
dc.description.sponsorship A.K. is the recipient of a Ramalingaswami Re-entry Faculty Fellowship (Grant: BT/RLF/Re-entry/38/2017). The work of B.B.G. has been supported by the Abdullah Gul University Support Foundation (AGUV). The work of M.Y. has been supported by the Zefat Academic College. en_US
dc.description.sponsorship Abdullah Gul University Support Foundation; Zefat Academic College
dc.identifier.doi 10.3390/e23010002
dc.identifier.issn 1099-4300
dc.identifier.scopus 2-s2.0-85098633693
dc.identifier.uri https://doi.org/10.3390/e23010002
dc.identifier.uri https://hdl.handle.net/20.500.12573/3313
dc.language.iso en en_US
dc.publisher MDPI en_US
dc.relation.ispartof Entropy en_US
dc.rights info:eu-repo/semantics/openAccess en_US
dc.subject Feature Selection en_US
dc.subject Feature Ranking en_US
dc.subject Grouping en_US
dc.subject Clustering en_US
dc.subject Biological Knowledge en_US
dc.title Application of Biological Domain Knowledge Based Feature Selection on Gene Expression Data en_US
dc.type Article en_US
dspace.entity.type Publication
gdc.author.id Yousef, Malik/0000-0001-8780-6303
gdc.author.id Kumar, Abhishek/0000-0003-4172-4059
gdc.author.scopusid 14029389000
gdc.author.scopusid 25031697300
gdc.author.scopusid 25932029800
gdc.author.wosid Kumar, Abhishek/A-8727-2012
gdc.author.wosid Kumar, Abhishek/Hkm-8254-2023
gdc.bip.impulseclass C3
gdc.bip.influenceclass C4
gdc.bip.popularityclass C3
gdc.coar.access open access
gdc.coar.type text::journal::journal article
gdc.collaboration.industrial false
gdc.description.department Abdullah Gül University en_US
gdc.description.departmenttemp [Yousef, Malik] Zefat Acad Coll, Dept Informat Syst, IL-13206 Safed, Israel; [Yousef, Malik] Zefat Acad Coll, Galilee Digital Hlth Res Ctr GDH, IL-13206 Safed, Israel; [Kumar, Abhishek] Inst Bioinformat, Int Technol Pk, Bangalore 560066, Karnataka, India; [Kumar, Abhishek] Manipal Acad Higher Educ MAHE, Manipal 576104, India; [Bakir-Gungor, Burcu] Abdullah Gul Univ, Dept Comp Engn, Fac Engn, TR-38080 Kayseri, Turkey en_US
gdc.description.endpage 15
gdc.description.issue 1 en_US
gdc.description.publicationcategory Article - International Refereed Journal - Institutional Faculty Member en_US
gdc.description.scopusquality Q2
gdc.description.startpage 2
gdc.description.volume 23 en_US
gdc.description.woscitationindex Science Citation Index Expanded
gdc.description.wosquality Q2
gdc.identifier.openalex W3112932631
gdc.identifier.pmid 33374969
gdc.identifier.wos WOS:000610127500001
gdc.index.type WoS
gdc.index.type Scopus
gdc.index.type PubMed
gdc.oaire.accesstype GOLD
gdc.oaire.diamondjournal false
gdc.oaire.downloads 96
gdc.oaire.impulse 36.0
gdc.oaire.influence 4.5041473E-9
gdc.oaire.isgreen true
gdc.oaire.keywords molecular_biology
gdc.oaire.keywords Science
gdc.oaire.keywords Physics
gdc.oaire.keywords QC1-999
gdc.oaire.keywords Q
gdc.oaire.keywords biological knowledge
gdc.oaire.keywords Review
gdc.oaire.keywords Astrophysics
gdc.oaire.keywords feature ranking
gdc.oaire.keywords QB460-466
gdc.oaire.keywords feature selection
gdc.oaire.keywords grouping
gdc.oaire.keywords clustering
gdc.oaire.popularity 4.5212737E-8
gdc.oaire.publicfunded false
gdc.oaire.sciencefields 0301 basic medicine
gdc.oaire.sciencefields 0206 medical engineering
gdc.oaire.sciencefields 02 engineering and technology
gdc.oaire.sciencefields 03 medical and health sciences
gdc.oaire.views 162
gdc.openalex.collaboration International
gdc.openalex.fwci 4.0597
gdc.openalex.normalizedpercentile 0.95
gdc.openalex.toppercent TOP 10%
gdc.opencitations.count 50
gdc.plumx.crossrefcites 46
gdc.plumx.facebookshareslikecount 21
gdc.plumx.mendeley 43
gdc.plumx.pubmedcites 23
gdc.plumx.scopuscites 63
gdc.scopus.citedcount 64
gdc.virtual.author Güngör, Burcu
gdc.wos.citedcount 52
relation.isAuthorOfPublication e17be1f8-1c9a-45f2-bf0d-f8b348d2dba0
relation.isAuthorOfPublication.latestForDiscovery e17be1f8-1c9a-45f2-bf0d-f8b348d2dba0
relation.isOrgUnitOfPublication 665d3039-05f8-4a25-9a3c-b9550bffecef
relation.isOrgUnitOfPublication 52f507ab-f278-4a1f-824c-44da2a86bd51
relation.isOrgUnitOfPublication ef13a800-4c99-4124-81e0-3e25b33c0c2b
relation.isOrgUnitOfPublication.latestForDiscovery 665d3039-05f8-4a25-9a3c-b9550bffecef

Files

Original bundle

Name:
Application of Biological Domain Knowledge Based Feature Selection on Gene Expression Data.pdf
Size:
1.16 MB
Format:
Adobe Portable Document Format
Description:
Article File

License bundle

Name:
license.txt
Size:
1.44 KB
Format:
Item-specific license agreed upon to submission
Description: