Scopus İndeksli Yayınlar Koleksiyonu (Scopus-Indexed Publications Collection)

Permanent URI for this collection: https://hdl.handle.net/20.500.12573/395

Search Results

Now showing 1 - 5 of 5
  • Article
    Citation - WoS: 20
    Citation - Scopus: 24
    miRdisNET: Discovering MicroRNA Biomarkers That Are Associated With Diseases Utilizing Biological Knowledge-Based Machine Learning
    (Frontiers Media S.A., 2023-01-12) Jabeer, Amhar; Temiz, Mustafa; Bakir-Gungor, Burcu; Yousef, Malik
    In recent years, biological experiments and accumulating evidence have shown that microRNAs (miRNAs) play an important role in the diagnosis and treatment of complex human diseases. Therefore, to diagnose and treat these diseases, it is necessary to reveal the associations between a specific disease and its related miRNAs. Although current computational models based on machine learning attempt to determine miRNA-disease associations, the accuracy of these models needs to be improved, and candidate miRNA-disease relations need to be evaluated from a biological perspective. In this paper, we propose a computational model named miRdisNET to predict potential miRNA-disease associations. Specifically, miRdisNET requires two types of data as input files: miRNA expression profiles and known disease-miRNA associations. First, we generate subsets of specific diseases by applying the grouping component. These subsets contain miRNA expressions with class labels associated with each specific disease. Then, we assign an importance score to each group by using a machine learning method for classification. Finally, we apply a modeling component and obtain outputs. One of the most important outputs of miRdisNET is the performance of miRNA-disease prediction. Compared with the existing methods, miRdisNET obtained the highest AUC value of 0.9998. Another output of miRdisNET is a list of significant miRNAs for the disease under study. The miRNAs identified by miRdisNET are validated against gold-standard databases that hold information on experimentally verified miRNA-disease associations. miRdisNET has been developed to predict candidate miRNAs for new diseases, where the miRNA-disease relation is not yet known. In addition, miRdisNET presents candidate disease-disease associations based on shared miRNA knowledge. The miRdisNET tool and other supplementary files are publicly available at: .
  • Conference Object
    The Effect of Different Classifiers on Recursive Cluster Elimination in the Analysis of Transcriptomic Data
    (Institute of Electrical and Electronics Engineers Inc., 2023-10-11) Bulut, Nurten; Bakir-Güngör, Burcu; Qaqish, Bahjat F.; Yousef, Malik
    Gene expression data with limited sample size and a large number of genes are frequently encountered in genetic studies. In such high-dimensional data, identification of genes that distinguish between disease states is a challenging task. Feature selection (FS) is a useful approach in dealing with high dimensionality. Support Vector Machines Recursive Cluster Elimination (SVM-RCE) is a technique for FS in high-dimensional data. The SVM-RCE approach has been utilized for identification of clusters of genes whose expression levels correlate with pathological state. A key step in SVM-RCE is the use of an SVM classifier to assign an area under the curve (AUC) score to each gene cluster based on its ability to predict class labels. In this study, we investigate the use of alternative classifiers in the cluster-scoring step. Specifically, we compare Support Vector Machines, Random Forest, XGBoost, Naive Bayes, and linear logistic regression. In addition to evaluating performance by AUC score, the algorithms are compared in terms of the number of selected genes at different levels of clustering and in terms of running time.
  • Article
    Citation - WoS: 16
    Citation - Scopus: 21
    GeNetOntology: Identifying Affected Gene Ontology Terms via Grouping, Scoring, and Modeling of Gene Expression Data Utilizing Biological Knowledge-Based Machine Learning
    (Frontiers Media S.A., 2023-08-21) Ersoz, Nur Sebnem; Bakir-Gungor, Burcu; Yousef, Malik
    Introduction: Identifying significant sets of genes that are up/downregulated under specific conditions is vital to understand disease development mechanisms at the molecular level. Along this line, in order to analyze transcriptomic data, several computational feature selection (i.e., gene selection) methods have been proposed. On the other hand, uncovering the core functions of the selected genes provides a deep understanding of diseases. In order to address this problem, biological domain knowledge-based feature selection methods have been proposed. Unlike computational gene selection approaches, these domain knowledge-based methods take the underlying biology into account and integrate knowledge from external biological resources. Gene Ontology (GO) is one such biological resource that provides ontology terms for defining the molecular function, cellular component, and biological process of the gene product. Methods: In this study, we developed a tool named GeNetOntology which performs GO-based feature selection for gene expression data analysis. In the proposed approach, the process of Grouping, Scoring, and Modeling (G-S-M) is used to identify significant GO terms. GO information has been used as the grouping information, which has been embedded into a machine learning (ML) algorithm to select informative ontology terms. The genes annotated with the selected ontology terms have been used in the training part to carry out the classification task of the ML model. The output is an important set of ontologies for the two-class classification task applied to gene expression data for a given phenotype. Results: Our approach has been tested on 11 different gene expression datasets, and the results showed that GeNetOntology successfully identified important disease-related ontology terms to be used in the classification model. Discussion: GeNetOntology will assist geneticists and scientists to identify a range of disease-related genes and ontologies in transcriptomic data analysis, and it will also help doctors design diagnosis platforms and improve patient treatment plans.
  • Conference Object
    Enhancing Gene Expression Data Analysis Through SVM-Based Recursive Cluster Elimination and Weighted Center Approaches
    (Avestia Publishing, 2024-08) Yousef, Malik; Bulut, Nurten; Gungor, Burcu Bakir; Qaqish, Bahjat F.
    The complexity and high dimensionality of gene expression data pose significant challenges for effective feature selection and accurate classification in bioinformatics. This study introduces two novel algorithms, Support Vector Machine-Recursive Cluster Elimination (SVM-RCE) and its advanced version, SVM-RCE with Center Weights (SVM-RCE-CW), designed to optimize feature selection by leveraging clustering techniques and machine learning models. Both algorithms aim to reduce the feature space, thereby enhancing the interpretability and performance of classification models. We present a comprehensive comparison of these methods against traditional feature selection techniques, demonstrating their efficacy in achieving significant dimensionality reduction while maintaining or improving classification accuracy in several gene expression datasets.
  • Conference Object
    Citation - Scopus: 2
    Effect of Recursive Cluster Elimination With Different Clustering Algorithms Applied to Gene Expression Data
    (Institute of Electrical and Electronics Engineers Inc., 2023-10-11) Kuzudisli, Cihan; Bakir-Güngör, Burcu; Qaqish, Bahjat F.; Yousef, Malik
    Feature selection (FS) is an effective tool in dealing with high dimensionality and reducing computational cost. Support Vector Machines-Recursive Cluster Elimination (SVM-RCE) is one of several algorithms that have been developed for FS in high-dimensional data. SVM-RCE involves a clustering step, which is originally k-means. Using various performance metrics, three alternative algorithms are evaluated in this context: k-medoids, Hierarchical Clustering (HC), and Gaussian Mixture Model (GMM). Comparisons are carried out on five publicly available gene expression datasets. The results show that k-means in SVM-RCE achieves higher classification performance than the other tested algorithms. Additionally, HC shows a similar performance to k-means. Our findings show the superiority of using k-means. This study can contribute to the development of SVM-RCE with different variations, leading to a decrease in the number of selected genes and an increase in prediction performance.
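
The SVM-RCE abstracts listed above describe scoring each gene cluster by the AUC of a classifier and then eliminating low-scoring clusters. The following is a minimal, standard-library-only sketch of that score-and-eliminate loop, not the authors' implementation. It rests on several assumptions: a nearest-centroid rule stands in for the SVM, clusters are supplied up front instead of being re-computed with k-means in each round, a single train/test split replaces cross-validation, and every name (`auc`, `centroid_scorer`, `recursive_cluster_elimination`, `keep_fraction`) and all of the toy data are hypothetical.

```python
from statistics import mean

def auc(scores, labels):
    """Area under the ROC curve, computed by pairwise comparison (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def centroid_scorer(X_train, y_train, X_test, y_test, features):
    """Assumed stand-in for the SVM: rank test samples by relative centroid distance."""
    def centroid(label):
        rows = [x for x, l in zip(X_train, y_train) if l == label]
        return [mean(r[f] for r in rows) for f in features]

    def dist2(x, c):
        return sum((x[f] - cf) ** 2 for f, cf in zip(features, c))

    c0, c1 = centroid(0), centroid(1)
    # A higher score means the sample lies closer to the class-1 centroid.
    scores = [dist2(x, c0) - dist2(x, c1) for x in X_test]
    return auc(scores, y_test)

def recursive_cluster_elimination(clusters, score_fn, keep_fraction=0.5, min_clusters=1):
    """Score every cluster, keep the best fraction, repeat until min_clusters remain."""
    clusters = dict(clusters)
    history = []
    while True:
        scored = sorted(((score_fn(f), name) for name, f in clusters.items()), reverse=True)
        history.append(scored)
        if len(clusters) <= min_clusters:
            break
        # Always drop at least one cluster so the loop terminates.
        n_keep = max(min_clusters, min(len(scored) - 1, int(len(scored) * keep_fraction)))
        clusters = {name: clusters[name] for _, name in scored[:n_keep]}
    return clusters, history

# Toy data: features 0-1 separate the classes, features 2-3 carry no class signal.
X_train = [[1.0, 1.2, 5.0, 3.0], [0.9, 1.0, 3.0, 5.0],
           [3.0, 3.1, 5.0, 3.0], [3.2, 2.9, 3.0, 5.0]]
y_train = [0, 0, 1, 1]
X_test = [[1.1, 0.9, 3.0, 3.0], [2.9, 3.0, 5.0, 5.0],
          [1.0, 1.1, 5.0, 5.0], [3.1, 3.2, 3.0, 3.0]]
y_test = [0, 1, 0, 1]

def score_fn(features):
    return centroid_scorer(X_train, y_train, X_test, y_test, features)

survivors, history = recursive_cluster_elimination(
    {"informative": [0, 1], "noise": [2, 3]}, score_fn)
print(survivors)   # clusters remaining after elimination
print(history[0])  # (AUC, cluster name) pairs from the first round
```

Because the scorer is passed in as a plain function, a different classifier or clustering scheme could be substituted without touching the elimination loop, which mirrors the kind of comparison the abstracts describe.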