WoS Indexed Publications Collection

Permanent URI for this collection: https://hdl.handle.net/20.500.12573/394

Search Results

Now showing 1 - 7 of 7
  • Conference Object
    Enhancing Complex Disease Group Scoring with Mirgedinet: A Multi-Algorithm Machine Learning Framework Based on the GSM Approach
    (IEEE, 2025-06-25) Qumsiyeh, Emma; Bakir-Gungor, Burcu; Yousef, Malik
    Integrating biological prior knowledge on disease-gene associations has shown significant promise in discovering new biomarkers with potential translational applications. This work investigates the application of a multi-algorithm machine learning framework based on the Grouping-Scoring-Modeling (G-S-M) approach for improving the prediction of complex diseases. The study identifies the primary gene and miRNA interactions in various complex diseases with the help of miRGediNET, a machine-learning-based tool that integrates data from three biological databases. Whereas traditional methods treat features as independent, the G-S-M method aggregates genes into groups based on biological interactions, scores each gene group for a disease, and models the predictive capability of the groups using advanced machine learning algorithms. In this study, seven algorithms, including Support Vector Machine, Decision Tree, and CatBoost, were applied to eight datasets extracted from the GEO database. The framework proved robust in ranking gene clusters, and thereby in predicting critical biomarkers, under 100-fold randomized cross-validation. The results indicate this approach's high potential for refining disease prediction and for supporting research on choosing the best algorithm to provide biological insights and computational advances.
  • Conference Object
    Citation - WoS: 16
    Citation - Scopus: 20
    Machine Learning Analysis of Inflammatory Bowel Disease-Associated Metagenomics Dataset
    (Institute of Electrical and Electronics Engineers Inc., 2018-09) Hacilar, Hilal; Nalbantoğlu, Özkan Ufuk; Bakir-Güngör, Burcu
    There is an ongoing interplay between humans and our microbial communities. The microorganisms living in our gut produce energy from our food, strengthen our immune system, break down foreign products, and release metabolites and hormones, which are significant for regulating our physiology. Shifts away from this 'healthy' gut microbiome are considered to be associated with many diseases. Inflammatory bowel diseases (IBD), including Crohn's disease and ulcerative colitis, are gut-related disorders affecting the intestinal tract. Although some metagenomics studies have recently been conducted on IBD, our current understanding of the precise relationships between the human gut microbiome and IBD remains limited. In this regard, state-of-the-art machine learning approaches have become popular for addressing questions such as the early diagnosis of certain diseases using human microbiota. In this study, we investigate which subsets of the gut microbiota are most strongly associated with IBD, and whether disease-associated biomarkers can be detected by applying state-of-the-art machine learning algorithms and proper feature selection methods.
  • Conference Object
    Citation - WoS: 2
    Citation - Scopus: 1
    Feature Selection for Protein Dihedral Angle Prediction
    (IEEE, 2017) Aydin, Zafer; Kaynar, Oguz; Gormez, Yasin
    Three-dimensional structure prediction is of crucial importance for bioinformatics and theoretical chemistry. One of the main steps of three-dimensional structure prediction is dihedral (torsion) angle prediction. As new feature extraction methods are developed, the dimension of the input space increases considerably, yielding longer model training and less accurate models due to noisy or redundant features. In this study, feature selection is employed for dimensionality reduction on one of the established benchmarks of protein 1D structure prediction. Experimental results show that feature selection improves the accuracy of protein dihedral angle class prediction by 2% and can eliminate up to 82% of the features when a random forest classifier is used. Accurate prediction of dihedral angles will eventually contribute to protein structure prediction.
  • Conference Object
    Citation - WoS: 22
    Citation - Scopus: 52
    Evaluation of Classification Algorithms, Linear Discriminant Analysis and a New Hybrid Feature Selection Methodology for the Diagnosis of Coronary Artery Disease
    (Institute of Electrical and Electronics Engineers Inc., 2018-12) Kolukisa, Burak; Hacilar, Hilal; Göy, Gökhan; Kus, Mustafa; Bakir-Güngör, Burcu; Aral, Atilla; Güngör, Vehbi Çağrı
    According to the World Health Organization (WHO), 31% of the world's total deaths in 2016 (17.9 million) were due to cardiovascular diseases (CVD). With the development of information technologies, it has become possible to predict, at a lower cost, whether people have heart disease by checking certain physical and biochemical values. In this study, we have evaluated a set of different classification algorithms and linear discriminant analysis, and proposed a new hybrid feature selection methodology for the diagnosis of coronary heart diseases (CHD). Throughout this research effort, using three publicly available Heart Disease diagnosis datasets (UCI Machine Learning Repository), we have conducted comparative performance evaluations in terms of accuracy, sensitivity, specificity, F-measure, AUC and running time.
  • Conference Object
    Citation - WoS: 1
    Citation - Scopus: 1
    The Identification of Discriminative Single Nucleotide Polymorphism Sets for the Classification of Behcet's Disease
    (IEEE, 2018-09) Gormez, Yasin; Isik, Yunus Emre; Bakir-Gungor, Burcu
    Behcet's disease is a long-term multisystem inflammatory disorder, characterized by recurrent attacks affecting several organs. As genotyping individuals became cheaper and easier following developments in genomic technologies, genome-wide association studies (GWAS) emerged. In these studies, large case-control groups for a specific disease are examined to identify potential genetic variations, namely single nucleotide polymorphisms (SNPs). Although several genetic risk factors have been identified for Behcet's disease by scanning around a million SNPs in such studies, these variations could only explain up to 20% of the disease's genetic risk. In this study, for Behcet's disease classification, we compare all the SNPs genotyped in GWAS with the SNPs selected using genetic knowledge, gain ratio and information gain, aiming both to reduce the feature size and to improve the classification accuracy. We also investigate the effects of different classification algorithms, such as random forest, k-nearest neighbour and logistic regression, on the classification accuracy. Our results showed that, compared to the other feature selection methods, selecting the SNPs using genetic information (their GWAS p-values, indicating the significance of the SNP against the disease) achieves a success rate of at least 81% and provides a 15% to 42% improvement in all classification algorithms. This improvement is statistically significant. While the gain ratio and information gain feature selection techniques yield similar classification accuracies, the models using all SNPs could not exceed 50% accuracy and resulted in the worst performance.
  • Conference Object
    Citation - WoS: 1
    Citation - Scopus: 5
    Credit Risk Analysis Based on Hybrid Classification: Case Studies on German and Turkish Credit Datasets
    (IEEE, 2018-05) Cetiner, Erkan; Kocak, Taskin; Gungor, V. Cagri
    In the finance sector, credit risk analysis plays a major role in the decision process. Banks and finance institutions gather large amounts of raw data from their customers. Data mining techniques can be employed to obtain useful information from this raw data. Several data mining techniques, such as support vector machines (SVM), neural networks and naive Bayes, have already been used to classify customers. In this paper, we propose hybrid classification approaches, which combine several classifiers and ensemble learners to boost classification accuracy. Furthermore, we compare the performance of these approaches with respect to their classification accuracy. We work with two diverse datasets, namely the German credit dataset and a Turkish bank dataset. The goal of using such diverse datasets is to show the generalization capability of our approaches. The experimental results lead to three important conclusions. First, the feature selection stage has a major role in both result accuracy and computational complexity. Second, hybrid approaches generalize better than single classifiers. Third, using SVM with a Radial Basis Function (RBF) kernel as the base classifier and as a hybrid model member gives the best accuracy and type-1 accuracy results among the alternatives.
  • Conference Object
    Citation - Scopus: 3
    Protein İkincil Yapı Tahmini İçin Makine Öğrenmesi Yöntemlerinin Karşılaştırılması (Comparison of Machine Learning Methods for Protein Secondary Structure Prediction)
    (Institute of Electrical and Electronics Engineers Inc., 2018-05) Aydin, Zafer; Kaynar, Oğuz; Görmez, Yasin; Işik, Yunus Emre
    Three-dimensional structure prediction is one of the important problems in bioinformatics and theoretical chemistry. One of the most important steps in three-dimensional structure prediction is the estimation of secondary structure. Due to rapidly growing databases and recent feature extraction methods, datasets used for predicting secondary structure can potentially contain a large number of samples and dimensions. For this reason, it is important to use algorithms that are fast and accurate. In this study, various classification algorithms have been optimized for the second phase of a two-stage classifier on the EVAset benchmark, both in the original input space and in the space reduced using the information gain metric. The support vector machine is found to be the most accurate classifier, while the extreme learning machine is significantly faster in model training.
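
Several of the abstracts above reduce the input space by ranking features with information gain before training a classifier. The sketch below illustrates that ranking step only, and only for discrete features. It is not the code used in any of the listed papers: the function names (`entropy`, `information_gain`, `select_top_k`), the toy rows and the "helix"/"strand" labels are all hypothetical and chosen for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the expected label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_top_k(rows, labels, k):
    """Return the indices of the k highest-gain features, plus the gain of every feature."""
    n_features = len(rows[0])
    gains = [information_gain([row[j] for row in rows], labels) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: gains[j], reverse=True)
    return ranked[:k], gains


# Hypothetical toy data: feature 0 determines the label,
# feature 1 is uninformative, feature 2 is constant.
rows = [
    (0, 0, 1),
    (0, 1, 1),
    (1, 0, 1),
    (1, 1, 1),
]
labels = ["helix", "helix", "strand", "strand"]

selected, gains = select_top_k(rows, labels, k=1)
print("selected feature indices:", selected)
print("information gain per feature:", gains)
```

In practice the retained columns would then be passed to the classifiers being compared. Continuous features would first need discretization or a different estimator, which this sketch does not cover.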