Scopus Indexed Publications Collection

Permanent URI for this collection: https://hdl.handle.net/20.500.12573/395

Search Results

Now showing 1 - 4 of 4
  • Conference Object
    Citation - Scopus: 1
    The Identification of Discriminative Single Nucleotide Polymorphism Sets for the Classification of Behçet's Disease
    (Institute of Electrical and Electronics Engineers Inc., 2018-09) Görmez, Yasin; Işik, Yunus Emre; Bakir-Güngör, Burcu
    Behçet's disease is a long-term multisystem inflammatory disorder characterized by recurrent attacks affecting several organs. As advances in genomic technologies made genotyping individuals cheaper and easier, genome-wide association studies (GWAS) emerged. By studying large case-control groups for a specific disease, these studies identify potential genetic variations, namely single nucleotide polymorphisms (SNPs). Although GWAS scanning around a million SNPs have identified several genetic risk factors for Behçet's disease, these variations explain only up to 20% of the disease's genetic risk. In this study, for Behçet's disease classification, all SNPs genotyped in GWAS are compared with SNPs selected using genetic knowledge, gain ratio and information gain, with the aim of both reducing the feature size and improving the classification accuracy. The effects of different classification algorithms, such as random forest, k-nearest neighbour and logistic regression, on the classification accuracy are also investigated. Our results showed that, compared to the other feature selection methods, selecting SNPs using genetic information (their GWAS p-values, which indicate the significance of the SNP against the disease) achieves a success rate of at least 81% and provides a 15% to 42% improvement in all classification algorithms. This improvement is statistically sound. While the gain ratio and information gain feature selection techniques yield similar classification accuracies, the models using all SNPs could not exceed 50% accuracy and result in the worst performance. © 2019 Elsevier B.V., All rights reserved.
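The abstract above names information gain and gain ratio as two of the feature selection criteria. As a rough illustration only, the sketch below scores discrete genotype features against case/control labels using the textbook definitions of both measures. The SNP names, the 0/1/2 genotype coding and the toy data are assumptions made for this example; they are not taken from the study.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on the feature's values."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        conditional += len(subset) / total * entropy(subset)
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0


# Toy data (hypothetical): genotypes coded 0/1/2, labels 1 = case, 0 = control.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
snps = {
    "snp_a": [2, 2, 2, 2, 0, 0, 0, 0],  # separates cases from controls perfectly
    "snp_b": [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the label
}

# Score every SNP, then rank from most to least informative.
scores = {name: information_gain(genotypes, labels) for name, genotypes in snps.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

Keeping only the top of `ranked` is one simple way to shrink the feature set before training a classifier. Gain ratio differs from information gain only in the normalisation step, which penalises features that take many distinct values.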
  • Conference Object
    Citation - Scopus: 5
    Identifying Taxonomic Biomarkers of Colorectal Cancer in Human Intestinal Microbiota Using Multiple Feature Selection Methods
    (Institute of Electrical and Electronics Engineers Inc., 2022-09-07) Jabeer, Amhar; Kocak, Aysegul; Akkaş, Huseyin; Yenisert, Ferhan; Nalbantoğlu, Özkan Ufuk; Yousef, Malik; Bakir-Güngör, Burcu
    A variety of bacterial species, collectively called the gut microbiota, work together to maintain a steady intestinal environment. The gastrointestinal tract contains a tremendous number of different species, including archaea, bacteria, fungi, and viruses. While these organisms are crucial immune system stabilizers, dysbiosis of the intestinal flora has been related to gastrointestinal disorders including colorectal cancer (CRC), intestinal cancer, irritable bowel syndrome and inflammatory bowel disease. In the last decade, next-generation sequencing (NGS) methods have accelerated the identification of human gut flora. CRC is a deadly condition that has been on the rise in the last century, affecting half a million people each year. Since early CRC diagnosis is critical for effective treatment, there is an immediate need for a classification system that can expedite CRC diagnosis. In this study, by analyzing the available metagenomics data on CRC, we aim to facilitate CRC diagnosis by finding biomarkers linked with CRC and by building a classification model. We obtained the metagenomic sequencing data of healthy individuals and CRC patients from a metagenome-wide association analysis and classified this data according to the disease stages. Conditional Mutual Information Maximization (CMIM), Fast Correlation Based Filter (FCBF), Extreme Gradient Boosting (XGBoost), min redundancy max relevance (mRMR), Information Gain (IG) and Select K Best (SKB) feature selection algorithms were utilized to cope with the complexity of the features. We observed that the SKB, IG, and XGBoost techniques contributed significantly to reducing the microbiota used for CRC diagnosis, thereby reducing cost and time. We found that our Random Forest classifier outperformed the Adaboost, Support Vector Machine, Decision Tree, Logitboost and stacking ensemble classifiers in terms of CRC classification performance. Our results reiterated some known and some potential microbiome-associated mechanisms in CRC, which could aid the design of new diagnostics based on the microbiome. © 2022 Elsevier B.V., All rights reserved.
  • Conference Object
    Population Specific Classification of Colorectal Cancer With Meta-Analysis of Metagenomic Data
    (Institute of Electrical and Electronics Engineers Inc., 2023-10-11) Temiz, Mustafa; Yousef, Malik; Bakir-Güngör, Burcu
    Advances in next-generation sequencing and '-omics' technologies make it possible to characterize the human gut microbiome. While some of these microorganisms are important regulators of our immune system, modulation of the microbiota leads to a variety of diseases. Colorectal cancer (CRC), the third most common cancer worldwide, is caused by genetic mutations, environmental conditions, and abnormalities in the gut microbiota. Using various machine learning methods and meta-analysis techniques, this study aims to build a classification model that can help in CRC diagnosis by analyzing metagenomic datasets of different populations obtained at the species level. Using 9 metagenomic datasets from 8 countries, 3 meta-analyses are performed: within-population, cross-population, and leave-one-dataset-out (LODO), in which one population is selected for testing and the rest are used as the training dataset. For CRC classification, 4 classification algorithms (Random Forest (RF), Logitboost, Adaboost, and Decision Tree (DT)) are used. The best performance among these methods was obtained with the Random Forest algorithm, with an AUC of 0.98, by using JP for the training data set and JPN populations for the test data set in the cross-population performance evaluation. © 2023 Elsevier B.V., All rights reserved.
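The LODO scheme described above (one population held out for testing, the rest pooled for training) can be sketched as a simple split generator. This is a minimal illustration, not the authors' pipeline: the cohort names and sample identifiers are hypothetical placeholders, and classifier training is left out.

```python
def leave_one_dataset_out(datasets):
    """Yield (held_out_name, train_samples, test_samples) once per dataset."""
    for held_out in datasets:
        # Pool the samples of every dataset except the held-out one.
        train = [
            sample
            for name, samples in datasets.items()
            if name != held_out
            for sample in samples
        ]
        test = list(datasets[held_out])
        yield held_out, train, test


# Hypothetical cohorts; real use would map each name to its feature rows.
datasets = {
    "cohort_a": ["a1", "a2", "a3"],
    "cohort_b": ["b1", "b2"],
    "cohort_c": ["c1", "c2", "c3", "c4"],
}

splits = list(leave_one_dataset_out(datasets))
```

Each element of `splits` gives one train/test round, so a model is fitted once per held-out cohort and the per-cohort scores can then be compared or averaged.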
  • Conference Object
    Citation - Scopus: 1
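    Koroner Arter Hastalığı Tanısı İçin Alan Bilgisi İçeren Topluluk Öznitelik Seçim Yöntemi [An Ensemble Feature Selection Method Incorporating Domain Knowledge for Coronary Artery Disease Diagnosis]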
    (Institute of Electrical and Electronics Engineers Inc., 2020-10-05) Kolukisa, Burak; Güngör, Vehbi Çağrı; Bakir-Güngör, Burcu
    Coronary Artery Disease (CAD) is a condition in which the heart is not adequately nourished because fatty matter called atheroma accumulates in the walls of the arteries. In 2016, CAD accounted for 31% (17.9 million) of the world's total deaths, and its diagnosis is difficult. It is estimated that approximately 23.6 million people will die from this disease in 2030. With the development of machine learning and data mining techniques, it might be possible to diagnose CAD inexpensively and easily by examining some physical and biochemical values. In this study, a novel ensemble feature selection methodology that incorporates domain knowledge is proposed for the CAD classification problem. The proposed methodology is applied to the UCI Cleveland CAD dataset, and performance metrics are compared across different classification algorithms. In our experiments, when a Multilayer Perceptron classifier is used with 9 selected features, the proposed solution reached 85.47% accuracy, 82.96% accuracy and 0.839 F-Measure. As future work, we aim to generate a machine learning model that can quickly diagnose CAD on real-time data in hospitals. © 2021 Elsevier B.V., All rights reserved.
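The abstract does not say how the ensemble combines its feature selectors or how domain knowledge enters. Purely as an illustration of the general idea, the sketch below assumes a Borda-count aggregation of several feature rankings, with features flagged by domain knowledge always kept first. The function name `ensemble_select`, the rankings and the choice of domain feature are all hypothetical and are not the authors' method.

```python
from collections import defaultdict


def ensemble_select(rankings, domain_features, k):
    """Pick k features: domain-knowledge features first, then by Borda score."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, feature in enumerate(ranking):
            scores[feature] += n - position  # top-ranked feature earns the most points

    # Assumption: domain-knowledge features are always kept (deduplicated, capped at k).
    selected = list(dict.fromkeys(domain_features))[:k]

    # Fill the remaining slots by descending score; ties are broken alphabetically.
    for feature in sorted(scores, key=lambda f: (-scores[f], f)):
        if len(selected) >= k:
            break
        if feature not in selected:
            selected.append(feature)
    return selected


# Hypothetical rankings from three feature selectors (best feature first).
rankings = [
    ["cp", "thal", "ca", "oldpeak", "chol"],
    ["thal", "cp", "oldpeak", "ca", "chol"],
    ["ca", "cp", "thal", "chol", "oldpeak"],
]

selected = ensemble_select(rankings, domain_features=["chol"], k=3)
```

In this toy run "chol" is kept despite its low aggregate score because it is marked as domain knowledge, which is the kind of override a domain-aware selector allows.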