Scopus-Indexed Publications Collection

Permanent URI for this collection: https://hdl.handle.net/20.500.12573/395

Citation - WoS: 1
Citation - Scopus: 1
Prediction of Type 2 Diabetes Using Metagenomic Data and Identification of Taxonomic Biomarkers
(IEEE, 2024-05-15) Temiz, Mustafa; Kuzudisli, Cihan; Yousef, Malik; Bakir-Gungor, Burcu
Omics data on diseases are now generated at many molecular levels, and analyzing these data with machine learning methods is a popular research topic. Among these data, metagenomic data are increasingly used to support the diagnosis, detection and treatment of diseases. Type 2 diabetes (T2D) is a chronic disease characterized by insulin resistance and progressive dysfunction of pancreatic beta cells. While the number of people with diabetes is increasing by around 8% annually, the cost of treating the disease is rising by 18% per year. Consequently, the number of studies on the diagnosis, development and progression of T2D keeps growing. The aim of this study is to achieve better classification performance with fewer metagenomic features, and thereby to reduce computational cost. We compare the performance of three different methods on T2D-related metagenomic data. First, the MetaPhlAn tool is used to determine the taxonomic species and their relative abundances in each sample. The SVM-RCE, RCE-IFE and microBiomeGSM tools used in this study perform classification by grouping and scoring features, and are known to work well on complex datasets. The best results were obtained with the RCE-IFE tool, which reached an AUC of 0.72 using 125 features on average. In addition, the key taxonomic species that these tools identified as associated with T2D are presented and compared with the literature.
Citation - WoS: 1
Citation - Scopus: 1
The Identification of Discriminative Single Nucleotide Polymorphism Sets for the Classification of Behcet's Disease
(IEEE, 2018-09) Gormez, Yasin; Isik, Yunus Emre; Bakir-Gungor, Burcu
Behcet's disease is a long-term multisystem inflammatory disorder characterized by recurrent attacks affecting several organs. As genotyping individuals became cheaper and easier with developments in genomic technologies, genome-wide association studies (GWAS) emerged. In these studies, large case-control groups for a specific disease are examined to identify potential genetic variations, namely single nucleotide polymorphisms (SNPs). Although such studies, which scan around a million SNPs, have identified several genetic risk factors for Behcet's disease, these variations explain only up to 20% of the disease's genetic risk. In this study, for Behcet's disease classification, all the SNPs genotyped in GWAS are compared with the SNPs selected using genetic knowledge, gain ratio and information gain, with the aim of both reducing the feature size and improving the classification accuracy. The effects of different classification algorithms, such as random forest, k-nearest neighbour and logistic regression, on the classification accuracy are also investigated. Our results showed that, compared to other feature selection methods, selecting the SNPs using genetic information (their GWAS p-values, which indicate the significance of the SNP for the disease) achieves a success rate of at least 81% and provides a 15% to 42% improvement in all classification algorithms. This improvement is statistically sound. While the gain ratio and information gain feature selection techniques yield similar classification accuracies, the models using all SNPs could not exceed 50% accuracy and gave the worst performance.
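The abstract above names information gain and gain ratio as two of the feature selection criteria. A minimal sketch of how these scores can be computed for a discrete feature such as a SNP genotype follows. The function names, the 0/1/2 genotype coding and the toy data are assumptions made for illustration; this is not the authors' implementation.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in Counter(values).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the labels inside each feature-value group.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature_values, labels):
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(feature_values)
    if split_info == 0:
        return 0.0  # a constant feature carries no information
    return information_gain(feature_values, labels) / split_info


# Hypothetical toy data: 1 = case, 0 = control; genotypes coded 0/1/2.
labels = [1, 1, 1, 0, 0, 0]
snp_a = [2, 2, 2, 0, 0, 0]  # separates cases from controls perfectly
snp_b = [0, 1, 0, 1, 0, 1]  # almost unrelated to the label

ranked = sorted(
    {"snp_a": snp_a, "snp_b": snp_b}.items(),
    key=lambda item: information_gain(item[1], labels),
    reverse=True,
)
print([name for name, _ in ranked])
```

Ranking features by such a score and keeping the top ones is one common way to reduce the feature size before training a classifier.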
Citation - WoS: 3
Citation - Scopus: 13
NSEM: A Novel Stacked Ensemble Method for Sentiment Analysis (NSEM: Duygu Analizi için Özgün Yıǧınlanmiş Topluluk Yöntemi)
(Institute of Electrical and Electronics Engineers Inc., 2018-09) Işik, Yunus Emre; Görmez, Yasin; Kaynar, Oǧuz; Aydin, Zafer
Today, people often share their ideas, opinions and feelings through forums, social media sites, blogs and similar platforms, so these data have become very easy to access. The growing number of posts makes it possible to analyze and use these data for marketing and politics. However, because of the sheer volume of data, this analysis cannot be done by humans. Sentiment analysis methods automatically determine which sentiment a text expresses. In these methods, the text is represented as a mathematical vector and classified by machine learning methods. Ensemble methods are among the most important classifiers used in sentiment analysis. In these methods, the errors of one classifier are corrected by another classifier. In sentiment analysis, the feature vector that describes the text is as important as the classifier, and feature vectors obtained with different methods can make mistakes in different places. For this reason, this study proposes NSEM, a new ensemble method for sentiment analysis that uses 2 different classifiers and 2 different feature extraction methods. In the analysis, the proposed method was the most successful method, with an accuracy of 79.1%. © 2019 Elsevier B.V., All rights reserved.
Citation - Scopus: 2
Customer Churn Prediction for an Internet Service Provider Using Machine Learning Techniques (Makine Öǧrenmesi Teknikleri Ile İnternet Servis Saǧlayıcısı için Müşteri Kayıp Tahmini)
(Institute of Electrical and Electronics Engineers Inc., 2020-09) Göy, Gökhan; Kolukisa, Burak; Bahçevan, Cenk Anıl; Güngör, Vehbi Çağrı
With technology developing in every field, a competitive marketing environment has arisen. In this competitive environment, analyzing customer behavior has become vital. In particular, the ease with which customers can switch service providers has become critical to a company's survival. At the same time, the financial resources spent on retaining customers are much lower than those needed to acquire new clients. In this context, the traditional methods of examining the vast amounts of data obtained today for establishing decision support systems have lost their validity. In this study, we used a dataset provided by TurkNet, an internet service provider in Turkey. Various preprocessing steps were performed on this dataset, and classification algorithms were then run. The results were obtained and compared in terms of the area under the curve. The most successful classifier was the Random Trees algorithm, with a value of 0.936. © 2020 Elsevier B.V., All rights reserved.
Citation - WoS: 8
Citation - Scopus: 9
Computer-Aided Classification of Breast Cancer Histopathological Images (Meme Kanseri Histopatolojik Görüntülerinin Bilgisayar Destekli Sınıflandırılması)
(Institute of Electrical and Electronics Engineers Inc., 2017-10) Aksebzeci, Bekir Hakan; Kayaaltı, Ömer
Breast cancer is currently one of the most common types of cancer. Early and accurate diagnosis is of great importance in the treatment of the disease. In the diagnosis of breast cancer, histopathological analysis of cell and tissue specimens taken by biopsy is considered the gold standard. Histopathological analysis is a tedious process that depends heavily on the knowledge and experience of the pathologist. This study aims to develop a computer-aided system that can reduce the workload of pathologists and help them in their diagnosis. An image set containing benign and malignant breast tumor images was studied. To perform texture analysis on the tumor images, first-order statistics, Gabor and gray-level co-occurrence matrix (GLCM) feature extraction methods were applied. Various classifiers were then applied to the resulting feature matrices and their performances were compared. The highest classification accuracy, 82.06%, was achieved by the Random Forests classifier with a combination of Gabor and GLCM features. The results presented here show that computer-assisted diagnosis of breast cancer is a promising field. © 2018 Elsevier B.V., All rights reserved.
Citation - WoS: 2
Citation - Scopus: 6
Credit Card Fraud Detection with Machine Learning Methods (Makine Öğrenmesi Yöntemleri ile Kredi Kartı Sahteciliğinin Tespiti)
(Institute of Electrical and Electronics Engineers Inc., 2019-09) Göy, Gökhan; Gezer, Cengiz; Güngör, Vehbi Çağrı
As credit card usage increases, the number of credit card transactions grows dramatically. It is difficult to identify fraudulent transactions among this vast number of transactions. Although fraud accounts for only a small share of transactions, it causes serious financial losses for individuals and organizations. Even though a large number of studies have been conducted to solve this problem, there is no generally accepted solution. In this paper, a publicly available data set is used. The class imbalance problem of the data set was addressed by combining hybrid sampling methods. Comparative performance evaluations were conducted on this data set. Unlike other studies, the Area Under the Curve (AUC) metric, which expresses success on such imbalanced data sets, was used in addition to standard performance metrics. Since it is also important to detect fraudulent credit card transactions quickly, the running time of the different methods is presented as another performance metric. © 2020 Elsevier B.V., All rights reserved.
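The abstract above uses the AUC metric because plain accuracy is misleading on imbalanced data. As a minimal sketch of why, AUC can be computed as the fraction of (positive, negative) pairs in which the positive example receives the higher score, counting ties as half. The function name and the toy labels and scores below are assumptions for illustration and do not come from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example is
    scored higher than a randomly chosen negative one (ties count 0.5).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical imbalanced toy set: 1 = fraud, 0 = legitimate.
labels = [0, 0, 0, 0, 1, 1]
model_scores = [0.1, 0.2, 0.3, 0.8, 0.7, 0.9]
always_legit = [0.0] * len(labels)  # a "model" that never flags fraud

print(roc_auc(labels, model_scores))  # 7 of 8 pairs ordered correctly
print(roc_auc(labels, always_legit))  # no ranking ability at all
```

The constant predictor is right on 4 of 6 transactions when measured by accuracy, yet its AUC is 0.5, which is the behaviour that makes AUC useful on imbalanced data. The pairwise loop is quadratic; a rank-based formulation would be preferable for large data sets.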
Citation - Scopus: 3
Comparison of Machine Learning Methods for Protein Secondary Structure Prediction (Protein İkincil Yapı Tahmini Için Makine Öǧrenmesi Yöntemlerinin Karşılaştırılması)
(Institute of Electrical and Electronics Engineers Inc., 2018-05) Aydin, Zafer; Kaynar, Oǧuz; Görmez, Yasin; Işik, Yunus Emre
Three-dimensional structure prediction is one of the important problems in bioinformatics and theoretical chemistry. One of its most important steps is the prediction of secondary structure. Due to rapidly growing databases and recent feature extraction methods, the datasets used for predicting secondary structure can contain a large number of samples and dimensions. For this reason, it is important to use algorithms that are fast and accurate. In this study, various classification algorithms were optimized for the second phase of a two-stage classifier on the EVAset benchmark, both in the original input space and in the space reduced using the information gain metric. The support vector machine was the most accurate classifier, while the extreme learning machine was significantly faster in model training. © 2018 Elsevier B.V., All rights reserved.
Citation - Scopus: 1
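An Ensemble Feature Selection Method Incorporating Domain Knowledge for the Diagnosis of Coronary Artery Disease (Koroner Arter Hastalığı Tanısı İçin Alan Bilgisi İçeren Topluluk Öznitelik Seçim Yöntemi)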
(Institute of Electrical and Electronics Engineers Inc., 2020-10-05) Kolukisa, Burak; Güngör, Vehbi Çağrı; Bakir-Güngör, Burcu
Coronary Artery Disease (CAD) is a condition in which the heart is not supplied with enough blood because fatty matter called atheroma accumulates in the walls of the arteries. In 2016, CAD accounted for 31% (17.9 million) of the world's total deaths, and its diagnosis is difficult. It is estimated that approximately 23.6 million people will die from this disease in 2030. With the development of machine learning and data mining techniques, it might be possible to diagnose CAD inexpensively and easily by examining some physical and biochemical values. In this study, a novel ensemble feature selection methodology that incorporates domain knowledge is proposed for the CAD classification problem. The proposed methodology is applied to the UCI Cleveland CAD dataset with different classification algorithms, and the performance metrics are compared. In our experiments, when the Multilayer Perceptron classifier is used with 9 selected features, our proposed solution reached 85.47% accuracy, 82.96% accuracy and 0.839 F-Measure. As future work, we aim to build a machine learning model that can quickly diagnose CAD on real-time data in hospitals. © 2021 Elsevier B.V., All rights reserved.
