Browsing by Author "Görmez, Yasin"

  • Conference Object
    Citation - Scopus: 1
    The Identification of Discriminative Single Nucleotide Polymorphism Sets for the Classification of Behçet's Disease
    (Institute of Electrical and Electronics Engineers Inc., 2018) Görmez, Yasin; Işik, Yunus Emre; Bakir-Güngör, Burcu
    Behçet's disease is a long-term multisystem inflammatory disorder characterized by recurrent attacks affecting several organs. As genotyping individuals became cheaper and easier following developments in genomic technologies, genome-wide association studies (GWAS) emerged. In these studies, large case-control groups are examined for a specific disease to identify potential genetic variations, namely single nucleotide polymorphisms (SNPs). Although GWAS that scan around a million SNPs have identified several genetic risk factors for Behçet's disease, these variations explain only up to 20% of the disease's genetic risk. This study compares, for Behçet's disease classification, all the SNPs genotyped in GWAS with SNPs selected using genetic knowledge, gain ratio and information gain, aiming both to reduce the feature size and to improve classification accuracy. It also investigates how different classification algorithms, such as random forest, k-nearest neighbour and logistic regression, affect classification accuracy. Our results showed that, compared to the other feature selection methods, selecting SNPs using genetic information (their GWAS p-values, which indicate the significance of a SNP for the disease) achieves a success rate of at least 81% and provides a 15% to 42% improvement in all classification algorithms. This improvement is statistically sound. While the gain ratio and information gain feature selection techniques yield similar classification accuracies, the models using all SNPs could not exceed 50% accuracy and gave the worst performance. © 2019 Elsevier B.V., All rights reserved.
  • Conference Object
    Citation - WoS: 3
    Citation - Scopus: 12
    NSEM: Duygu Analizi için Özgün Yıǧınlanmiş Topluluk Yöntemi [NSEM: A Novel Stacked Ensemble Method for Sentiment Analysis]
    (Institute of Electrical and Electronics Engineers Inc., 2019) Işik, Yunus Emre; Görmez, Yasin; Kaynar, Oǧuz; Aydin, Zafer
    Today, people often share their ideas, opinions and feelings through forums, social media sites, blogs and similar platforms, so access to these data has become very easy. The growing number of posts makes it possible to analyze and use these data for marketing and politics. However, because of the volume of data, this analysis cannot be done by humans. Sentiment analysis methods automatically determine what type of emotion a text contains. In these methods, the text is represented as a mathematical vector and classified by machine learning methods. Ensemble methods are among the most important classifiers used in sentiment analysis; in them, one classifier's errors are corrected by another classifier. In sentiment analysis, the feature vector that describes the text is as important as the classifier, and feature vectors obtained using different methods can make mistakes in different places. For this reason, this study proposes NSEM, a new ensemble method for sentiment analysis that uses 2 different classifiers and 2 different feature extraction methods. In the analysis, the proposed method was the most successful, with an accuracy rate of 79.1%. © 2019 Elsevier B.V., All rights reserved.
  • Master Thesis
    Protein İkincil Yapı Tahmini için Boyut Küçültme [Dimensionality Reduction for Protein Secondary Structure Prediction]
    (Abdullah Gül Üniversitesi, Fen Bilimleri Enstitüsü, 2017) Görmez, Yasin; Aydın, Zafer; Kaynar, Oğuz
    Proteins, which carry out essential metabolic processes, are of great importance to human life. There is a close relationship between the functions of proteins and their three-dimensional structures. Proteins have four levels of structure; for most of them the amino acid sequence, also called the primary structure, is known, but the tertiary structure is not. Because determining tertiary structures in the laboratory is very costly and difficult, systems that predict structure from the amino acid sequence have been developed. One of the most important steps in protein structure prediction systems is the identification of secondary structure labels. As new feature extraction approaches are developed, the datasets used to predict structural properties can become high-dimensional, and some of the features used can contain noisy data. Selecting an appropriate number of the right features is therefore an important step toward a good success rate. In this study, dimensionality reduction was applied to two different datasets using a deep autoencoder, and the results were compared with datasets obtained using various feature selection and dimensionality reduction techniques, such as principal component analysis, chi-square, information gain, gain ratio, correlation-based feature selection and the minimum redundancy maximum relevance algorithm, combined with various search strategies such as the genetic algorithm, the greedy algorithm and the best-first algorithm. A support vector machine was used to compare secondary structure prediction performance.
  • Conference Object
    Citation - Scopus: 3
    Protein İkincil Yapı Tahmini Için Makine Öǧrenmesi Yöntemlerinin Karşılaştırılması [Comparison of Machine Learning Methods for Protein Secondary Structure Prediction]
    (Institute of Electrical and Electronics Engineers Inc., 2018) Aydin, Zafer; Kaynar, Oǧuz; Görmez, Yasin; Işik, Yunus Emre
    Three-dimensional structure prediction is one of the important problems in bioinformatics and theoretical chemistry. One of the most important steps in three-dimensional structure prediction is the estimation of secondary structure. Because of rapidly growing databases and recent feature extraction methods, the datasets used for predicting secondary structure can contain a large number of samples and dimensions. For this reason, it is important to use algorithms that are fast and accurate. In this study, various classification algorithms were optimized for the second phase of a two-stage classifier on the EVAset benchmark, both in the original input space and in the space reduced using the information gain metric. The support vector machine was the most accurate classifier, while the extreme learning machine was significantly faster in model training. © 2018 Elsevier B.V., All rights reserved.
  • Conference Object
    Citation - Scopus: 1
    Protein İkincil Yapı Tahmini için NR ve UniClust Veri Tabanlarının Karşılaştırılması [Comparison of NR and UniClust Databases for Protein Secondary Structure Prediction]
    (Institute of Electrical and Electronics Engineers Inc., 2018) Aydin, Zafer; Kaynar, Oǧuz; Görmez, Yasin
    Three-dimensional structure prediction is one of the important problems in bioinformatics and theoretical chemistry. One of the most important steps in three-dimensional structure prediction is the estimation of secondary structure. Improving the accuracy of protein secondary structure prediction depends on the computed attributes as well as on the classification algorithms. In multiple alignment methods, which are often used to extract attributes, the calculated values differ according to the database used for the alignment. For this reason, it is important to choose a suitable database against which the target proteins are aligned to compute profile feature vectors. In this study, 5 different datasets are generated for the CB513 benchmark with the aid of two different alignment methods and three different databases. The profile features are fed as input to a two-stage hybrid classifier. According to the experimental results, the highest accuracy is obtained when the UniClust database is used at the first stage of HHBlits alignment to calculate PSSM values and the NR database is used at the first stage of HHBlits alignment to calculate structural profile matrices. © 2018 Elsevier B.V., All rights reserved.
  • Doctoral Thesis
    Protein Yapı Tahmini için Derin Öğrenme Modellerinin Geliştirilmesi [Development of Deep Learning Models for Protein Structure Prediction]
    (Abdullah Gül Üniversitesi, Fen Bilimleri Enstitüsü, 2022) Görmez, Yasin; Aydın, Zafer
    The three-dimensional structure of a protein provides important clues about the function of that protein. Although there have been many studies on protein structure prediction, the problem has still not been solved completely. Because it is very difficult to predict the three-dimensional structure of a protein directly, structural properties such as secondary structure, solvent accessibility and torsion angles are predicted first and later used as inputs to more elaborate structure estimation tasks. In this thesis, novel deep learning models have been developed using convolutional neural networks (CNN), graph convolutional networks (GCN) and long short-term memory (LSTM) recurrent neural networks to predict the secondary structure, solvent accessibility and torsion angles of proteins. A rich feature set formed from PSI-BLAST, HHBlits, physicochemical properties, structural profile matrices, AA index values and graphs representing the relationships between amino acids was used as input to the models. In the first study, a deep learning model was developed using CNN and GCN layers for secondary structure prediction. In the second study, LSTM layers were added to the first model, which was extended to predict solvent accessibility and torsion angles as well, using a multi-task learning approach. In both studies, graphs were generated using neighborhood relations between amino acids. In the last study, a novel U-net-based model was designed for secondary structure prediction using CNN, GCN and LSTM layers. The graph matrices used as input to the GCN layers were obtained from protein contact map prediction. All models were trained, optimized and tested on benchmark data sets. Improvements were obtained in accuracy as compared to the state-of-the-art
  • Book Part
    Citation - Scopus: 3
    ROSE: A Novel Approach for Protein Secondary Structure Prediction
    (Springer Science and Business Media Deutschland GmbH, 2021) Görmez, Yasin; Aydin, Zafer
    The three-dimensional structure of a protein gives important information about the protein's function. Since finding the structure of a protein by experimental methods is time-consuming and costly, estimating three-dimensional protein structures through computational methods has been an efficient alternative. One of the most important steps in 3-D protein structure prediction is protein secondary structure prediction. Proteins that contain different numbers and sequences of amino acids may have similar structures, so extracting appropriate input features is crucial for secondary structure prediction. In this study, a novel model, ROSE, is proposed for secondary structure prediction; it obtains probability distributions as a feature vector by using two position-specific scoring matrices obtained with PSI-BLAST and HHblits. ROSE is a two-stage hybrid classifier that uses a one-dimensional bi-directional recurrent neural network at the first stage and a support vector machine at the second stage. It is also combined with the DSPRED method, which employs dynamic Bayesian networks and a support vector machine. ROSE obtained results comparable to DSPRED in cross-validation experiments performed on a difficult benchmark and can be used as an alternative method for protein secondary structure prediction. © 2021 Elsevier B.V., All rights reserved.
  • Article
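    Sentiment Analizinde Öznitelik Düşürme Yöntemlerinin Oto Kodlayıcılı Derin Öğrenme Makinaları ile Karşılaştırılması [Comparison of Feature Reduction Methods with Autoencoder-Based Deep Learning Machines in Sentiment Analysis]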
    (Gazi Üniversitesi, 2017) Kaynar, Oğuz; Aydın, Zafer; Görmez, Yasin
    Because the internet is now used so intensively by every segment of society, people have begun to share their views, ideas and feelings through social networking sites, forums, blogs and many similar platforms. However, the number and size of the data, which grow every day, make manually extracting meaningful information from them very laborious and expensive. Sentiment analysis is used to determine automatically whether data contain sentiment and whether that sentiment is positive, negative or neutral. In sentiment analysis, the complexity of spoken language and the large number and length of the texts evaluated lead to feature vectors containing many unnecessary and noisy features. This situation, known as the dimensionality problem, increases computation time and causes classification errors. In this study, as a solution to these problems, a deep learning-based autoencoder model and a denoising autoencoder model were used as dimensionality reduction techniques and compared with other dimensionality reduction techniques widely used in the literature. For all of the resulting datasets, different models were developed that use support vector machines and artificial neural networks as classification algorithms. The analyses showed that dimensionality reduction techniques improve the results obtained for sentiment analysis, and that the proposed autoencoder models achieve results similar to or better than those of existing techniques.
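
Several of the abstracts above mention information gain and gain ratio as feature selection criteria. As a rough illustration only, and not the implementation used in any of the listed works, the two scores can be sketched for discrete features as follows. The function names, the 0/1/2 genotype coding and the toy data are assumptions made for this example.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels inside each feature value.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(feature)
    if split_info == 0:  # constant feature: nothing to split on
        return 0.0
    return information_gain(feature, labels) / split_info


def rank_features(features, labels, score=information_gain):
    """Return feature names sorted from highest to lowest score."""
    return sorted(features, key=lambda name: score(features[name], labels),
                  reverse=True)


# Hypothetical toy data: three features coded 0/1/2, four samples.
labels = [1, 1, 0, 0]
features = {
    "snp_a": [0, 0, 2, 2],  # separates the classes perfectly
    "snp_b": [0, 1, 0, 1],  # carries no information about the label
    "snp_c": [1, 1, 1, 1],  # constant
}
ranking = rank_features(features, labels)
```

Here `ranking` places `snp_a` first, since it is the only feature whose values separate the two classes. Gain ratio divides by the feature's own entropy, which penalises features with many distinct values; real pipelines usually rely on a tested library rather than a hand-written scorer like this one.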