Bilgisayar Mühendisliği Bölümü Koleksiyonu
Permanent URI for this collection: https://hdl.handle.net/20.500.12573/203
Browsing Bilgisayar Mühendisliği Bölümü Koleksiyonu by Author "0000-0001-7686-6298"

Article BAUM-2: a multilingual audio-visual affective face database (Kluwer Academic Publishers (SpringerLink), 2015)
Eroglu Erdem, Cigdem; Turan, Cigdem; Aydin, Zafer; 0000-0001-7686-6298; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydin, Zafer
Access to audio-visual databases that contain enough variety and are richly annotated is essential for assessing the performance of algorithms in affective computing applications, which require emotion recognition from face and/or speech data. Most databases available today have been recorded under tightly controlled environments, are mostly acted and do not contain speech data. We first present a semi-automatic method that can extract audio-visual facial video clips from movies and TV programs in any language. The method is based on automatic detection and tracking of faces in a movie until the face is occluded or a scene cut occurs. We also created a video-based database, named BAUM-2, which consists of annotated audio-visual facial clips in several languages. The collected clips simulate real-world conditions by containing various head poses, illumination conditions, accessories, temporary occlusions and subjects with a wide range of ages. The proposed semi-automatic affective clip extraction method can easily be used to extend the database to contain clips in other languages. We also created an image-based facial expression database from the peak frames of the video clips, which is named BAUM-2i.
Baseline image and video-based facial expression recognition results using state-of-the-art features and classifiers indicate that facial expression recognition under tough and close-to-natural conditions is quite challenging.

Article Comparative analysis of machine learning approaches for predicting respiratory virus infection and symptom severity (PEERJ INC, 2023)
Aydin, Zafer; Isik, Yunus Emre; 0000-0001-7686-6298; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydin, Zafer
Respiratory diseases are among the major health problems causing a burden on hospitals. Diagnosis of infection and rapid prediction of severity without time-consuming clinical tests could be beneficial in preventing the spread and progression of the disease, especially in countries where health systems lack capacity. Personalized medicine studies involving statistics and computer technologies could help to address this need. In addition to individual studies, competitions are also held, such as the Dialogue for Reverse Engineering Assessment and Methods (DREAM) challenge, a community-driven organization with a mission to research biology, bioinformatics, and biomedicine. One of these competitions was the Respiratory Viral DREAM Challenge, which aimed to develop early predictive biomarkers for respiratory virus infections. These efforts are promising; however, the prediction performance of the computational methods developed for detecting respiratory diseases still has room for improvement. In this study, we focused on improving the performance of predicting the infection and symptom severity of individuals infected with various respiratory viruses using gene expression data collected before and after exposure. The publicly available gene expression dataset in the Gene Expression Omnibus, named GSE73072, containing samples exposed to four respiratory viruses (H1N1, H3N2, human rhinovirus (HRV), and respiratory syncytial virus (RSV)), was used as input data.
Various preprocessing methods and machine learning algorithms were implemented and compared to achieve the best prediction performance. The experimental results showed that the proposed approaches obtained a prediction performance of 0.9746 area under the precision-recall curve (AUPRC) for infection (i.e., shedding) prediction (SC-1), 0.9182 AUPRC for symptom class prediction (SC-2), and 0.6733 Pearson correlation for symptom score prediction (SC-3), outperforming the best leaderboard scores of the Respiratory Viral DREAM Challenge (a 4.48% improvement for SC-1, a 13.68% improvement for SC-2, and a 13.98% improvement for SC-3). Additionally, over-representation analysis (ORA), which is a statistical method for objectively determining whether certain genes are more prevalent in pre-defined sets such as pathways, was applied using the most significant genes selected by feature selection methods. The results show that pathways associated with the ‘adaptive immune system’ and ‘immune disease’ are strongly linked to pre-infection and symptom development. These findings contribute to our knowledge about predicting respiratory infections and are expected to facilitate the development of future studies that concentrate on predicting not only infections but also the associated symptoms.

Other Comparison of Machine Learning Classifiers for Protein Secondary Structure Prediction (IEEE, 2018)
Aydin, Zafer; Kaynar, Oguz; Gormez, Yasin; Isik, Yunus Emre; 0000-0001-7686-6298; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydin, Zafer
Predicting the three-dimensional structure of proteins is one of the important problems in theoretical chemistry and bioinformatics. One of the most important steps of protein structure prediction is secondary structure prediction. As a result of the rapid growth of data in protein databases and of the various feature extraction methods developed recently, the datasets used for secondary structure prediction are growing in both dimensionality and number of samples.
It is therefore becoming important to use prediction algorithms that run fast and attain a certain level of accuracy. In this study, various classification algorithms for the second stage of a two-stage hybrid classifier were optimized on the EVAset dataset, both in the original feature space and in a space whose dimensionality was reduced using the information gain metric. According to the results, the most accurate prediction method was the support vector machine, while the fastest method in terms of model training time was the extreme learning machine.

Other Comparison of NR and UniClust Databases for Protein Secondary Structure Prediction (IEEE, 2018)
Aydin, Zafer; Kaynar, Oguz; Gormez, Yasin; 0000-0001-7686-6298; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydin, Zafer
Predicting the three-dimensional structure of proteins is one of the important problems in theoretical chemistry and bioinformatics. One of the most important steps of three-dimensional structure prediction is secondary structure prediction. Improving the accuracy of secondary structure prediction depends on the computed features as much as on the classification algorithm used. In the multiple alignment methods frequently used for feature extraction, the computed values vary with the database used for alignment. Selecting a suitable database when constructing feature matrices is therefore important. In this study, five different datasets were constructed from the CB513 dataset using two alignment methods and three databases, and these datasets were compared using a two-stage hybrid classifier.
According to the results, the best accuracy was obtained when UniClust was used for the PSSM values computed in the first stage of the HHBlits alignment method and the NR database was used, again in the first stage of HHBlits, for the structural profile matrices.

Conference Object Constructing structural profiles for protein torsion angle prediction (SciTePress, 2015)
Aydin, Zafer; Baker, David; Noble, William Stafford; 0000-0001-7686-6298; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydin, Zafer
Structural frequency profiles provide important constraints on structural aspects of a protein and are receiving growing interest in the structure prediction community. In this paper, we introduce new techniques for scoring templates that are later combined to form structural profiles of 7-state torsion angles. By employing various parameters of target-template alignments we improve the quality and accuracy of structural profiles considerably. The most effective technique is the scaling of templates by integer powers of the sequence identity score, in which the power parameter is adjusted with respect to the similarity interval of the target. Incorporating other alignment scores as multiplicative factors further improves the accuracy of profiles. After analyzing the individual strengths of various structural profile methods, we combine them with ab-initio predictions of 7-state torsion angles by a linear committee approach. We show that incorporating template information improves the accuracy of ab-initio predictions significantly at all levels of target-template similarity even when templates are distant from the target.
Template scaling methods developed in this work can be applied in many other prediction tasks and in more advanced methods designed for computing structural profiles.

Article A Deep Ensemble Approach for Long-Term Traffic Flow Prediction (SPRINGER, 2024)
Cini, Nevin; Aydin, Zafer; 0000-0001-5348-4043; 0000-0001-7686-6298; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Cini, Nevin; Aydin, Zafer
In the last 50 years, with the growth of cities and the increase in the number of vehicles and mobility, traffic has become troublesome. As a result, traffic flow prediction started to attract attention as an important research area. However, despite the extensive literature, traffic flow prediction remains an open research problem, specifically for long-term traffic flow prediction. Compared to the models developed for short-term traffic flow prediction, the number of models developed for long-term traffic flow prediction is very small. Based on this shortcoming, in this study, we focus on long-term traffic flow prediction and propose a novel deep ensemble model (DEM). In order to build this ensemble model, first, we developed a convolutional neural network (CNN), a long short-term memory (LSTM) network and a gated recurrent unit (GRU) network as deep learning models, which formed the base learners. In the next step, we combine the output of these models according to their individual forecasting success. We use another deep learning model to determine the success of the individual models. Our proposed model is a flexible ensemble prediction model that can be updated based on traffic data. To evaluate the performance of the proposed model, we use a publicly available dataset. Experimental results show that the developed DEM model has a mean square error of 0.06 and a mean absolute error of 0.15 for single-step prediction, and a mean square error of 0.25 and a mean absolute error of 0.32 for multi-step prediction.
We compared our proposed model with many models in different categories: individual deep learning models (i.e., LSTM, CNN, GRU), selected traditional machine learning models (i.e., linear regression, decision tree regression, k-nearest-neighbors regression) and other ensemble models such as random-forest regression. These results also support the claim that ensemble learning models perform better than individual models.

Conference Object Design of a Tri band 5-Fingers Shaped Microstrip Patch Antenna with an Adjustable Resistor (IEEE, 2014)
Aoad, Ashrf; Aydin, Zafer; Korkmaz, Erdal; 0000-0001-7686-6298; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydin, Zafer
This paper presents a tri band 5-fingers shaped microstrip patch antenna, which initially resonates in a dual band at 3.2 GHz and 5.2 GHz for VSWR < 2. The antenna is modified by adding an adjustable resistor between the conductor and the reflecting plane, giving a third resonant frequency of 2.4 GHz. A decrease in the return loss at 2.4 GHz is observed by modifying the value of the resistance. Impedance bandwidth and the resonant frequencies are examined with respect to the variability of the parameters of the antenna and the position of the adjustable resistor. The size of the antenna has been reduced by 57.9% in length and 14.06% in width. The proposed antenna can be used for 4G, WLAN, and Wi-MAX. The antenna is designed and optimized by using the commercial CST software.

Article An effective colorectal polyp classification for histopathological images based on supervised contrastive learning (ELSEVIER, 2024)
Yengec-Tasdemir, Sena Busra; Aydin, Zafer; Akay, Ebru; Dogan, Serkan; Yilmaz, Bulent; 0000-0001-7686-6298; 0000-0003-2954-1217; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydin, Zafer; Yilmaz, Bulent
Early detection of colon adenomatous polyps is pivotal in reducing colon cancer risk.
In this context, accurately distinguishing between adenomatous polyp subtypes, especially tubular and tubulovillous, from hyperplastic variants is crucial. This study introduces a cutting-edge computer-aided diagnosis system optimized for this task. Our system employs advanced Supervised Contrastive learning to ensure precise classification of colon histopathology images. Significantly, we have integrated the Big Transfer model, which has gained prominence for its exemplary adaptability to visual tasks in medical imaging. Our novel approach discerns between in-class and out-of-class images, thereby elevating its discriminatory power for polyp subtypes. We validated our system using two datasets: a specially curated one and the publicly accessible UniToPatho dataset. The results reveal that our model markedly surpasses traditional deep convolutional neural networks, registering classification accuracies of 87.1% and 70.3% for the custom and UniToPatho datasets, respectively. Such results emphasize the transformative potential of our model in polyp classification endeavors.

Article An effective colorectal polyp classification for histopathological images based on supervised contrastive learning (Elsevier, 2024)
Yengec-Tasdemir, Sena Busra; Aydin, Zafer; Akay, Ebru; Doğan, Serkan; Yilmaz, Bulent; 0000-0001-7686-6298; AGÜ, Fen Bilimleri Enstitüsü, Elektrik ve Bilgisayar Mühendisliği Ana Bilim Dalı; Aydın, Zafer; Yilmaz, Bulent
Early detection of colon adenomatous polyps is pivotal in reducing colon cancer risk. In this context, accurately distinguishing between adenomatous polyp subtypes, especially tubular and tubulovillous, from hyperplastic variants is crucial. This study introduces a cutting-edge computer-aided diagnosis system optimized for this task. Our system employs advanced Supervised Contrastive learning to ensure precise classification of colon histopathology images.
Significantly, we have integrated the Big Transfer model, which has gained prominence for its exemplary adaptability to visual tasks in medical imaging. Our novel approach discerns between in-class and out-of-class images, thereby elevating its discriminatory power for polyp subtypes. We validated our system using two datasets: a specially curated one and the publicly accessible UniToPatho dataset. The results reveal that our model markedly surpasses traditional deep convolutional neural networks, registering classification accuracies of 87.1% and 70.3% for the custom and UniToPatho datasets, respectively. Such results emphasize the transformative potential of our model in polyp classification endeavors.

Article IGPRED-MultiTask: A Deep Learning Model to Predict Protein Secondary Structure, Torsion Angles and Solvent Accessibility (IEEE COMPUTER SOC, 2023)
Gormez, Yasin; Aydin, Zafer; 0000-0001-7686-6298; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydin, Zafer
Protein secondary structure, solvent accessibility and torsion angle predictions are preliminary steps in predicting the 3D structure of a protein. Deep learning approaches have achieved significant improvements in predicting various features of protein structure. In this study, IGPRED-Multitask, a deep learning model with a multi-task learning architecture based on a deep inception network, a graph convolutional network and a bidirectional long short-term memory, is proposed. Moreover, hyper-parameters of the model are fine-tuned using Bayesian optimization, which is faster and more effective than grid search. The same benchmark test data sets as in the OPUS-TASS paper, including TEST2016, TEST2018, CASP12, CASP13, CASPFM, HARD68, CAMEO93, CAMEO93_HARD, as well as the train and validation sets, are used for fair comparison with the literature.
Statistically significant improvements are observed in secondary structure prediction on 4 datasets, in phi angle prediction on 2 datasets and in psi angle prediction on 3 datasets compared to the state-of-the-art methods. For solvent accessibility prediction, only the TEST2016 and TEST2018 datasets are used to assess the performance of the proposed model.

Article Improved classification of colorectal polyps on histopathological images with ensemble learning and stain normalization (ELSEVIER IRELAND, 2023)
Yengec-Tasdemir, Sena Busra; Aydin, Zafer; Akay, Ebru; Dogan, Serkan; Yilmaz, Bulent; 0000-0001-8322-4832; 0000-0001-7686-6298; 0000-0003-2954-1217; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Yengec-Tasdemir, Sena Busra; Aydin, Zafer; Yilmaz, Bulent
Background and Objective: Early detection of colon adenomatous polyps is critically important because their correct detection significantly reduces the potential of developing colon cancers in the future. The key challenge in the detection of adenomatous polyps is differentiating them from their visually similar counterpart, non-adenomatous tissues. Currently, this depends solely on the experience of the pathologist. To assist pathologists, the objective of this work is to provide a novel non-knowledge-based Clinical Decision Support System (CDSS) for improved detection of adenomatous polyps on colon histopathology images. Methods: The domain shift problem arises when the train and test data come from different distributions with diverse settings and unequal color levels. This problem, which can be tackled by stain normalization techniques, prevents machine learning models from attaining higher classification accuracies. In this work, the proposed method integrates stain normalization techniques with an ensemble of ConvNexts, competitively accurate, scalable and robust variants of CNNs. The improvement is empirically analyzed for five widely employed stain normalization techniques.
The classification performance of the proposed method is evaluated on three datasets comprising more than 10k colon histopathology images. Results: The comprehensive experiments demonstrate that the proposed method outperforms the state-of-the-art deep convolutional neural network based models by attaining 95% classification accuracy on the curated dataset, and 91.1% and 90% on the EBHI and UniToPatho public datasets, respectively. Conclusions: These results show that the proposed method can accurately classify colon adenomatous polyps on histopathology images. It retains remarkable performance scores even for different datasets coming from different distributions. This indicates that the model has a notable generalization ability. (c) 2023 Elsevier B.V. All rights reserved.

Article Knowledge based response correction method for design of reconfigurable N-shaped microstrip patch antenna using inverse ANNs (WILEY, 2017)
Aoad, Ashrf; Simsek, Murat; Aydin, Zafer; 0000-0001-7686-6298; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydin, Zafer
Artificial neural networks (ANNs) have often been used for engineering design problems. In this work, an ANN-based inverse model of a reconfigurable N-shaped microstrip patch antenna is considered to find design parameters. For this task, knowledge-based response correction consists of two steps: generating a response using a multilayer perceptron as a first step, and correcting this response using knowledge-based methods such as source difference, prior knowledge input, and prior knowledge input with difference as a second step. The proposed antenna has four states of operation controlled by two Positive-Intrinsic-Negative (PIN) diodes with ON/OFF states. The two-step ANN models are inversely trained using the optimum of the resonant frequency parameter as the input and the physical dimensions of the proposed antenna as outputs of the multilayer perceptron.
The outputs and, in some methods, the input parameters of the multilayer perceptron are sent as input to the knowledge-based models, while the outputs obtained from the two steps are the new physical dimensions of the redesigned reconfigurable antenna, which are compared and analyzed. This input/output complexity of the proposed reconfigurable antenna allows an accurate and fast inverse model to be developed with less training data. Users may use this antenna and its ANN models to develop new products in the market, where any frequency in the operating region can be given as input to yield an appropriate form of the new reconfigurable antenna.

Article Network intrusion detection based on machine learning strategies: performance comparisons on imbalanced wired, wireless, and software-defined networking (SDN) network traffics (TÜBİTAK Academic Journals, 2024)
Hacilar, Hilal; Aydin, Zafer; Gungor, Vehbi Cagri; 0000-0025-8116-722; 0000-0001-7686-6298; 0000-0003-0803-8372; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Hacilar, Hilal; Aydin, Zafer; Gungor, Vehbi Cagri
The rapid growth of computer networks emphasizes the urgency of addressing security issues. Organizations rely on network intrusion detection systems (NIDSs) to protect sensitive data from unauthorized access and theft. These systems analyze network traffic to detect suspicious activities, such as attempted breaches or cyberattacks. However, existing studies lack a thorough assessment of class imbalances and classification performance for different types of network intrusions: wired, wireless, and software-defined networking (SDN). This research aims to fill this gap by examining these networks' imbalances, feature selection, and binary classification to enhance intrusion detection system efficiency. Various techniques such as SMOTE, ROS, ADASYN, and SMOTETomek are used to handle imbalanced datasets.
Additionally, eXtreme Gradient Boosting (XGBoost) identifies key features, and an autoencoder (AE) assists in feature extraction for the classification task. The study evaluates datasets such as AWID, UNSW, and InSDN, yielding the best results with different numbers of selected features. Bayesian optimization fine-tunes parameters, and diverse machine learning algorithms (SVM, kNN, XGBoost, random forest, ensemble classifiers, and autoencoders) are employed. The optimal results, considering F1-measure, overall accuracy, detection rate, and false alarm rate, have been achieved for the UNSW-NB15, preprocessed AWID, and InSDN datasets, with values of [0.9356, 0.9289, 0.9328, 0.07597], [0.997, 0.9995, 0.9999, 0.0171], and [0.9998, 0.9996, 0.9998, 0.0012], respectively. These findings demonstrate that combining Bayesian optimization with oversampling techniques significantly enhances classification performance across wired, wireless, and SDN networks when compared to previous research conducted on these datasets.

Article New modeling of reconfigurable microstrip antenna using hybrid structure of simulation driven and knowledge based artificial neural networks (PAMUKKALE UNIV, CAMPUS INCILIPINAR, DENIZLI, 20020, TURKEY, 2020)
Aoad, Ashrf; Aydin, Zafer; 0000-0001-7686-6298; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü
Knowledge-based modeling plays a critical role in embedding existing knowledge to improve modeling performance. Since a reconfigurable antenna can provide more operational frequencies than classical antennas, a knowledge-based hybrid structure is used in this work to obtain an efficient model and to produce optimum new models for a reconfigurable microstrip antenna. The hybrid structure consists of two phases. The first phase generates initial knowledge, which is used in the knowledge-based modeling structure to obtain design parameters.
An artificial neural network based multilayer perceptron can generate the necessary knowledge for a knowledge-based model after the training process. Knowledge-based modeling improves the accuracy of the initial model to determine design parameters corresponding to the design target. Source difference, prior knowledge input and prior knowledge input with difference can be applied to realize an efficient knowledge-based strategy. 3D-EM simulation generates the new model in terms of the design parameters of the proposed application. It has three switching states of operation, which are organized by two resistor circuits representing ON/OFF states. Switch positions and geometrical parameters can be used for satisfying design targets between 1 GHz and 6 GHz for efficient antenna design.

Conference Object A Novel Feature Design and Stacking Approach for Non-Technical Electricity Loss Detection (Institute of Electrical and Electronics Engineers Inc., 2018)
Aydin, Zafer; Gungor, Vehbi Cagri; 0000-0001-7686-6298; 0000-0003-0803-8372; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydin, Zafer; Gungor, Vehbi Cagri
Non-technical electricity losses continue to jeopardize the economic and social well-being of many countries. In this work, we develop machine learning classifiers that can identify anomalous electricity consumption in Turkey. Starting from weekly electricity usage data, we develop new features that capture statistical and frequency domain characteristics of the customers and their consumption patterns. We analyze the effect of reducing the number of feature descriptors through dimensionality reduction and feature selection techniques. To overcome the class imbalance problem, we implement several ensemble methods and compare their prediction accuracy to those of the standard classifiers.
The proposed features and the combined strengths of different classifiers bring significant improvements in performance metrics, which is demonstrated through detailed simulations on the shopping mall sector. We anticipate that advances in this field will contribute to the economies considerably.

Article Performance Analysis of Machine Learning and Bioinformatics Applications on High Performance Computing Systems (Akademik Perspektif Derneği, 2020)
Aydın, Zafer; 0000-0001-7686-6298; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydın, Zafer
Nowadays, it is becoming increasingly important to use the most efficient and most suitable computational resources for algorithmic tools that extract meaningful information from big data and make smart decisions. In this paper, a comparative analysis is provided for performance measurements of various machine learning and bioinformatics software including scikit-learn, Tensorflow, WEKA, libSVM, ThunderSVM, GMTK, PSI-BLAST, and HHblits with big data applications on different high performance computer systems and workstations. The programs are executed in a wide range of conditions such as single-core central processing unit (CPU), multi-core CPU, and graphical processing unit (GPU) depending on the availability of implementation. The optimum number of CPU cores is obtained for selected software. It is found that the running times depend on many factors including the CPU/GPU version, available RAM, the number of CPU cores allocated, and the algorithm used. If parallel implementations are available for a given software, the best running times are typically obtained by GPU, followed by multi-core CPU, and single-core CPU.
Though no single system performs better than the others in all applications studied, it is anticipated that the results obtained will help researchers and practitioners to select the most appropriate computational resources for their machine learning and bioinformatics projects.

Article Sentiment Analizinde Öznitelik Düşürme Yöntemlerinin Oto Kodlayıcılı Derin Öğrenme Makinaları ile Karşılaştırılması [Comparison of Feature Reduction Methods with Autoencoder-Based Deep Learning Machines in Sentiment Analysis] (Gazi Üniversitesi, 2017)
Kaynar, Oğuz; Aydın, Zafer; Görmez, Yasin; 0000-0001-7686-6298; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydın, Zafer
Because the internet is now used so intensively by all segments of society, people have begun to share their views, opinions and feelings through many channels such as social networking sites, forums and blogs. However, the ever-increasing amount and size of data make manually extracting meaningful information from it very laborious and expensive. Sentiment analysis is used to automatically detect whether data contains sentiment and to determine whether that sentiment is positive, negative or neutral. In sentiment and opinion analysis, the complexity of colloquial language and the large number and length of the texts evaluated lead to feature vectors with many redundant and noisy features. This situation, referred to as the dimensionality problem, increases computation time and causes classification errors. In this study, the deep learning based autoencoder model and denoising autoencoder model, proposed as a solution to these problems, were used as dimensionality reduction techniques and compared with other dimensionality reduction techniques widely used in the literature. For all resulting datasets, different models were developed using support vector machines and artificial neural networks as classification algorithms.
The analyses showed that dimensionality reduction techniques improve the results obtained for sentiment analysis, and that the proposed autoencoder models achieve results similar to or better than those of existing techniques.

Research Project Zenginleştirilmiş Öznitelikler ve Makine Öğrenmesi Yöntemleriyle Protein Yerel Yapı Tahmini [Protein Local Structure Prediction with Enriched Features and Machine Learning Methods] (TUBİTAK, 2017)
Aydın, Zafer; 0000-0001-7686-6298; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydın, Zafer
The aim of the project is to accurately predict one-dimensional structural properties of proteins, such as secondary structure, dihedral angles and solvent accessibility, and to develop a new method that performs fragment selection using these predictions. The methods developed will make it possible to predict the three-dimensional structure of proteins more accurately, to understand protein functions better and to design more effective drugs. To predict the one-dimensional structural properties, the two-stage hybrid classification method previously developed by the principal investigator was used. Various feature vectors, such as sequence-based profiles and structural profile matrices, were used for the classifiers in this method. For the second-stage classifier, various learning methods such as the support vector machine, deep KSA (convolutional neural network), random forest and ensembles were trained, and the prediction accuracies of the developed methods were examined on standard datasets. In addition, dimensionality reduction was performed at this stage with deep autoencoders and feature selection approaches. For protein fragment selection, methods were developed that predict whether two given amino acid sequence fragments are structurally similar. To this end, fragment pairs were sampled from the proteins in the fragment database of the Rosetta program, these pairs were labeled with the BCScore method, and training and test sets were constructed.
Different feature sets were also examined comprehensively with a concept hierarchy approach, and the feature combinations giving the best results were identified. For the fragment selection problem, fragments of 3 and 9 amino acids were studied, but the methods can easily be applied to fragments of other lengths. With the methods developed in the project, secondary structure prediction accuracy improved by 2.6% in the hardest prediction category, dihedral angle prediction accuracy improved substantially, and for solvent accessibility a level similar to the best methods in the literature was reached. In fragment selection, whether the structures of two given fragments are similar was predicted correctly at a rate of 94% for 3-mer fragments and 97% for 9-mers. The work concluded that better design of feature vectors and combining and optimizing different classification methods substantially improve the accuracy of structural property prediction.