Browsing by Author "Aydin, Zafer"
Now showing 1 - 20 of 37
Article BAUM-2: a multilingual audio-visual affective face database (Kluwer Academic Publishers (SpringerLink), 2015) Eroglu Erdem, Cigdem; Turan, Cigdem; Aydin, Zafer; 0000-0001-7686-6298; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydin, Zafer
Access to audio-visual databases that contain enough variety and are richly annotated is essential for assessing the performance of algorithms in affective computing applications, which require emotion recognition from face and/or speech data. Most databases available today have been recorded under tightly controlled environments, are mostly acted and do not contain speech data. We first present a semi-automatic method that can extract audio-visual facial video clips from movies and TV programs in any language. The method is based on automatic detection and tracking of faces in a movie until the face is occluded or a scene cut occurs. We also created a video-based database, named BAUM-2, which consists of annotated audio-visual facial clips in several languages. The collected clips simulate real-world conditions by containing various head poses, illumination conditions, accessories, temporary occlusions and subjects with a wide range of ages. The proposed semi-automatic affective clip extraction method can easily be used to extend the database to contain clips in other languages. We also created an image-based facial expression database from the peak frames of the video clips, named BAUM-2i.
Baseline image and video-based facial expression recognition results using state-of-the-art features and classifiers indicate that facial expression recognition under tough and close-to-natural conditions is quite challenging.

Conference Object Ceph-based Storage Server Application (IEEE, 345 E 47TH ST, NEW YORK, NY 10017 USA, 2018) Azginoglu, Nuh; Eren, Mehmet Akif; Celik, Mete; Aydin, Zafer; 0000-0002-4074-7366; 0000-0002-1488-1502; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü
Ceph is a scalable and high-performance distributed file system. In this study, a Ceph-based storage server was implemented and used actively. This storage system has been used as the disk of 40 virtual servers on 4 different Proxmox servers. Performance evaluation of the system has been conducted on virtual servers that run Windows- and Linux-based operating systems.

Conference Object Combining Classifiers for Protein Secondary Structure Prediction (IEEE, 345 E 47TH ST, NEW YORK, NY 10017 USA, 2017) Aydin, Zafer; Uzut, Ommu Gulsum; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü
Protein secondary structure prediction is an important step in estimating the three-dimensional structure of proteins. Among the many methods developed for predicting structural properties of proteins, hybrid classifiers and ensembles that combine predictions from several models have been shown to improve accuracy. In this paper, we train, optimize and combine a support vector machine, a deep convolutional neural field and a random forest in the second stage of a hybrid classifier for protein secondary structure prediction.
We demonstrate that the overall accuracy of the proposed ensemble is comparable to the success rates of the state-of-the-art methods in the most difficult prediction setting and that combining the selected models has the potential to further improve the accuracy of the base learners.

Article Comparative analysis of machine learning approaches for predicting respiratory virus infection and symptom severity (PEERJ INC, 2023) Aydin, Zafer; Isik, Yunus Emre; 0000-0001-7686-6298; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydin, Zafer
Respiratory diseases are among the major health problems causing a burden on hospitals. Diagnosis of infection and rapid prediction of severity without time-consuming clinical tests could be beneficial in preventing the spread and progression of the disease, especially in countries where health systems remain inadequate. Personalized medicine studies involving statistics and computer technologies could help to address this need. In addition to individual studies, competitions are also held, such as the Dialogue for Reverse Engineering Assessment and Methods (DREAM) Challenges, a community-driven organization with a mission to advance research in biology, bioinformatics, and biomedicine. One of these competitions was the Respiratory Viral DREAM Challenge, which aimed to develop early predictive biomarkers for respiratory virus infections. These efforts are promising; however, the prediction performance of the computational methods developed for detecting respiratory diseases still has room for improvement. In this study, we focused on improving the performance of predicting the infection and symptom severity of individuals infected with various respiratory viruses using gene expression data collected before and after exposure.
The publicly available gene expression dataset GSE73072 from the Gene Expression Omnibus, which contains samples exposed to four respiratory viruses (H1N1, H3N2, human rhinovirus (HRV), and respiratory syncytial virus (RSV)), was used as input data. Various preprocessing methods and machine learning algorithms were implemented and compared to achieve the best prediction performance. The experimental results showed that the proposed approaches obtained a prediction performance of 0.9746 area under the precision-recall curve (AUPRC) for infection (i.e., shedding) prediction (SC-1), 0.9182 AUPRC for symptom class prediction (SC-2), and 0.6733 Pearson correlation for symptom score prediction (SC-3), outperforming the best leaderboard scores of the Respiratory Viral DREAM Challenge (a 4.48% improvement for SC-1, a 13.68% improvement for SC-2, and a 13.98% improvement for SC-3). Additionally, over-representation analysis (ORA), which is a statistical method for objectively determining whether certain genes are more prevalent in pre-defined sets such as pathways, was applied using the most significant genes selected by feature selection methods. The results show that pathways associated with the ‘adaptive immune system’ and ‘immune disease’ are strongly linked to pre-infection and symptom development. These findings contribute to our knowledge about predicting respiratory infections and are expected to facilitate the development of future studies that concentrate on predicting not only infections but also the associated symptoms.

Other Comparison of Machine Learning Classifiers for Protein Secondary Structure Prediction (IEEE, 2018) Aydin, Zafer; Kaynar, Oguz; Gormez, Yasin; Isik, Yunus Emre; 0000-0001-7686-6298; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydin, Zafer
Predicting the three-dimensional structure of proteins is one of the important problems in theoretical chemistry and bioinformatics.
One of the most important steps of protein structure prediction is secondary structure prediction. As a result of the rapid growth of data in protein databases and the different feature extraction methods developed recently, the datasets used for secondary structure prediction are growing in both dimensionality and number of samples. For this reason, using prediction algorithms that run fast and attain a certain level of accuracy is becoming increasingly important. In this study, various classification algorithms for the second stage of a two-stage hybrid classifier were optimized on the EVAset dataset, both in the original feature space and in a space whose dimensionality was reduced using the information gain metric. According to the results, the most accurate prediction method was the support vector machine, while the fastest method in terms of model training time was the extreme learning machine.

Other Comparison of NR and UniClust Databases for Protein Secondary Structure Prediction (IEEE, 2018) Aydin, Zafer; Kaynar, Oguz; Gormez, Yasin; 0000-0001-7686-6298; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydin, Zafer
Predicting the three-dimensional structure of proteins is one of the important problems in theoretical chemistry and bioinformatics. One of the most important steps of three-dimensional structure prediction is secondary structure prediction. Improving the success rate of secondary structure prediction depends on the computed features as much as on the classification algorithm used. In the multiple alignment methods frequently used for feature extraction, the computed values vary with the database used for alignment. For this reason, choosing an appropriate database when constructing feature matrices is important. In this study, five different datasets were constructed from the CB513 dataset using two different alignment methods and three different databases, and these datasets were compared using a two-stage hybrid classifier.
According to the results, the best success rate was obtained when UniClust was used for the PSSM values computed in the first stage of the HHBlits alignment method and the NR database was used, again in the first stage of HHBlits, for the structural profile matrices.

Conference Object Constructing structural profiles for protein torsion angle prediction (SciTePress, 2015) Aydin, Zafer; Baker, David; Noble, William Stafford; 0000-0001-7686-6298; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydin, Zafer
Structural frequency profiles provide important constraints on structural aspects of a protein and are receiving growing interest in the structure prediction community. In this paper, we introduce new techniques for scoring templates that are later combined to form structural profiles of 7-state torsion angles. By employing various parameters of target-template alignments we improve the quality and accuracy of structural profiles considerably. The most effective technique is the scaling of templates by integer powers of the sequence identity score, in which the power parameter is adjusted with respect to the similarity interval of the target. Incorporating other alignment scores as multiplicative factors further improves the accuracy of profiles. After analyzing the individual strengths of various structural profile methods, we combine them with ab-initio predictions of 7-state torsion angles by a linear committee approach. We show that incorporating template information improves the accuracy of ab-initio predictions significantly at all levels of target-template similarity, even when templates are distant from the target.
Template scaling methods developed in this work can be applied in many other prediction tasks and in more advanced methods designed for computing structural profiles.

Article A Continuously Benchmarked and Crowdsourced Challenge for Rapid Development and Evaluation of Models to Predict COVID-19 Diagnosis and Hospitalization (AMER MEDICAL ASSOC, 330 N WABASH AVE, STE 39300, CHICAGO, IL 60611-5885, 2021) Yan, Yao; Schaffter, Thomas; Bergquist, Timothy; Yu, Thomas; Prosser, Justin; Aydin, Zafer; Jabeer, Amhar; Brugere, Ivan; Gao, Jifan; Chen, Guanhua; Causey, Jason; Yao, Yuxin; Bryson, Kevin; Long, Dustin R.; Jarvik, Jeffrey G.; Lee, Christoph I.; Wilcox, Adam; Guinney, Justin; Mooney, Sean; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydin, Zafer; Jabeer, Amhar
IMPORTANCE Machine learning could be used to predict the likelihood of diagnosis and severity of illness. Lack of COVID-19 patient data has hindered the data science community in developing models to aid in the response to the pandemic.
OBJECTIVES To describe the rapid development and evaluation of clinical algorithms to predict COVID-19 diagnosis and hospitalization using patient data by citizen scientists, provide an unbiased assessment of model performance, and benchmark model performance on subgroups.
DESIGN, SETTING, AND PARTICIPANTS This diagnostic and prognostic study operated a continuous, crowdsourced challenge using a model-to-data approach to securely enable the use of regularly updated COVID-19 patient data from the University of Washington by participants from May 6 to December 23, 2020. A postchallenge analysis was conducted from December 24, 2020, to April 7, 2021, to assess the generalizability of models on the cumulative data set as well as subgroups stratified by age, sex, race, and time of COVID-19 test. By December 23, 2020, this challenge engaged 482 participants from 90 teams and 7 countries.
MAIN OUTCOMES AND MEASURES Machine learning algorithms used patient data and output a score that represented the probability of patients receiving a positive COVID-19 test result or being hospitalized within 21 days after receiving a positive COVID-19 test result. Algorithms were evaluated using area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUPRC) scores. Ensemble models aggregating models from the top challenge teams were developed and evaluated.
RESULTS In the analysis using the cumulative data set, the best performance for COVID-19 diagnosis prediction was an AUROC of 0.776 (95% CI, 0.775-0.777) and an AUPRC of 0.297, and for hospitalization prediction, an AUROC of 0.796 (95% CI, 0.794-0.798) and an AUPRC of 0.188. Analysis of the top models submitted to the challenge showed consistently better model performance on the female group than the male group. Among all age groups, the best performance was obtained for the 25- to 49-year age group, and the worst performance was obtained for the group aged 17 years or younger.
CONCLUSIONS AND RELEVANCE In this diagnostic and prognostic study, models submitted by citizen scientists achieved high performance for the prediction of COVID-19 testing and hospitalization outcomes. Evaluation of challenge models on demographic subgroups and prospective data revealed performance discrepancies, providing insights into the potential bias and limitations in the models.

Article Crowdsourcing digital health measures to predict Parkinson's disease severity: the Parkinson's Disease Digital Biomarker DREAM Challenge (NATURE RESEARCH, HEIDELBERGER PLATZ 3, BERLIN 14197, GERMANY, 2021) Aydin, Zafer; Sieberts, Solveig K.; Schaff, Jennifer; Duda, Marlena; Pataki, Balint Armin; Sun, Ming; Snyder, Phil; Daneault, Jean-Francois; Parisi, Federico; Costante, Gianluca; Rubin, Udi; Banda, Peter; Chae, Yooree; Chaibub Neto, Elias; Dorsey, E.
Ray; Chen, Aipeng; Elo, Laura L.; Espino, Carlos; Glaab, Enrico; Goan, Ethan; Golabchi, Fatemeh Noushin; Gormez, Yasin; Jaakkola, Maria K.; Jonnagaddala, Jitendra; Klen, Riku; Li, Dongmei; McDaniel, Christian; Perrin, Dimitri; Perumal, Thanneer M.; Rad, Nastaran Mohammadian; Rainaldi, Erin; Sapienza, Stefano; Schwab, Patrick; Shokhirev, Nikolai; Venalainen, Mikko S.; Vergara-Diaz, Gloria; Zhang, Yuqian; Wang, Yuanjia; Guan, Yuanfang; Brunner, Daniela; Bonato, Paolo; Mangravite, Lara M.; Omberg, Larsson; AGÜ, Mühendislik Fakültesi, Elektrik - Elektronik Mühendisliği Bölümü; Aydin, Zafer
Consumer wearables and sensors are a rich source of data about patients' daily disease and symptom burden, particularly in the case of movement disorders like Parkinson's disease (PD). However, interpreting these complex data into so-called digital biomarkers requires complicated analytical approaches, and validating these biomarkers requires sufficient data and unbiased evaluation methods. Here we describe the use of crowdsourcing to specifically evaluate and benchmark features derived from accelerometer and gyroscope data in two different datasets to predict the presence of PD and severity of three PD symptoms: tremor, dyskinesia, and bradykinesia. Forty teams from around the world submitted features, and achieved drastically improved predictive performance for PD status (best AUROC = 0.87), as well as tremor- (best AUPR = 0.75), dyskinesia- (best AUPR = 0.48) and bradykinesia-severity (best AUPR = 0.95).

Conference Object Data Mining Techniques in Direct Marketing on Imbalanced Data using Tomek Link Combined with Random Under-sampling (Association for Computing Machinery, 2021) Ümit Yilmaz; Zafer Aydin; V. Çağri Güngör; Cengiz Gezer; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Yilmaz, Ümit; Aydin, Zafer; Güngör, V. Çağri
Determining the potential customers is very important in direct marketing.
Data mining techniques are among the most important methods for companies to determine potential customers. However, since the number of potential customers is very low compared to the number of non-potential customers, there is a class imbalance problem that significantly affects the performance of data mining techniques. In this paper, different combinations of basic and advanced resampling techniques such as Synthetic Minority Oversampling Technique (SMOTE), Tomek Link, random under-sampling (RUS), and random over-sampling (ROS) were evaluated to improve the performance of customer classification. Different feature selection techniques, such as Information Gain, Gain Ratio, Chi-squared, and Relief, are used to reduce the number of non-informative features in the data. Classification performance was compared across several data mining techniques, such as LightGBM, XGBoost, Gradient Boost, Random Forest, AdaBoost, ANN, Logistic Regression, Decision Trees, SVC, and Bagging Classifier, based on ROC AUC and sensitivity metrics. A combination of Tomek Link and Random Under-Sampling as the resampling technique and the Chi-squared method as the feature selection algorithm showed superior performance among the combinations. Detailed performance evaluations demonstrated that, with the proposed approach, LightGBM, which is a gradient boosting algorithm based on decision trees, gave the best results among the classifiers, with 0.947 sensitivity and 0.896 ROC AUC. © 2021 ACM.

Article A Deep Ensemble Approach for Long-Term Traffic Flow Prediction (SPRINGER, 2024) Cini, Nevin; Aydin, Zafer; 0000-0001-5348-4043; 0000-0001-7686-6298; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Cini, Nevin; Aydin, Zafer
In the last 50 years, with the growth of cities and the increase in the number of vehicles and mobility, traffic has become troublesome. As a result, traffic flow prediction started to attract attention as an important research area.
However, despite the extensive literature, traffic flow prediction remains an open research problem, specifically for long-term traffic flow prediction. Compared to the models developed for short-term traffic flow prediction, the number of models developed for long-term traffic flow prediction is very small. Based on this shortcoming, in this study, we focus on long-term traffic flow prediction and propose a novel deep ensemble model (DEM). In order to build this ensemble model, first, we developed a convolutional neural network (CNN), a long short-term memory (LSTM) network and a gated recurrent unit (GRU) network as deep learning models, which formed the base learners. In the next step, we combine the output of these models according to their individual forecasting success. We use another deep learning model to determine the success of the individual models. Our proposed model is a flexible ensemble prediction model that can be updated based on traffic data. To evaluate the performance of the proposed model, we use a publicly available dataset. Experimental results show that the developed DEM has a mean square error of 0.06 and a mean absolute error of 0.15 for single-step prediction; it achieves a mean square error of 0.25 and a mean absolute error of 0.32 for multi-step prediction. We compared our proposed model with many models in different categories: individual deep learning models (i.e., LSTM, CNN, GRU), selected traditional machine learning models (i.e., linear regression, decision tree regression, k-nearest-neighbors regression) and other ensemble models such as random-forest regression.
These results also support the claim that ensemble learning models perform better than individual models.

Article A deep learning approach with Bayesian optimization and ensemble classifiers for detecting denial of service attacks (WILEY, 111 RIVER ST, HOBOKEN 07030-5774, NJ USA, 2020) Gormez, Yasin; Aydin, Zafer; Karademir, Ramazan; Gungor, Vehbi C.; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü
Detecting malicious behavior is important for preventing security threats in a computer network. Denial of Service (DoS) is among the popular cyber attacks targeted at web sites of high-profile organizations and can potentially have high economic and time costs. In this paper, several machine learning methods including ensemble models and autoencoder-based deep learning classifiers are compared and tuned using Bayesian optimization. The autoencoder framework enables the extraction of new features by mapping the original input to a new space. The methods are trained and tested for both binary and multi-class classification on the Digiturk and Labris datasets, which were introduced recently for detecting various types of DDoS attacks. The best performing methods are found to be ensembles, though deep learning classifiers achieved a comparable level of accuracy.

Conference Object Design of a Tri band 5-Fingers Shaped Microstrip Patch Antenna with an Adjustable Resistor (IEEE, 2014) Aoad, Ashrf; Aydin, Zafer; Korkmaz, Erdal; 0000-0001-7686-6298; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydin, Zafer
This paper presents a tri-band 5-fingers shaped microstrip patch antenna, which initially resonates in dual bands at 3.2 GHz and 5.2 GHz for VSWR < 2. The antenna is modified by adding an adjustable resistor between the conductor and the reflecting plane, giving a third resonant frequency of 2.4 GHz. A decrease in the return loss at 2.4 GHz is observed by modifying the value of the resistance.
Impedance bandwidth and the resonant frequencies are examined with respect to the variability of the parameters of the antenna and the position of the adjustable resistor. The size of the antenna has been reduced by 57.9% in length and 14.06% in width. The proposed antenna can be used for 4G, WLAN, and Wi-MAX. The antenna is designed and optimized using the commercial CST software.

Article The Determination of Distinctive Single Nucleotide Polymorphism Sets for the Diagnosis of Behçet's Disease (Institute of Electrical and Electronics Engineers Inc., 2021) Isik, Yunus Emre; Gormez, Yasin; Aydin, Zafer; Bakir-Gungor, Burcu; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydin, Zafer; Bakir-Gungor, Burcu
Behçet's Disease (BD) is a multi-system inflammatory disorder whose etiology remains unclear. The most probable hypothesis is that genetic tendency and environmental factors play roles in the development of BD. In order to find the essential causes, genetic changes in thousands of genes should be analyzed. In addition, further analysis is needed to find out which genetic factors affect the disease. Machine learning approaches have high potential for extracting knowledge from genomics and selecting the representative Single Nucleotide Polymorphisms (SNPs) as the most effective features for the clinical diagnosis process. In this study, we attempted to identify representative SNPs using feature selection methods that incorporate biological information, and aimed to develop a machine learning model for diagnosing Behçet's disease. By combining biological information and machine learning classifiers, up to 99.64% accuracy of disease prediction is achieved using only 13,611 out of 311,459 SNPs. In addition, we revealed the SNPs that are most distinctive by performing repeated feature selection in cross-validation experiments.
IEEE

Article Developing structural profile matrices for protein secondary structure and solvent accessibility prediction (OXFORD UNIV PRESS, GREAT CLARENDON ST, OXFORD OX2 6DP, ENGLAND, 2019) Aydin, Zafer; Azginoglu, Nuh; Bilgin, Halil Ibrahim; Celik, Mete; 0000-0002-4074-7366; 0000-0002-1488-1502; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü
Motivation: Predicting secondary structure and solvent accessibility of proteins are among the essential steps that precede more elaborate 3D structure prediction tasks. Incorporating class label information contained in templates with known structures has the potential to improve the accuracy of prediction methods. Building a structural profile matrix is one such technique that provides a distribution for class labels at each amino acid position of the target.
Results: In this paper, a new structural profiling technique is proposed that is based on deriving PFAM families and is combined with an existing approach. Cross-validation experiments on two benchmark datasets and at various similarity intervals demonstrate that the proposed profiling strategy performs significantly better than Homolpro, a state-of-the-art method for incorporating template information, as assessed by statistical hypothesis tests.

Conference Object Development of Knowledge Based Response Correction for a Reconfigurable N-Shaped Microstrip Antenna Design (IEEE, 2015) Aoad, Ashrf; Simsek, Murat; Aydin, Zafer; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydin, Zafer
This study presents the use of prior knowledge in inverse artificial neural networks (ANNs) to model and optimize a reconfigurable N-shaped microstrip antenna. Three accurate prior knowledge inverse ANNs with a large amount of training data are proposed, where the frequency information is incorporated into the structure of the ANN. The complexity of the input/output relationship is reduced by using prior knowledge.
Three separate methods of incorporating knowledge in the second step of the training process, with a multilayer perceptron (MLP) in the first step, are demonstrated and their results are compared to EM simulation.

Article Dimensionality reduction for protein secondary structure and solvent accesibility prediction (IMPERIAL COLLEGE PRESS, 57 SHELTON ST, COVENT GARDEN, LONDON WC2H 9HE, ENGLAND, 2018) Aydin, Zafer; Kaynar, Oguz; Gormez, Yasin; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü
Secondary structure and solvent accessibility prediction provide valuable information for estimating the three-dimensional structure of a protein. As new feature extraction methods are developed, the dimensionality of the input feature space increases steadily. Reducing the number of dimensions provides several advantages such as faster model training, faster prediction and noise elimination. In this work, several dimensionality reduction techniques have been employed, including various feature selection methods, autoencoders and PCA, for protein secondary structure and solvent accessibility prediction. The reduced feature set is used to train a support vector machine at the second stage of a hybrid classifier. Cross-validation experiments on two difficult benchmarks demonstrate that the dimension of the input space can be reduced substantially while maintaining the prediction accuracy. This will enable the incorporation of additional informative features derived for predicting the structural properties of proteins without reducing the accuracy due to overfitting.

Article Effect of interpolation on specular reflections in texture-based automatic colonic polyp detection (WILEY, 111 RIVER ST, HOBOKEN 07030-5774, NJ USA, 2020) Kacmaz, Rukiye Nur; Yilmaz, Bulent; Aydin, Zafer; 0000-0002-3237-9997; AGÜ, Mühendislik Fakültesi, Elektrik - Elektronik Mühendisliği Bölümü
Reflections of LED light cause unwanted noise effects called specular reflection (SR) on colonoscopic images.
The aim of this study was to seek answers to the following two questions. (a) How are the texture features used in automatic detection of polyps affected by interpolation on specular reflections? (b) If they are affected, does this really affect the classification performance? In order to answer these questions, we used 610 colonoscopy images and divided each image into tiles of 32-by-32 pixels. From these tiles, we selected the ones without any specular reflection. We added specular reflections of different shapes and sizes, cropped from real images, onto the reflection-free tiles. We then applied nearest-neighbor, bilinear and bicubic interpolation techniques to the tiles on which SRs were added. From these tiles we extracted 116 texture features using 3 second-order approaches and 4 first-order statistics. First, we used a paired-sample t-test. Second, we performed automatic classification of polyps and background with random forest and k-nearest neighbors (k-NN) approaches, using the texture features for different combinations of specular reflections added on the tiles from the polyp or background. The results showed that, depending on the size of the specular reflection, interpolation can cause a significant difference between the texture features that come from reflection-free tiles and the same tiles on which interpolation was performed. In addition, we note that bicubic interpolation may be preferred to eliminate specular reflection when texture features are used for background and polyp discrimination.

Article An effective colorectal polyp classification for histopathological images based on supervised contrastive learning (ELSEVIER, 2024) Yengec-Tasdemir, Sena Busra; Aydin, Zafer; Akay, Ebru; Dogan, Serkan; Yilmaz, Bulent; 0000-0001-7686-6298; 0000-0003-2954-1217; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü; Aydin, Zafer; Yilmaz, Bulent
Early detection of colon adenomatous polyps is pivotal in reducing colon cancer risk.
In this context, accurately distinguishing between adenomatous polyp subtypes, especially tubular and tubulovillous, from hyperplastic variants is crucial. This study introduces a cutting-edge computer-aided diagnosis system optimized for this task. Our system employs advanced Supervised Contrastive learning to ensure precise classification of colon histopathology images. Significantly, we have integrated the Big Transfer model, which has gained prominence for its exemplary adaptability to visual tasks in medical imaging. Our novel approach discerns between in-class and out-of-class images, thereby elevating its discriminatory power for polyp subtypes. We validated our system using two datasets: a specially curated one and the publicly accessible UniToPatho dataset. The results reveal that our model markedly surpasses traditional deep convolutional neural networks, registering classification accuracies of 87.1% and 70.3% for the custom and UniToPatho datasets, respectively. Such results emphasize the transformative potential of our model in polyp classification endeavors.

Conference Object Feature Selection for Protein Dihedral Angle Prediction (IEEE, 345 E 47TH ST, NEW YORK, NY 10017 USA, 2017) Aydin, Zafer; Kaynar, Oguz; Gormez, Yasin; AGÜ, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü
Three-dimensional structure prediction has crucial importance for bioinformatics and theoretical chemistry. One of the main steps of three-dimensional structure prediction is dihedral (torsion) angle prediction. As new feature extraction methods are developed, the dimension of the input space increases considerably, yielding longer model training and less accurate models due to noisy or redundant features. In this study, feature selection is employed for dimensionality reduction on one of the established benchmarks of protein 1D structure prediction.
Experimental results show that feature selection improves the accuracy of protein dihedral angle class prediction by 2% and can eliminate up to 82% of the features when a random forest classifier is used. Accurate prediction of dihedral angles will eventually contribute to protein structure prediction.