Scopus Indexed Publications Collection

Permanent URI for this collection: https://hdl.handle.net/20.500.12573/395

Search Results

Now showing 1 - 10 of 33
  • Article
    Citation - WoS: 6
    Citation - Scopus: 7
    The Determination of Distinctive Single Nucleotide Polymorphism Sets for the Diagnosis of Behcet's Disease
    (IEEE Computer Soc, 2022-05-01) Isik, Yunus Emre; Gormez, Yasin; Aydin, Zafer; Bakir-Gungor, Burcu
    Behcet's Disease (BD) is a multi-system inflammatory disorder in which the etiology remains unclear. The most probable hypothesis is that genetic tendency and environmental factors play roles in the development of BD. In order to find the essential reasons, genetic changes on thousands of genes should be analyzed. Besides, there is a need for extra analysis to find out which genetic factor affects the disease. Machine learning approaches have high potential for extracting the knowledge from genomics and selecting the representative Single Nucleotide Polymorphisms (SNPs) as the most effective features for the clinical diagnosis process. In this study, we have attempted to identify representative SNPs using feature selection methods, incorporating biological information and aimed to develop a machine-learning model for diagnosing Behcet's disease. By combining biological information and machine learning classifiers, up to 99.64 percent accuracy of disease prediction is achieved using only 13,611 out of 311,459 SNPs. In addition, we revealed the SNPs that are most distinctive by performing repeated feature selection in cross-validation experiments.
  • Conference Object
    Citation - WoS: 3
    Citation - Scopus: 3
    Template Scoring Methods for Protein Torsion Angle Prediction
    (Springer-Verlag Berlin, 2015) Aydin, Zafer; Baker, David; Noble, William Stafford
    Prediction of backbone torsion angles provides important constraints on the 3D structure of a protein and is receiving growing interest in the structure prediction community. In this paper, we introduce a three-stage machine learning classifier to predict the 7-state torsion angles of a protein. The first two stages employ dynamic Bayesian and neural networks to produce an ab-initio prediction of torsion angle states starting from sequence profiles. The third stage is a committee classifier, which combines the ab-initio prediction with a structural frequency profile derived from templates obtained by HHsearch. We develop several structural profile models and obtain significant improvements over the Laplacian scoring technique through: (1) scaling templates by integer powers of sequence identity score, (2) incorporating other alignment scores as multiplicative factors, and (3) adjusting or optimizing parameters of the profile models with respect to the similarity interval of the target. We also demonstrate that the torsion angle prediction accuracy improves at all levels of target-template similarity even when templates are distant from the target. The improvement grows significantly as template structures get closer to the target.
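    The template weighting idea in item (1) of this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the state alphabet `STATES`, the function name `structural_profile`, the `pseudocount` smoothing term, and the default `power=2` are all assumptions made here for illustration. Each template votes for the state it aligns to a position, with a weight equal to its sequence identity raised to an integer power.

```python
STATES = "ABCDEFG"  # hypothetical labels for the 7 torsion angle states


def structural_profile(templates, length, power=2, pseudocount=1.0):
    """Build a per-position frequency profile over STATES.

    templates: list of (identity, aligned_states) pairs, where identity is a
    sequence identity score in [0, 1] and aligned_states is a string of
    `length` characters drawn from STATES, or '-' where the template has no
    aligned residue. Each template votes with weight identity ** power.
    The pseudocount is an assumed smoothing term, not taken from the paper.
    """
    profile = []
    for pos in range(length):
        counts = {s: pseudocount for s in STATES}
        for identity, aligned in templates:
            state = aligned[pos]
            if state != "-":
                counts[state] += identity ** power
        total = sum(counts.values())
        profile.append({s: counts[s] / total for s in STATES})
    return profile


# Two hypothetical templates: a close one (0.9) and a distant one (0.3).
templates = [(0.9, "AB-"), (0.3, "AAB")]
profile = structural_profile(templates, length=3)
```

    With a higher `power`, distant templates contribute less, so at the second position the close template's state `B` outweighs the distant template's state `A`.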
  • Article
    Citation - WoS: 2
    Citation - Scopus: 2
    Structural Profile Matrices for Predicting Structural Properties of Proteins
    (World Scientific Publ Co Pte Ltd, 2020-07-10) Azginoglu, Nuh; Aydin, Zafer; Celik, Mete
    Predicting structural properties of proteins plays a key role in predicting the 3D structure of proteins. In this study, new structural profile matrices (SPM) are developed for protein secondary structure, solvent accessibility and torsion angle class predictions, which could be used as input to 3D prediction algorithms. The structural templates employed in computing SPMs are detected by eight alignment methods in LOMETS server, gap affine alignment method, ScanProsite, PfamScan, and HHblits. The contribution of each template is weighted by its similarity to target, which is assessed by several sequence alignment scores. For comparison, the SPMs are also computed using Homolpro, which uses BLAST for target template alignments and does not assign weights to templates. Incorporating the SPMs into DSPRED classifier, the prediction accuracy improves significantly as demonstrated by cross-validation experiments on two difficult benchmarks. The most accurate predictions are obtained using the SPMs derived by threading methods in LOMETS server. On the other hand, the computational cost of computing these SPMs was the highest.
  • Conference Object
    Citation - WoS: 1
    Citation - Scopus: 8
    Short Term Electricity Load Forecasting: A Case Study of Electric Utility Market in Turkey
    (Institute of Electrical and Electronics Engineers Inc., 2015-04) Ishik, Muhammed Yasin; Göze, Tolga; Ozcan, Ihsan; Güngör, Vehbi Çağrı; Aydin, Zafer; Yasin, Muhammed
    With the recent developments in the energy sector, the pricing of electricity is now governed by the spot market, where a variety of market mechanisms are in effect. After the new legislation on market liberalization in Turkey, competition based on hourly prices has received growing interest in the energy market, which has required generators and electric utility companies to add new dimensions to their scope of operation: short-term load and price forecasting. The field has several opportunities, though it is not free from challenges. The dynamic behavior of the market price has caused the electric load to become variable and non-stationary. Furthermore, the number of nodes at which the load must be predicted is no longer constant and can no longer be estimated by experts alone. In this competitive scenario, statistical forecasting methods that can automatically and accurately process thousands of data samples are essential. The purpose of this study is to demonstrate the importance of short-term load forecasting, to show how it has received growing interest in Turkey, and to propose an artificial neural network that can forecast the short-term electricity load. Through detailed performance evaluations, we demonstrate that our forecasting method is capable of predicting the hourly load accurately. © 2017 Elsevier B.V., All rights reserved.
  • Article
    Citation - WoS: 4
    Citation - Scopus: 4
    Sample Reduction Strategies for Protein Secondary Structure Prediction
    (MDPI, 2019-10-18) Atasever, Sema; Aydin, Zafer; Erbay, Hasan; Sabzekar, Mostafa
    Predicting the secondary structure from protein sequence plays a crucial role in estimating the 3D structure, which has applications in drug design and in understanding the function of proteins. As new genes and proteins are discovered, the size of the protein databases and datasets that can be used for training prediction models grows considerably. A two-stage hybrid classifier, which employs dynamic Bayesian networks and a support vector machine (SVM), has been shown to provide state-of-the-art prediction accuracy for protein secondary structure prediction. However, SVM is not efficient for large datasets due to the quadratic optimization involved in model training. In this paper, two techniques are implemented on the CB513 benchmark for reducing the number of samples in the train set of the SVM. The first method randomly selects a fraction of data samples from the train set using a stratified selection strategy. This approach can remove approximately 50% of the data samples from the train set and reduce the model training time by 73.38% on average without decreasing the prediction accuracy significantly. The second method clusters the data samples by a hierarchical clustering algorithm and replaces the train set samples with nearest neighbors of the cluster centers in order to improve the training time. The number of clusters and the number of nearest neighbors are optimized as hyper-parameters by computing the prediction accuracy on validation sets. It is found that clustering can reduce the size of the train set by 26% without reducing the prediction accuracy. Among the clustering techniques, Ward's method provided the best accuracy on test data.
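    The first reduction method in this abstract, stratified random selection, can be sketched as follows. This is an illustrative sketch only: the function name `stratified_subsample`, the `seed` parameter, and the toy class labels are assumptions, and the paper's exact selection procedure may differ. The idea is to draw the same fraction of samples from every class so that class proportions are preserved in the reduced train set.

```python
import random
from collections import defaultdict


def stratified_subsample(labels, fraction, seed=0):
    """Return sorted indices of a random subset that keeps roughly
    `fraction` of the samples from every class (at least one per class)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    kept = []
    for indices in by_class.values():
        n_keep = max(1, round(fraction * len(indices)))
        kept.extend(rng.sample(indices, n_keep))
    return sorted(kept)


# Toy labels standing in for helix (H), strand (E) and coil (C) samples.
labels = ["H"] * 60 + ["E"] * 30 + ["C"] * 10
kept = stratified_subsample(labels, fraction=0.5)
reduced_labels = [labels[i] for i in kept]
```

    Here the reduced set keeps half of each class, so the 6:3:1 class ratio of the toy data is unchanged.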
  • Conference Object
    Citation - WoS: 11
    Citation - Scopus: 20
    ROI Detection in Mammogram Images Using Wavelet-Based Haralick and Hog Features
    (IEEE, 2018-12) Tasdemir, Sena Busra Yengec; Tasdemir, Kasim; Aydin, Zafer; Yengec Tasdemir, Sena Busra
    Digital mammography is a widespread medical imaging technique that is used for early detection and diagnosis of breast cancer. Detecting the region of interest (ROI) helps to locate the abnormal areas, which may be analyzed further by a radiologist or a CAD system. In this paper, a new classification method is proposed for ROI detection in mammography images. Features are extracted using Wavelet transform, Haralick and HOG descriptors. To reduce the number of dimensions and eliminate irrelevant features, a wrapper-based feature selection method is implemented. Several feature extraction methods and machine learning classifiers are compared by performing a leave-one-image-out cross-validation experiment on a difficult dataset. The proposed feature extraction method provides the best accuracy of 87.5% and the second-best area under curve (AUC) score of 84% when employed in a random forest classifier.
  • Article
    Citation - WoS: 7
    Citation - Scopus: 11
    Protein β-Sheet Prediction Using an Efficient Dynamic Programming Algorithm
    (Elsevier Sci Ltd, 2017-10) Sabzekar, Mostafa; Naghibzadeh, Mahmoud; Eghdami, Mandie; Aydin, Zafer
    Predicting the beta-sheet structure of a protein is one of the most important intermediate steps towards the identification of its tertiary structure. However, it is regarded as the primary bottleneck due to the presence of non-local interactions between several discontinuous regions in beta-sheets. To achieve reliable long-range interactions, a promising approach is to enumerate and rank all beta-sheet conformations for a given protein and find the one with the highest score. The problem with this solution is that the search space grows exponentially with the number of beta-strands. Additionally, brute-force calculation in this conformational space leads to a combinatorial explosion with intractable computational complexity. The main contribution of this paper is to generate and search the space of the problem efficiently to reduce the time complexity. To achieve this, two tree structures, called sheet-tree and grouping-tree, are proposed. They model the search space by breaking it into sub-problems. Then, an advanced dynamic programming algorithm is proposed that stores the intermediate results, avoids repetitive calculation by reusing them in successive steps, and reduces the space of the problem by removing intermediate results that will no longer be required in later steps. As a consequence, the following contributions have been made. Firstly, more accurate beta-sheet structures are found by searching all possible conformations, and secondly, the time complexity of the problem is reduced by searching the space of the problem efficiently, which makes the proposed method applicable to predicting beta-sheet structures with a high number of beta-strands. Experimental results on the BetaSheet916 dataset showed significant improvements of the proposed method in both execution time and prediction accuracy in comparison with the state-of-the-art beta-sheet structure prediction methods. Moreover, we investigate the effect of different contact map predictors on the performance of the proposed method using the BetaSheet1452 dataset. The source code is available at http://www.conceptsgate.com/BetaTop.rar. (C) 2017 Elsevier Ltd. All rights reserved.
  • Article
    Citation - Scopus: 6
    Network Intrusion Detection Based on Machine Learning Strategies: Performance Comparisons on Imbalanced Wired, Wireless, and Software-Defined Networking (SDN) Network Traffics
    (Turkiye Klinikleri, 2024-07-26) Hacilar, Hilal; Aydin, Zafer; Güngör, Vehbi Çağrı
    The rapid growth of computer networks emphasizes the urgency of addressing security issues. Organizations rely on network intrusion detection systems (NIDSs) to protect sensitive data from unauthorized access and theft. These systems analyze network traffic to detect suspicious activities, such as attempted breaches or cyberattacks. However, existing studies lack a thorough assessment of class imbalances and classification performance for different types of network intrusions: wired, wireless, and software-defined networking (SDN). This research aims to fill this gap by examining these networks’ imbalances, feature selection, and binary classification to enhance intrusion detection system efficiency. Various techniques such as SMOTE, ROS, ADASYN, and SMOTETomek are used to handle imbalanced datasets. Additionally, eXtreme Gradient Boosting (XGBoost) identifies key features, and an autoencoder (AE) assists in feature extraction for the classification task. The study evaluates datasets such as AWID, UNSW, and InSDN, yielding the best results with different numbers of selected features. Bayesian optimization fine-tunes parameters, and diverse machine learning algorithms (SVM, kNN, XGBoost, random forest, ensemble classifiers, and autoencoders) are employed. The optimal results, considering F1-measure, overall accuracy, detection rate, and false alarm rate, have been achieved for the UNSW-NB15, preprocessed AWID, and InSDN datasets, with values of [0.9356, 0.9289, 0.9328, 0.07597], [0.997, 0.9995, 0.9999, 0.0171], and [0.9998, 0.9996, 0.9998, 0.0012], respectively. These findings demonstrate that combining Bayesian optimization with oversampling techniques significantly enhances classification performance across wired, wireless, and SDN networks when compared to previous research conducted on these datasets. © 2024 Elsevier B.V., All rights reserved.
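    The four metrics reported in this abstract (F1-measure, overall accuracy, detection rate, and false alarm rate) can be computed from a binary confusion matrix. The sketch below assumes the usual definitions, with detection rate taken as recall on the attack class and false alarm rate as the fraction of normal traffic flagged as an attack; the abstract does not spell these out, and the function name `binary_metrics` and the toy labels are hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute F1, accuracy, detection rate and false alarm rate
    for binary labels, treating `positive` as the attack class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    detection_rate = tp / (tp + fn) if tp + fn else 0.0  # recall
    false_alarm_rate = fp / (fp + tn) if fp + tn else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = precision + detection_rate
    f1 = 2 * precision * detection_rate / denom if denom else 0.0
    return {
        "f1": f1,
        "accuracy": accuracy,
        "detection_rate": detection_rate,
        "false_alarm_rate": false_alarm_rate,
    }


# Toy example: 3 attacks (1) and 5 normal flows (0).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```

    On imbalanced traffic, accuracy alone can look high even when few attacks are caught, which is why the detection rate and false alarm rate are reported alongside it.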
  • Article
    Citation - WoS: 26
    Citation - Scopus: 48
    Metabolic Imaging Based Sub-Classification of Lung Cancer
    (IEEE-Inst Electrical Electronics Engineers Inc, 2020) Bicakci, Mustafa; Ayyildiz, Oguzhan; Aydin, Zafer; Basturk, Alper; Karacavus, Seyhan; Yilmaz, Bulent
    Lung cancer is one of the deadliest cancer types, and 84% of lung cancers are non-small cell lung cancer (NSCLC). In this study, deep learning-based classification methods were investigated comprehensively to differentiate two subtypes of NSCLC, namely adenocarcinoma (ADC) and squamous cell carcinoma (SqCC). The study used 1457 F-18-FDG PET images/slices with tumor from 94 patients (88 men), 38 of whom had ADC and the rest SqCC. Three experiments were carried out to examine the contribution of peritumoral areas in PET images to subtype classification of tumors. We assessed a multilayer perceptron (MLP) and three convolutional neural network (CNN) models, namely SqueezeNet, VGG16 and VGG19, using three kinds of images in these experiments: 1) whole slices without cropping or segmentation, 2) cropped image portions (square subimages) that include the tumor, and 3) segmented image portions corresponding to tumors using the random walk method. Several optimizers and regularization methods were used to optimize each model for the diagnostic classification. The classification models were trained and evaluated by performing stratified 10-fold cross validation, and F-score and area-under-curve (AUC) metrics were used to quantify the performance. According to our results, inclusion of peritumoral regions/tissues both contributes to the success of models and makes segmentation effort unnecessary. To the best of our knowledge, deep learning-based models have not been applied to the subtype classification of NSCLC in PET imaging; therefore, this study is a significant cornerstone providing thorough comparisons and evaluations of several deep learning models on metabolic imaging for lung cancer. Even simpler deep learning models are found promising in this domain, indicating that improvements in deep learning models from the machine learning community can carry over to this domain as well.
  • Article
    Citation - WoS: 8
    Citation - Scopus: 10
    Lung Cancer Subtype Differentiation From Positron Emission Tomography Images
    (Tubitak Scientific & Technological Research Council Turkey, 2020-01-27) Ayyildiz, Oguzhan; Aydin, Zafer; Yilmaz, Bulent; Karacavus, Seyhan; Senkaya, Kubra; Icer, Semra; Kaya, Eser; Taşdemir, Arzu
    Lung cancer is one of the deadly cancer types, and almost 85% of lung cancers are nonsmall cell lung cancer (NSCLC). In the present study we investigated classification and feature selection methods for the differentiation of two subtypes of NSCLC, namely adenocarcinoma (ADC) and squamous cell carcinoma (SqCC). The major advances in understanding the effects of therapy agents suggest that future targeted therapies will be increasingly subtype specific. We obtained positron emission tomography (PET) images of 93 patients with NSCLC, 39 of which had ADC while the rest had SqCC. Random walk segmentation was applied to delineate three-dimensional tumor volume, and 39 texture features were extracted to grade the tumor subtypes. We examined 11 classifiers with two different feature selection methods and the effect of normalization on accuracy. The classifiers we used were the k-nearest-neighbor, logistic regression, support vector machine, Bayesian network, decision tree, radial basis function network, random forest, AdaBoostM1, and three stacking methods. To evaluate the prediction accuracy we performed a leave-one-out cross-validation experiment on the dataset. We also considered optimizing certain hyperparameters of these models by performing 10-fold cross-validation separately on each training set. We found that the stacking ensemble classifier, which combines a decision tree, AdaBoostM1, and logistic regression methods by a metalearner, was the most accurate method for detecting subtypes of NSCLC, and normalization of feature sets improved the accuracy of the classification method.
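    The evaluation protocol in this abstract, leave-one-out cross-validation with and without feature normalization, can be sketched in plain Python. This is a simplified illustration, not the study's pipeline: a 1-nearest-neighbor classifier stands in for the 11 classifiers, min-max scaling is an assumed choice of normalization, and all function names and the toy feature values are hypothetical. The scaling bounds are fitted on the training fold only, so the held-out sample does not leak into preprocessing.

```python
import math


def min_max_fit(rows):
    """Return (min, max) per feature column of the training rows."""
    return [(min(col), max(col)) for col in zip(*rows)]


def min_max_apply(row, bounds):
    """Scale one row with previously fitted bounds."""
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, (lo, hi) in zip(row, bounds)]


def nearest_neighbor_predict(train_x, train_y, query):
    """Return the label of the training row closest to the query."""
    best = min(range(len(train_x)),
               key=lambda i: math.dist(train_x[i], query))
    return train_y[best]


def leave_one_out_accuracy(features, labels, normalize=True):
    """Hold out each sample once, train on the rest, and average the hits."""
    correct = 0
    for held_out in range(len(features)):
        train_x = [f for i, f in enumerate(features) if i != held_out]
        train_y = [y for i, y in enumerate(labels) if i != held_out]
        query = features[held_out]
        if normalize:
            bounds = min_max_fit(train_x)  # fitted on the training fold only
            train_x = [min_max_apply(r, bounds) for r in train_x]
            query = min_max_apply(query, bounds)
        if nearest_neighbor_predict(train_x, train_y, query) == labels[held_out]:
            correct += 1
    return correct / len(features)


# Toy data: feature 1 separates the classes, feature 2 is large-scale noise.
features = [[0.1, 100], [0.2, 900], [0.15, 500],
            [0.9, 120], [0.8, 880], [0.85, 520]]
labels = ["ADC", "ADC", "ADC", "SqCC", "SqCC", "SqCC"]
acc_normalized = leave_one_out_accuracy(features, labels, normalize=True)
acc_raw = leave_one_out_accuracy(features, labels, normalize=False)
```

    In this toy data the second feature dominates raw distances, so the normalized run scores higher than the raw run; real texture features may behave differently.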