Scopus-Indexed Publications Collection

Permanent URI for this collection: https://hdl.handle.net/20.500.12573/395

  • Conference Object
    Citation - Scopus: 1
    The Identification of Discriminative Single Nucleotide Polymorphism Sets for the Classification of Behçet's Disease
    (Institute of Electrical and Electronics Engineers Inc., 2018-09) Görmez, Yasin; Işik, Yunus Emre; Bakir-Güngör, Burcu
    Behçet's disease is a long-term multisystem inflammatory disorder characterized by recurrent attacks affecting several organs. As genotyping individuals has become cheaper and easier with developments in genomic technologies, genome-wide association studies (GWAS) have emerged. By studying large case-control groups for a specific disease, these studies identify potential genetic variations, namely single nucleotide polymorphisms (SNPs). Although several genetic risk factors have been identified for Behçet's disease by scanning around a million SNPs, these variations explain only up to 20% of the disease's genetic risk. In this study, for Behçet's disease classification, all the SNPs genotyped in GWAS are compared with SNPs selected using genetic knowledge, gain ratio, and information gain, with the aim of both reducing the feature size and improving the classification accuracy. The effects of different classification algorithms, such as random forest, k-nearest neighbour, and logistic regression, on the classification accuracy are also investigated. Our results showed that, compared to the other feature selection methods, selecting SNPs using genetic information (their GWAS p-values, which indicate the significance of the SNP against the disease) achieves a success rate of at least 81% and provides a 15% to 42% improvement in all classification algorithms. This improvement is statistically sound. While the gain ratio and information gain feature selection techniques yield similar classification accuracies, the models using all SNPs could not exceed 50% accuracy and result in the worst performance. © 2019 Elsevier B.V., All rights reserved.
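
The abstract above ranks SNPs by information gain and gain ratio. As a minimal sketch of how those two scores are commonly computed for a discrete feature (this is not the authors' code; the genotype coding, the toy data, and the names `snp_a`, `snp_b`, and `status` are hypothetical):

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting the samples by feature value."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0


# Hypothetical toy data: genotypes coded as 0, 1, or 2 copies of the minor allele.
snp_a = [0, 0, 1, 1, 2, 2, 0, 2]
snp_b = [0, 1, 0, 1, 0, 1, 0, 1]
status = [0, 0, 0, 0, 1, 1, 0, 1]  # 1 = case, 0 = control

scores = {
    "snp_a": information_gain(snp_a, status),
    "snp_b": information_gain(snp_b, status),
}
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy example `snp_a` separates cases from controls perfectly, so it ranks above `snp_b`; a real pipeline would keep the top-ranked SNPs as classifier inputs.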
  • Article
    Citation - Scopus: 6
    Network Intrusion Detection Based on Machine Learning Strategies: Performance Comparisons on Imbalanced Wired, Wireless, and Software-Defined Networking (SDN) Network Traffics
    (Turkiye Klinikleri, 2024-07-26) Hacilar, Hilal; Aydin, Zafer; Güngör, Vehbi Çağrı
    The rapid growth of computer networks emphasizes the urgency of addressing security issues. Organizations rely on network intrusion detection systems (NIDSs) to protect sensitive data from unauthorized access and theft. These systems analyze network traffic to detect suspicious activities, such as attempted breaches or cyberattacks. However, existing studies lack a thorough assessment of class imbalances and classification performance for different types of network intrusions: wired, wireless, and software-defined networking (SDN). This research aims to fill this gap by examining these networks’ imbalances, feature selection, and binary classification to enhance intrusion detection system efficiency. Various techniques such as SMOTE, ROS, ADASYN, and SMOTETomek are used to handle imbalanced datasets. Additionally, eXtreme Gradient Boosting (XGBoost) identifies key features, and an autoencoder (AE) assists in feature extraction for the classification task. The study evaluates datasets such as AWID, UNSW, and InSDN, yielding the best results with different numbers of selected features. Bayesian optimization fine-tunes parameters, and diverse machine learning algorithms (SVM, kNN, XGBoost, random forest, ensemble classifiers, and autoencoders) are employed. The optimal results, considering F1-measure, overall accuracy, detection rate, and false alarm rate, have been achieved for the UNSW-NB15, preprocessed AWID, and InSDN datasets, with values of [0.9356, 0.9289, 0.9328, 0.07597], [0.997, 0.9995, 0.9999, 0.0171], and [0.9998, 0.9996, 0.9998, 0.0012], respectively. These findings demonstrate that combining Bayesian optimization with oversampling techniques significantly enhances classification performance across wired, wireless, and SDN networks when compared to previous research conducted on these datasets. © 2024 Elsevier B.V., All rights reserved.
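
The four figures reported in the abstract above (F1-measure, overall accuracy, detection rate, false alarm rate) can all be derived from a binary confusion matrix. The sketch below assumes the usual intrusion-detection conventions, where detection rate is the recall on the attack class and false alarm rate is the false positive rate; the function name `ids_metrics` and the example counts are hypothetical and do not come from the paper.

```python
def ids_metrics(tp, fp, tn, fn):
    """Compute common intrusion-detection metrics from confusion-matrix counts.

    tp/fn count attack samples (detected / missed);
    tn/fp count benign samples (passed / falsely flagged).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0      # recall on attacks
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0    # false positive rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = precision + detection_rate
    f1 = 2 * precision * detection_rate / denom if denom else 0.0
    return {
        "f1": f1,
        "accuracy": accuracy,
        "detection_rate": detection_rate,
        "false_alarm_rate": false_alarm_rate,
    }


# Hypothetical counts for an imbalanced test set (100 attacks, 900 benign flows).
metrics = ids_metrics(tp=90, fp=10, tn=890, fn=10)
```

On imbalanced traffic, accuracy alone can look high even when many attacks are missed, which is why the detection rate and false alarm rate are reported alongside it.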
  • Conference Object
    Citation - Scopus: 5
    Identifying Taxonomic Biomarkers of Colorectal Cancer in Human Intestinal Microbiota Using Multiple Feature Selection Methods
    (Institute of Electrical and Electronics Engineers Inc., 2022-09-07) Jabeer, Amhar; Kocak, Aysegul; Akkaş, Huseyin; Yenisert, Ferhan; Nalbantoğlu, Özkan Ufuk; Yousef, Malik; Bakir-Güngör, Burcu
    A variety of bacterial species, collectively called the gut microbiota, work together to maintain a stable intestinal environment. The gastrointestinal tract contains a tremendous number of different species, including archaea, bacteria, fungi, and viruses. While these organisms are crucial immune system stabilizers, dysbiosis of the intestinal flora has been related to gastrointestinal disorders including colorectal cancer (CRC), intestinal cancer, irritable bowel syndrome, and inflammatory bowel disease. In the last decade, next-generation sequencing (NGS) methods have accelerated the identification of human gut flora. CRC is a deadly condition that has been on the rise in the last century, affecting half a million people each year. Since early CRC diagnosis is critical for effective treatment, there is an immediate need for a classification system that can expedite CRC diagnosis. In this study, by analyzing the available metagenomics data on CRC, we aim to facilitate CRC diagnosis by finding biomarkers linked with CRC and by building a classification model. We obtained the metagenomic sequencing data of healthy individuals and CRC patients from a metagenome-wide association analysis and classified these data according to the disease stages. The Conditional Mutual Information Maximization (CMIM), Fast Correlation Based Filter (FCBF), Extreme Gradient Boosting (XGBoost), min redundancy max relevance (mRMR), Information Gain (IG), and Select K Best (SKB) feature selection algorithms were used to cope with the complexity of the features. We observed that the SKB, IG, and XGBoost techniques contributed significantly to reducing the number of microbiota features used for CRC diagnosis, thereby reducing cost and time. We found that our Random Forest classifier outperformed the Adaboost, Support Vector Machine, Decision Tree, Logitboost, and stacking ensemble classifiers in terms of CRC classification performance. Our results reiterated some known and some potential microbiome-associated mechanisms in CRC, which could aid the design of new diagnostics based on the microbiome. © 2022 Elsevier B.V., All rights reserved.
  • Article
    Citation - Scopus: 8
    Building a Challenging Medical Dataset for Comparative Evaluation of Classifier Capabilities
    (Elsevier Ltd, 2024-08) Bozkurt, Berat; Coskun, Kerem; Bakal, Gokhan
    Since the 2000s, digitalization has been a crucial transformation in our lives. Nevertheless, digitalization brings a bulk of unstructured textual data to be processed, including articles, clinical records, web pages, and shared social media posts. As a critical analysis, the classification task classifies the given textual entities into correct categories. Categorizing documents from different domains is straightforward since the instances are unlikely to contain similar contexts. However, document classification in a single domain is more complicated due to sharing the same context. Thus, we aim to classify medical articles about four common cancer types (Leukemia, Non-Hodgkin Lymphoma, Bladder Cancer, and Thyroid Cancer) by constructing machine learning and deep learning models. We used 383,914 medical articles about four common cancer types collected by the PubMed API. To build classification models, we split the dataset into 70% as training, 20% as testing, and 10% as validation. We built widely used machine-learning (Logistic Regression, XGBoost, CatBoost, and Random Forest Classifiers) and modern deep-learning (convolutional neural networks - CNN, long short-term memory - LSTM, and gated recurrent unit - GRU) models. We computed the average classification performances (precision, recall, F-score) to evaluate the models over ten distinct dataset splits. The best-performing deep learning model(s) yielded a superior F1 score of 98%. However, traditional machine learning models also achieved reasonably high F1 scores, 95% for the worst-performing case. Ultimately, we constructed multiple models to classify articles, which compose a hard-to-classify dataset in the medical domain. © 2024 Elsevier B.V., All rights reserved.
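
The evaluation protocol in the abstract above (a 70/20/10 training/testing/validation split, repeated over ten distinct splits) can be sketched as follows. This is an illustration under assumptions rather than the authors' pipeline: the function name `split_70_20_10`, the use of integer seeds to obtain distinct splits, and the stand-in document IDs are all hypothetical.

```python
import random


def split_70_20_10(items, seed):
    """Shuffle items reproducibly and split them into train/test/validation parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 7 // 10
    n_test = len(shuffled) * 2 // 10
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # remainder, roughly 10%
    return train, test, validation


# Stand-in document IDs; a real run would use the collected article records.
documents = list(range(100))

# Ten distinct splits, one per seed; metrics would be averaged across them.
splits = [split_70_20_10(documents, seed) for seed in range(10)]
train, test, validation = splits[0]
```

Averaging precision, recall, and F-score over the ten splits reduces the chance that a single favorable partition drives the reported result.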
  • Conference Object
    Benchmarking CNN Architectures for Eye Disease Detection With Transfer Learning Techniques
    (Institute of Electrical and Electronics Engineers Inc., 2025-06-27) Keles, Tolgahan; Aykanat, Muhammet Ali; Kurban, Rifat
    In this study, convolutional neural networks (CNN)-based approaches were compared to classify eye diseases using transfer learning techniques. A series of data augmentation strategies, including random rotation, shifting, shearing, zooming, and horizontal flipping, were applied to increase the training data's robustness and diversity. Several state-of-the-art CNNs, including ResNet50, VGG19, EfficientNetB0, Xception, InceptionV3, DenseNet121, MobileNetV2, NASNetMobile, and ConvNeXtBase, were fine-tuned through transfer learning. During training, models were evaluated based on their accuracy, training time, and validation performance, while early stopping mechanisms were employed to prevent overfitting. Experimental results demonstrated that DenseNet121 achieved the highest validation accuracy (72%) during the training phase and the best test set performance with an accuracy of 68% and an AUC-ROC of 0.93. MobileNetV2, on the other hand, provided a strong balance between classification accuracy (65%) and low inference time (7.28 ms), making it appropriate for real-time uses. The findings highlight the importance of selecting appropriate architectures by considering both predictive performance and computational efficiency, particularly in the context of medical imaging, where real-world deployment constraints are critical. © 2025 Elsevier B.V., All rights reserved.
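
The abstract above mentions early stopping to prevent overfitting but does not give the rule used. A common patience-based variant is sketched below as an assumption: training stops once the validation loss has failed to improve for `patience` consecutive epochs. The function name, the patience value, and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch) for a patience-based early-stopping rule.

    Training stops at the first epoch where the validation loss has not
    improved on its best value for `patience` consecutive epochs. If that
    never happens, the last epoch is returned as the stop epoch.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses per epoch (0-indexed).
losses = [0.90, 0.70, 0.60, 0.65, 0.62, 0.61, 0.70]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
```

In practice the model weights from `best_epoch` are usually restored, so the extra epochs spent waiting out the patience window do not degrade the final model.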
  • Article
    Citation - Scopus: 15
    An Effective Colorectal Polyp Classification for Histopathological Images Based on Supervised Contrastive Learning
    (Elsevier Ltd, 2024-04) Yengec-Tasdemir, Sena Busra; Aydin, Zafer; Akay, Ebru; Doğan, Serkan; Yilmaz, Bulent
    Early detection of colon adenomatous polyps is pivotal in reducing colon cancer risk. In this context, accurately distinguishing between adenomatous polyp subtypes, especially tubular and tubulovillous, from hyperplastic variants is crucial. This study introduces a cutting-edge computer-aided diagnosis system optimized for this task. Our system employs advanced Supervised Contrastive learning to ensure precise classification of colon histopathology images. Significantly, we have integrated the Big Transfer model, which has gained prominence for its exemplary adaptability to visual tasks in medical imaging. Our novel approach discerns between in-class and out-of-class images, thereby elevating its discriminatory power for polyp subtypes. We validated our system using two datasets: a specially curated one and the publicly accessible UniToPatho dataset. The results reveal that our model markedly surpasses traditional deep convolutional neural networks, registering classification accuracies of 87.1% and 70.3% for the custom and UniToPatho datasets, respectively. Such results emphasize the transformative potential of our model in polyp classification endeavors. © 2024 Elsevier B.V., All rights reserved.
  • Conference Object
    Citation - Scopus: 19
    A Novel Feature Design and Stacking Approach for Non-Technical Electricity Loss Detection
    (Institute of Electrical and Electronics Engineers Inc., 2018-05) Aydin, Zafer; Güngör, Vehbi Çağrı
    Non-technical electricity losses continue to jeopardize the economic and social well-being of many countries. In this work, we develop machine learning classifiers that can identify anomalous electricity consumption in Turkey. Starting from weekly electricity usage data, we develop new features that capture statistical and frequency domain characteristics of the customers and their consumption patterns. We analyze the effect of reducing the number of feature descriptors through dimensionality reduction and feature selection techniques. To overcome the class imbalance problem, we implement several ensemble methods and compare their prediction accuracy to that of the standard classifiers. The proposed features, together with combining the strengths of different classifiers, bring significant improvements in performance metrics, which is demonstrated through detailed simulations on the shopping mall sector. We anticipate that advances in this field will contribute considerably to the economies. © 2018 Elsevier B.V., All rights reserved.
  • Conference Object
    Citation - Scopus: 3
    A Hybrid Adaptive Neuro-Fuzzy Inference System (ANFIS) Approach for Professional Bloggers Classification
    (Institute of Electrical and Electronics Engineers Inc., 2019-11) Asim, Yousra; Raza, Basit; Malik, Ahmad Kamran; Shahid, Ahmad Raza; Faheem, Muhammed Yasir; Kumar, Y. J.
    Despite their small numbers, some users of online social networks demonstrate the ability to influence others. Bloggers are one such kind of user: through their ideas and opinions on different topics, they influence other users. Their identification may be beneficial for several purposes, such as online marketing for products. Much effort has been expended towards finding the impact of such bloggers within the blogging community. We have expanded on this work by identifying influential bloggers using labeled data. We have improved upon the accuracy of the classification of professional and nonprofessional bloggers. We have made use of the Adaptive Neuro-Fuzzy Inference System (ANFIS) and Fuzzy Inference System (FIS) models. Their performance has been gauged and compared with existing techniques and approaches, such as an Artificial Neural Network (ANN), the Alternating Decision Tree (ADTree) algorithm, and the Classification Based on Associations (CBA) algorithm. The adaptive techniques (ANFIS and ANN) are found to be better than the aforementioned rule-based classifiers. The FIS model outperformed the CBA algorithm but showed similar performance to the ADTree algorithm. Our proposed ANFIS model showed improved results in terms of performance measures, with 93% accuracy for blogger classification. © 2020 Elsevier B.V., All rights reserved.
  • Conference Object
    Citation - WoS: 3
    Citation - Scopus: 13
    NSEM: Novel Stacked Ensemble Method for Sentiment Analysis (NSEM: Duygu Analizi için Özgün Yığınlanmış Topluluk Yöntemi)
    (Institute of Electrical and Electronics Engineers Inc., 2018-09) Işik, Yunus Emre; Görmez, Yasin; Kaynar, Oğuz; Aydin, Zafer
    Today, people often share their ideas, opinions, and feelings through forums, social media sites, blogs, and similar platforms. For this reason, access to these data has become very easy. The increase in the number of shared posts makes it possible to analyze and use these data for marketing and politics. However, due to the large volume of data, it is infeasible for humans to perform this analysis. Sentiment analysis methods automatically determine what type of emotion a text contains. In these methods, the text is represented as a mathematical vector and classified by machine learning methods. Ensemble methods are among the most important methods used as classifiers in sentiment analysis. In these methods, the errors of one classifier are corrected by another classifier. In sentiment analysis, the feature vector that describes the text is as important as the classifier. Feature vectors obtained using different methods can make mistakes in different places. For this reason, this study proposes NSEM for sentiment analysis, a new ensemble method that uses 2 different classifiers and 2 different feature extraction methods. According to the analysis, the proposed method is the most successful method, with an accuracy rate of 79.1%. © 2019 Elsevier B.V., All rights reserved.
  • Conference Object
    Population Specific Classification of Colorectal Cancer With Meta-Analysis of Metagenomic Data
    (Institute of Electrical and Electronics Engineers Inc., 2023-10-11) Temiz, Mustafa; Yousef, Malik; Bakir-Güngör, Burcu
    Advances in next-generation sequencing and '-omics' technologies make it possible to characterize the human gut microbiome. While some of these microorganisms are important regulators of our immune system, modulation of the microbiota leads to a variety of diseases. Colorectal cancer (CRC), the third most common cancer worldwide, is caused by genetic mutations, environmental conditions, and abnormalities in the gut microbiota. Using various machine learning methods and meta-analysis techniques, this study aims to build a classification model that can help in CRC diagnosis by analyzing metagenomic datasets of different populations obtained at the species level. Using 9 different metagenomic datasets from 8 different countries, 3 different meta-analyses are performed: within-population, cross-population, and leave-one-dataset-out (LODO), in which one population is selected for testing and the rest are used as the training dataset. For CRC classification, 4 different classification algorithms (Random Forest (RF), Logitboost, Adaboost, and Decision Tree (DT)) are used. The best performance among these methods was obtained with the Random Forest algorithm with an AUC of 0.98 by using JP for the training data set and JPN populations for the test data set in the cross-population performance evaluation. © 2023 Elsevier B.V., All rights reserved.
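
The LODO evaluation described in the abstract above holds out one dataset for testing and trains on the rest, repeating this for every dataset. A minimal sketch of generating those splits follows; the function name, the cohort labels, and the sample IDs are hypothetical placeholders, not the datasets used in the study.

```python
def leave_one_dataset_out(datasets):
    """Yield (held_out_name, train_samples, test_samples) for each dataset.

    `datasets` maps a dataset name to its list of samples. Each dataset is
    used once as the test set while all remaining datasets are pooled for
    training.
    """
    for held_out in datasets:
        train = [
            sample
            for name, samples in datasets.items()
            if name != held_out
            for sample in samples
        ]
        yield held_out, train, list(datasets[held_out])


# Hypothetical cohorts keyed by placeholder labels, with stand-in sample IDs.
cohorts = {
    "cohort_a": ["a1", "a2", "a3"],
    "cohort_b": ["b1", "b2"],
    "cohort_c": ["c1", "c2", "c3", "c4"],
}

folds = list(leave_one_dataset_out(cohorts))
```

Because the test population never contributes training samples, each fold estimates how well a model transfers to a population it has not seen.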