Scopus Indexed Publications Collection

Permanent URI for this collection: https://hdl.handle.net/20.500.12573/395

Search Results

Now showing 1 - 10 of 10

Citation - Scopus: 1
The Identification of Discriminative Single Nucleotide Polymorphism Sets for the Classification of Behçet's Disease
(Institute of Electrical and Electronics Engineers Inc., 2018-09) Görmez, Yasin; Işik, Yunus Emre; Bakir-Güngör, Burcu
Behçet's disease is a long-term multisystem inflammatory disorder characterized by recurrent attacks affecting several organs. As genotyping individuals has become cheaper and easier following developments in genomic technologies, genome-wide association studies (GWAS) have emerged. By studying large case-control groups for a specific disease, these studies identify potential genetic variations, namely single nucleotide polymorphisms (SNPs). Although several genetic risk factors have been identified for Behçet's disease with the help of these studies by scanning around a million SNPs, these variations could explain only up to 20% of the disease's genetic risk. For Behçet's disease classification, this study compares all the SNPs genotyped in GWAS with the SNPs selected using genetic knowledge, gain ratio, and information gain, aiming at both a reduction in the feature size and an improvement in the classification accuracy. The effects of different classification algorithms, such as random forest, k-nearest neighbour, and logistic regression, on the classification accuracy are also investigated. Our results showed that, compared to other feature selection methods, selecting the SNPs using genetic information (their GWAS p-values, which indicate the significance of the SNP for the disease) achieves a success rate of at least 81% and provides a 15% to 42% improvement in all classification algorithms. This improvement is statistically sound. While the gain ratio and information gain feature selection techniques yield similar classification accuracies, the models using all SNPs could not exceed 50% accuracy and result in the worst performance. © 2019 Elsevier B.V., All rights reserved.
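The abstract above names information gain and gain ratio as feature selection criteria. As a minimal sketch of how these two scores are commonly defined for a discrete feature (not the authors' implementation), the following uses only the Python standard library. The genotype coding (0/1/2), the toy data, and the names `snp_a` and `snp_b` are hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in label entropy obtained by splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0


# Hypothetical toy data: genotypes coded 0/1/2, labels 1 = case, 0 = control.
labels = [1, 1, 1, 0, 0, 0]
snp_a = [2, 2, 2, 0, 0, 0]  # separates cases from controls perfectly
snp_b = [0, 1, 0, 1, 0, 1]  # carries little information about the label

# Rank the SNPs by score, highest first; a selector would keep the top k.
ranking = sorted(
    {"snp_a": snp_a, "snp_b": snp_b}.items(),
    key=lambda item: information_gain(item[1], labels),
    reverse=True,
)
```

Ranking features by either score and keeping only the top-scoring ones is one common way such criteria are used to shrink a feature set before training a classifier.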
Citation - Scopus: 6
Network Intrusion Detection Based on Machine Learning Strategies: Performance Comparisons on Imbalanced Wired, Wireless, and Software-Defined Networking (SDN) Network Traffics
(Turkiye Klinikleri, 2024-07-26) Hacilar, Hilal; Aydin, Zafer; Güngör, Vehbi Çağrı
The rapid growth of computer networks emphasizes the urgency of addressing security issues. Organizations rely on network intrusion detection systems (NIDSs) to protect sensitive data from unauthorized access and theft. These systems analyze network traffic to detect suspicious activities, such as attempted breaches or cyberattacks. However, existing studies lack a thorough assessment of class imbalances and classification performance for different types of network intrusions: wired, wireless, and software-defined networking (SDN). This research aims to fill this gap by examining these networks’ imbalances, feature selection, and binary classification to enhance intrusion detection system efficiency. Various techniques such as SMOTE, ROS, ADASYN, and SMOTETomek are used to handle imbalanced datasets. Additionally, eXtreme Gradient Boosting (XGBoost) identifies key features, and an autoencoder (AE) assists in feature extraction for the classification task. The study evaluates datasets such as AWID, UNSW, and InSDN, yielding the best results with different numbers of selected features. Bayesian optimization fine-tunes parameters, and diverse machine learning algorithms (SVM, kNN, XGBoost, random forest, ensemble classifiers, and autoencoders) are employed. The optimal results, considering F1-measure, overall accuracy, detection rate, and false alarm rate, have been achieved for the UNSW-NB15, preprocessed AWID, and InSDN datasets, with values of [0.9356, 0.9289, 0.9328, 0.07597], [0.997, 0.9995, 0.9999, 0.0171], and [0.9998, 0.9996, 0.9998, 0.0012], respectively. These findings demonstrate that combining Bayesian optimization with oversampling techniques significantly enhances classification performance across wired, wireless, and SDN networks when compared to previous research conducted on these datasets. © 2024 Elsevier B.V., All rights reserved.
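The abstract above reports F1-measure, overall accuracy, detection rate, and false alarm rate. The sketch below shows the standard confusion-matrix definitions of these four quantities for binary classification; it assumes that detection rate means recall on the attack class and that false alarm rate means the false positive rate, which is the usual convention but is not confirmed by the abstract. The function name and the example counts are hypothetical.

```python
def intrusion_metrics(tp, fp, tn, fn):
    """Compute common binary intrusion-detection metrics from confusion-matrix counts.

    tp: attacks flagged as attacks      fp: normal traffic flagged as attacks
    tn: normal traffic passed as normal fn: attacks passed as normal
    """
    detection_rate = tp / (tp + fn)        # recall on the attack class (assumed meaning)
    false_alarm_rate = fp / (fp + tn)      # false positive rate (assumed meaning)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * detection_rate / (precision + detection_rate)
    return {
        "f1": f1,
        "accuracy": accuracy,
        "detection_rate": detection_rate,
        "false_alarm_rate": false_alarm_rate,
    }


# Hypothetical counts, not taken from any of the datasets named above.
example = intrusion_metrics(tp=90, fp=5, tn=95, fn=10)
```

On imbalanced traffic, overall accuracy can stay high even when few attacks are caught, which is one reason detection rate and false alarm rate are usually reported alongside it.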
Citation - Scopus: 5
Identifying Taxonomic Biomarkers of Colorectal Cancer in Human Intestinal Microbiota Using Multiple Feature Selection Methods
(Institute of Electrical and Electronics Engineers Inc., 2022-09-07) Jabeer, Amhar; Kocak, Aysegul; Akkaş, Huseyin; Yenisert, Ferhan; Nalbantoğlu, Özkan Ufuk; Yousef, Malik; Bakir-Güngör, Burcu
A variety of bacterial species called gut microbiota work together to maintain a steady intestinal environment. The gastrointestinal tract contains a tremendous number of different species including archaea, bacteria, fungi, and viruses. While these organisms are crucial immune system stabilizers, dysbiosis of the intestinal flora has been related to gastrointestinal disorders including colorectal cancer (CRC), intestinal cancer, irritable bowel syndrome, and inflammatory bowel disease. In the last decade, next-generation sequencing (NGS) methods have accelerated the identification of human gut flora. CRC is a deadly condition that has been on the rise in the last century, affecting half a million people each year. Since early CRC diagnosis is critical for effective treatment, there is an immediate requirement for a classification system that can expedite CRC diagnosis. In this study, by analyzing the available metagenomics data on CRC, we aim to facilitate CRC diagnosis by finding biomarkers linked with CRC and by building a classification model. We obtained the metagenomic sequencing data of healthy individuals and CRC patients from a metagenome-wide association analysis and classified these data according to the disease stages. The Conditional Mutual Information Maximization (CMIM), Fast Correlation Based Filter (FCBF), Extreme Gradient Boosting (XGBoost), min redundancy max relevance (mRMR), Information Gain (IG), and Select K Best (SKB) feature selection algorithms were utilized to cope with the complexity of the features. We observed that the SKB, IG, and XGBoost techniques contributed significantly to reducing the microbiota used for CRC diagnosis, thereby reducing cost and time. We found that our Random Forest classifier outperformed the Adaboost, Support Vector Machine, Decision Tree, Logitboost, and stacking ensemble classifiers in terms of CRC classification performance. Our results reiterated some known and some potential microbiome-associated mechanisms in CRC, which could aid the design of new diagnostics based on the microbiome. © 2022 Elsevier B.V., All rights reserved.
Citation - Scopus: 10
Building a Challenging Medical Dataset for Comparative Evaluation of Classifier Capabilities
(Elsevier Ltd, 2024-08) Bozkurt, Berat; Coskun, Kerem; Bakal, Gokhan
Since the 2000s, digitalization has been a crucial transformation in our lives. Nevertheless, digitalization brings a bulk of unstructured textual data to be processed, including articles, clinical records, web pages, and shared social media posts. As a critical analysis, the classification task classifies the given textual entities into correct categories. Categorizing documents from different domains is straightforward since the instances are unlikely to contain similar contexts. However, document classification in a single domain is more complicated due to sharing the same context. Thus, we aim to classify medical articles about four common cancer types (Leukemia, Non-Hodgkin Lymphoma, Bladder Cancer, and Thyroid Cancer) by constructing machine learning and deep learning models. We used 383,914 medical articles about four common cancer types collected by the PubMed API. To build classification models, we split the dataset into 70% as training, 20% as testing, and 10% as validation. We built widely used machine-learning (Logistic Regression, XGBoost, CatBoost, and Random Forest Classifiers) and modern deep-learning (convolutional neural networks - CNN, long short-term memory - LSTM, and gated recurrent unit - GRU) models. We computed the average classification performances (precision, recall, F-score) to evaluate the models over ten distinct dataset splits. The best-performing deep learning model(s) yielded a superior F1 score of 98%. However, traditional machine learning models also achieved reasonably high F1 scores, 95% for the worst-performing case. Ultimately, we constructed multiple models to classify articles, which compose a hard-to-classify dataset in the medical domain. © 2024 Elsevier B.V., All rights reserved.
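The abstract above describes a 70% / 20% / 10% training, testing, and validation split, repeated over ten distinct dataset splits. A minimal sketch of one way to produce such splits with the Python standard library follows. The rounding rule (validation takes the remainder), the use of seeds 0 to 9, and the placeholder `article_ids` list are assumptions for illustration, not details from the paper.

```python
import random


def split_sizes(n, train_pct=70, test_pct=20):
    """Return (train, test, validation) sizes; validation takes the remainder."""
    n_train = n * train_pct // 100
    n_test = n * test_pct // 100
    return n_train, n_test, n - n_train - n_test


def split_dataset(items, seed, train_pct=70, test_pct=20):
    """Shuffle a copy of the items with a fixed seed and cut it into three parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train, n_test, _ = split_sizes(len(shuffled), train_pct, test_pct)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_test],
        shuffled[n_train + n_test:],
    )


# Hypothetical stand-in for article identifiers; the real dataset is far larger.
article_ids = list(range(1000))

# Ten distinct splits, one per seed, so that metrics can be averaged over them.
splits = [split_dataset(article_ids, seed) for seed in range(10)]
```

Fixing the seed per split keeps each partition reproducible, while varying it across runs gives the distinct splits over which average precision, recall, and F-score can be computed.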
Benchmarking CNN Architectures for Eye Disease Detection With Transfer Learning Techniques
(Institute of Electrical and Electronics Engineers Inc., 2025-06-27) Keles, Tolgahan; Aykanat, Muhammet Ali; Kurban, Rifat
In this study, convolutional neural networks (CNN)-based approaches were compared to classify eye diseases using transfer learning techniques. A series of data augmentation strategies, including random rotation, shifting, shearing, zooming, and horizontal flipping, were applied to increase the training data's robustness and diversity. Several state-of-the-art CNNs, including ResNet50, VGG19, EfficientNetB0, Xception, InceptionV3, DenseNet121, MobileNetV2, NASNetMobile, and ConvNeXtBase, were fine-tuned through transfer learning. During training, models were evaluated based on their accuracy, training time, and validation performance, while early stopping mechanisms were employed to prevent overfitting. Experimental results demonstrated that DenseNet121 achieved the highest validation accuracy (72%) during the training phase and the best test set performance with an accuracy of 68% and an AUC-ROC of 0.93. MobileNetV2, on the other hand, provided a strong balance between classification accuracy (65%) and low inference time (7.28 ms), making it appropriate for real-time uses. The findings highlight the importance of selecting appropriate architectures by considering both predictive performance and computational efficiency, particularly in the context of medical imaging, where real-world deployment constraints are critical. © 2025 Elsevier B.V., All rights reserved.
Citation - Scopus: 15
An Effective Colorectal Polyp Classification for Histopathological Images Based on Supervised Contrastive Learning
(Elsevier Ltd, 2024-04) Yengec-Tasdemir, Sena Busra; Aydin, Zafer; Akay, Ebru; Doğan, Serkan; Yilmaz, Bulent
Early detection of colon adenomatous polyps is pivotal in reducing colon cancer risk. In this context, accurately distinguishing between adenomatous polyp subtypes, especially tubular and tubulovillous, from hyperplastic variants is crucial. This study introduces a cutting-edge computer-aided diagnosis system optimized for this task. Our system employs advanced Supervised Contrastive learning to ensure precise classification of colon histopathology images. Significantly, we have integrated the Big Transfer model, which has gained prominence for its exemplary adaptability to visual tasks in medical imaging. Our novel approach discerns between in-class and out-of-class images, thereby elevating its discriminatory power for polyp subtypes. We validated our system using two datasets: a specially curated one and the publicly accessible UniToPatho dataset. The results reveal that our model markedly surpasses traditional deep convolutional neural networks, registering classification accuracies of 87.1% and 70.3% for the custom and UniToPatho datasets, respectively. Such results emphasize the transformative potential of our model in polyp classification endeavors. © 2024 Elsevier B.V., All rights reserved.
Citation - Scopus: 19
A Novel Feature Design and Stacking Approach for Non-Technical Electricity Loss Detection
(Institute of Electrical and Electronics Engineers Inc., 2018-05) Aydin, Zafer; Güngör, Vehbi Çağrı
Non-technical electricity losses continue to jeopardize the economic and social well-being of many countries. In this work, we develop machine learning classifiers that can identify anomalous electricity consumption in Turkey. Starting from weekly electricity usage data, we develop new features that capture statistical and frequency domain characteristics of the customers and their consumption patterns. We analyze the effect of reducing the number of feature descriptors through dimensionality reduction and feature selection techniques. To overcome the class imbalance problem, we implement several ensemble methods and compare their prediction accuracy to that of the standard classifiers. The proposed features and the combined strengths of different classifiers bring significant improvements in performance metrics, which is demonstrated through detailed simulations on the shopping mall sector. We anticipate that advances in this field will contribute considerably to the economies. © 2018 Elsevier B.V., All rights reserved.
Citation - Scopus: 3
A Hybrid Adaptive Neuro-Fuzzy Inference System (ANFIS) Approach for Professional Bloggers Classification
(Institute of Electrical and Electronics Engineers Inc., 2019-11) Asim, Yousra; Raza, Basit; Malik, Ahmad Kamran; Shahid, Ahmad Raza; Faheem, Muhammed Yasir; Kumar, Y. J.
Despite their small numbers, some users of online social networks demonstrate the ability to influence others. Bloggers are one such kind of user: through their ideas and opinions on different topics, they influence other users. Their identification may be beneficial for several purposes, such as online marketing for products. Much effort has been expended towards finding the impact of such bloggers within the blogging community. We have expanded on this work by identifying influential bloggers using labeled data. We have improved upon the accuracy of the classification of professional and nonprofessional bloggers. We have made use of the Adaptive Neuro-Fuzzy Inference System (ANFIS) and the Fuzzy Inference System (FIS) models. Their performance has been gauged and compared with existing techniques and approaches, such as an Artificial Neural Network (ANN), the Alternating Decision Tree (ADTree) algorithm, and the Classification Based on Associations (CBA) algorithm. Adaptive techniques (ANFIS and ANN) are found to be better than the aforementioned rule-based classifiers. The FIS model outperformed the CBA algorithm, but showed similar performance to the ADTree algorithm. Our proposed ANFIS model showed improved results in terms of performance measures, with 93% accuracy for blogger classification. © 2020 Elsevier B.V., All rights reserved.
Evaluation of Hybrid Classification Approaches: Case Studies on Credit Datasets
(Springer Verlag, 2018) Cetiner, Erkan; Güngör, Vehbi Çağrı; Kocak, Taskin
Hybrid classification approaches in the credit domain are widely used to obtain valuable information about customer behaviours. Single classification algorithms such as neural networks, support vector machines, and regression analysis have been used for years in this area. In this paper, we propose hybrid classification approaches, which combine several classifiers and ensemble learners to boost the accuracy of classification results. We worked with two credit datasets: the German dataset, which is public, and a Turkish Corporate Bank dataset. The goal of using such diverse datasets is to examine the generalization ability of the proposed model. Results show that feature selection plays a vital role in classification accuracy, that hybrid approaches built with ensemble learners outperform single classification techniques, and that hybrid approaches that include SVM have better accuracy than other hybrid approaches. © 2018 Elsevier B.V., All rights reserved.
Citation - Scopus: 5
Emotion Detection Using Multivariate Synchrosqueezing Transform via 2D Circumplex Model
(Institute of Electrical and Electronics Engineers Inc., 2018) Özel, Pınar; Akan, Aydin; Yilmaz, Bulent
Emotion detection using signal processing methods is a challenging area. An open issue in emotional modeling is obtaining an optimum feature set to use for the classification process. This study proposes an approach for emotional state classification through the investigation of EEG signals via the multivariate synchrosqueezing transform (MSST). MSST is a post-processing technique that composes a localized time-frequency representation, yielding multivariate synchrosqueezing coefficients. After obtaining these coefficients from the EEG signals of 18 subjects from the DEAP dataset, the coefficients and the self-assessment-mannequins (SAM) labels of those subjects are used for emotional state classification using support vector machines (SVM), nearest neighbor, decision tree, and ensemble methods. The accuracy rate is 70.6% for high valence high arousal (HVHA), 75.4% for low valence high arousal (LVHA), 77.8% for high valence low arousal (HVLA), and 77.2% for low valence low arousal (LVLA) cases using SVM. © 2019 Elsevier B.V., All rights reserved.