Scopus İndeksli Yayınlar Koleksiyonu
Permanent URI for this collectionhttps://hdl.handle.net/20.500.12573/395
Browse
5 results
Search Results
Conference Object The Effect of Different Classifiers on Recursive Cluster Elimination in the Analysis of Transcriptomic Data(Institute of Electrical and Electronics Engineers Inc., 2023-10-11) Bulut, Nurten; Bakir-Güngör, Burcu; Qaqish, Bahjat F.; Yousef, MalikGene expression data with limited sample size and a large number of genes are frequently encountered in genetic studies. In such high-dimensional data, identification of genes that distinguish between disease states is a challenging task. Feature selection (FS) is a useful approach in dealing with high dimensionality. Support Vector Machines Recursive Cluster Elimination (SVM-RCE) is a technique for FS in high-dimensional data. The SVM-RCE approach has been utilized for identification of clusters of genes whose expression levels correlate with pathological state. A key step in SVM-RCE is the use of an SVM classifier to assign an area under the curve (AUC) score to each gene cluster based on its ability to predict class labels. In this study, we investigate the use of alternative classifiers in the cluster-scoring step. Specifically, we compare Support Vector Machines, Random Forest, XgBoost, Naive Bayes, and linear logistic regression. In addition to AUC score performance evaluation, the algorithms are compared in terms of the number of selected genes at different levels of clustering and in terms of the running time. © 2023 Elsevier B.V., All rights reserved.Article Citation - Scopus: 8Building a Challenging Medical Dataset for Comparative Evaluation of Classifier Capabilities(Elsevier Ltd, 2024-08) Bozkurt, Berat; Coskun, Kerem; Bakal, GokhanSince the 2000s, digitalization has been a crucial transformation in our lives. Nevertheless, digitalization brings a bulk of unstructured textual data to be processed, including articles, clinical records, web pages, and shared social media posts. As a critical analysis, the classification task classifies the given textual entities into correct categories. Categorizing documents from different domains is straightforward since the instances are unlikely to contain similar contexts. However, document classification in a single domain is more complicated due to sharing the same context. Thus, we aim to classify medical articles about four common cancer types (Leukemia, Non-Hodgkin Lymphoma, Bladder Cancer, and Thyroid Cancer) by constructing machine learning and deep learning models. We used 383,914 medical articles about four common cancer types collected by the PubMed API. To build classification models, we split the dataset into 70% as training, 20% as testing, and 10% as validation. We built widely used machine-learning (Logistic Regression, XGBoost, CatBoost, and Random Forest Classifiers) and modern deep-learning (convolutional neural networks - CNN, long short-term memory - LSTM, and gated recurrent unit - GRU) models. We computed the average classification performances (precision, recall, F-score) to evaluate the models over ten distinct dataset splits. The best-performing deep learning model(s) yielded a superior F1 score of 98%. However, traditional machine learning models also achieved reasonably high F1 scores, 95% for the worst-performing case. Ultimately, we constructed multiple models to classify articles, which compose a hard-to-classify dataset in the medical domain. © 2024 Elsevier B.V., All rights reserved.Conference Object Citation - Scopus: 21Assessing Employee Attrition Using Classifications Algorithms(Association for Computing Machinery, 2020-05-15) Ozdemir, Fatma; Cos¸kun, Mustafa; Gezer, Cengiz; Güngör, Vehbi Çağrı; Coskun, Mustafa; Cagri Gungor, V.Employees leave an organization when other organizations offer better opportunities than their current organizations. Continuity and sustenance and even completion of jobs are crucial issues for the companies not to suffer financial losses. Especially if the talented employees, who are at critical positions in the companies, leave the job, it becomes difficult for the organizations to maintain their businesses. Today, organizations would like to predict attrition of their employees and plan and prepare for it. However, the HR departments of organizations are not advanced enough to make such predictions in a handcrafted manner. For this reason, organizations are looking for new systems or methods that automatize the prediction of employee attrition utilizing data mining methods. In this study, we use IBM HR data set and apply different classification methods, such as Support Vector Machine (SVM), Random Forest, J48, LogitBoost, Multilayer Perceptron (MLP), K-Nearest Neighbors (KNN), Linear Discriminant Analysis (LDA), Naive Bayes, Bagging, AdaBoost, Logistic Regression, to predict the employee attrition. Different from exiting studies, we systematically evaluate our findings with various classification metrics, such as F-measure, Area Under Curve, accuracy, sensitivity, and specificity. We observe that data mining methods can be useful for predicting the employee attrition. © 2022 Elsevier B.V., All rights reserved.Article Citation - WoS: 25Citation - Scopus: 41An Efficient Network Intrusion Detection Approach Based on Logistic Regression Model and Parallel Artificial Bee Colony Algorithm(Elsevier, 2024-04) Kolukisa, Burak; Dedeturk, Bilge Kagan; Hacilar, Hilal; Gungor, Vehbi CagriIn recent years, the widespread use of the Internet has created many issues, especially in the area of cybersecurity. It is critical to detect intrusions in network traffic, and researchers have developed network intrusion and anomaly detection systems to cope with high numbers of attacks and attack variations. In particular, machine learning and meta-heuristic methods have been widely used for network intrusion detection systems (NIDS). However, existing studies on these systems usually suffer from low performance results such as accuracy, F1-measure, false positive rate, and false negative rate, and generally do not use automatic parameter tuning techniques. To address these challenges, this study proposes a novel approach based on a logistic regression model trained using a parallel artificial bee colony (LR-ABC) algorithm with a hyper-parameter optimization technique. The performance of the proposed model is evaluated against state -of-the-art machine learning and deep learning models on two publicly available NIDS datasets. Comparative performance evaluations show that the proposed method achieved satisfactory results with accuracy of 88.25% on the UNSW-NB15 dataset and 90.11% on the NSL-KDD dataset, and F1-measures of 88.26% and 90.15%, respectively. These findings demonstrate the efficacy of the proposed LR-ABC model in enhancing the accuracy and reliability, while providing a scalable solution to adapt to the dynamic and evolving landscape of cybersecurity threats.Conference Object Citation - Scopus: 4A Transfer Learning Application on the Reliability of Psychological Drugs' Comments(Institute of Electrical and Electronics Engineers Inc., 2023-07-25) Sen, Tarik Uveys; Bakal, GokhanAs digitalization and the Internet stay emerging concepts by gaining popularity, the accuracy of personal reviews/opinions will be a critical issue. This circumstance also particularly applies to patients taking psychological drugs, where accurate information is crucial for other patients and medical professionals. In this study, we analyze drug reviews from drugs.com to determine the effectiveness of reviews for psychological drugs. Our dataset includes over 200,000 drug reviews, which we labeled as positive, negative, or neutral according to their rating scores. We apply machine learning (ML) models, including Logistic Regression, Recurrent Neural Network (RNN), and Long Short-Term Memory (LSTM) algorithms, to predict the sentiment class of each review. Our results demonstrate an F1-Weighted score of 85.3% for the LSTM model. However, by applying the transfer learning technique, we further improved the F1 score (nearly 3% increase) obtained by the LSTM model. Our findings proved that there is no contextual difference between the comments made by the patients suffering from psychological or other diseases. © 2023 Elsevier B.V., All rights reserved.
