WoS-Indexed Publications Collection
Permanent URI for this collection: https://hdl.handle.net/20.500.12573/394
3 results
Search Results
Conference Object
Citation - Scopus: 1
NLP-Driven Fake News Detection: A Machine Learning Perspective
(IEEE, 2025-05-23) Coban, Mert Korkut; Bakal, Gokhan
The rapid spread of fake news poses a significant challenge, impacting public opinion, decision-making, and societal trust. This study explores the application of Natural Language Processing (NLP) and Machine Learning (ML) techniques for robust fake news detection. Using datasets such as ISOT Fake News, WELFake, and Football Fake News, the project employs advanced preprocessing methods and feature extraction techniques, including TF-IDF, Word2Vec, and GloVe. A comprehensive evaluation of machine learning models, namely Random Forest, Support Vector Machines (SVM), and Neural Networks, was conducted to identify the optimal configuration. Results demonstrate that Random Forest with TF-IDF excels in in-domain detection, achieving an F1-score of 99.70%, while Neural Networks paired with Word2Vec and GloVe embeddings outperform in cross-dataset scenarios. The study highlights the importance of dataset size, domain relevance, and feature representation in achieving high generalizability. These findings provide a scalable framework for combating misinformation on digital platforms.

Article
Citation - WoS: 2
Machine Learning Based Network Intrusion Detection With Hybrid Frequent Item Set Mining
(Gazi Univ, 2024-10-02) Firat, Murat; Bakal, Gokhan; Akbas, Ayhan; Bakal, Mehmet
With the development and expansion of computer networks day by day and the diversity of software developed, the damage that possible attacks can cause is increasing beyond the predictions. Intrusion Detection Systems (STS/IDS) are one of the practical defense tools against these potential attacks that are constantly growing and diversifying. Thus, one of the emerging methods among researchers is to train these systems with various artificial intelligence methods to detect subsequent attacks in real time and take the necessary precautions. However, the ultimate goal is to propose a hybrid feature selection approach to improve the classification performance. The raw dataset originally enclosed 85 descriptor features (attributes) for classification. These attributes are extracted using CICFlowMeter from a PCAP file where network traffic is recorded for data curation. In this study, classical feature selection methods and frequent item set mining approaches were employed in feature selection for constructing a hybrid model. We aimed to examine the effect of the proposed hybrid feature selection approach on the classification task for the network traffic data containing ordinary and attack records. The outcomes demonstrate that the proposed method gained nearly 3% improvement when applied with the Logistic Regression algorithm on classifying more than 225,000 records.

Article
Citation - WoS: 16
Citation - Scopus: 11
An Empirical Study of Sentiment Analysis Utilizing Machine Learning and Deep Learning Algorithms
(Springernature, 2023-12-09) Erkantarci, Betul; Bakal, Gokhan
Among text-mining studies, one of the most studied topics is the text classification task applied in various domains, including medicine, social media, and academia. As a sub-problem in text classification, sentiment analysis has been widely investigated to classify often opinion-based textual elements. Specifically, user reviews and experiential feedback for products or services have been employed as fundamental data sources for sentiment analysis efforts. As a result of rapidly emerging technological advancements, social media platforms such as Twitter, Facebook, and Reddit, have become central opinion-sharing mediums since the early 2000s. In this sense, we build various machine-learning models to solve the sentiment analysis problem on the Reddit comments dataset in this work. The experimental models we constructed achieve F1 scores within intervals of 73-76%. Consequently, we present comparative performance scores obtained by traditional machine learning and deep learning models and discuss the results.
