Search Results

Text Classification Experiments on Contextual Graphs Built by N-Gram Series
(Springer International Publishing AG, 2025) Sen, Tarik Uveys; Yakit, Mehmet Can; Gumus, Mehmet Semih; Abar, Orhan; Bakal, Gokhan
Traditional n-gram textual features, commonly employed in conventional machine learning models, perform worse on high-volume datasets than modern deep learning algorithms, which have been studied intensively over the past decade. The main reason for this disparity is that deep learning approaches represent text in a word vector space, which captures contextually hidden information more effectively. Nonetheless, the potential of the n-gram feature set to reflect context is open to further investigation. In this sense, creating graphs from discriminative n-gram series with high classification power has not yet been fully exploited by researchers. Hence, the main goal of this study is to improve classification power by including the long-range neighborhood relationships of each word in the word embedding representations. To achieve this goal, we used n-gram series to transform the textual data into a graph structure and then trained a graph convolutional network model. Consequently, we obtained contextually enriched word embeddings and observed an F1-score improvement from 0.78 to 0.80 when we integrated those convolution-based word embeddings into an LSTM model. This research contributes to improving classification capabilities by leveraging graph structures derived from discriminative n-gram series.
Citation - Scopus: 1
NLP-Driven Fake News Detection: A Machine Learning Perspective
(IEEE, 2025-05-23) Coban, Mert Korkut; Bakal, Gokhan
The rapid spread of fake news poses a significant challenge, impacting public opinion, decision-making, and societal trust. This study explores the application of Natural Language Processing (NLP) and Machine Learning (ML) techniques for robust fake news detection. Using the ISOT Fake News, WELFake, and Football Fake News datasets, the project employs advanced preprocessing methods and feature extraction techniques, including TF-IDF, Word2Vec, and GloVe. A comprehensive evaluation of machine learning models, namely Random Forest, Support Vector Machines (SVM), and Neural Networks, was conducted to identify the optimal configuration. Results demonstrate that Random Forest with TF-IDF excels in in-domain detection, achieving an F1-score of 99.70%, while Neural Networks paired with Word2Vec and GloVe embeddings perform best in cross-dataset scenarios. The study highlights the importance of dataset size, domain relevance, and feature representation in achieving high generalizability. These findings provide a scalable framework for combating misinformation on digital platforms.
Graph-Based Biomedical Knowledge Discovery
(IEEE, 2024-05-15) Altuner, Osman; Bakir-Gungor, Burcu; Bakal, Gokhan
Digitalization is progressing at a very high speed all over the world. While this provides many conveniences in daily life, it also raises the problem of analyzing and processing huge volumes of digital data. This also applies to published academic studies. Evaluating each study individually to access previously unknown information is a very laborious process. For this reason, in this study, the publications obtained for the target diseases were processed with text analysis and converted into a graph structure that links meaningful terms through biomedical relationships. On the resulting dense graph, pairs of biomedical entities connected by important links such as treats, causes, and associated_with were queried. The entity pairs returned by the queries were also confirmed by manual search and shown to be real connections. In this study, retrieving known biomedical entities with the proposed approach addressed the time-consuming manual search problem. There is also the potential to obtain unknown or unexplored new relationships (e.g., therapeutic or causal) with multiple binary linking patterns.

WoS Indexed Publications Collection
