Research Outputs | TR-Dizin | WoS | Scopus | PubMed
Permanent URI for this community: https://hdl.handle.net/20.500.12573/393
Browsing Research Outputs | TR-Dizin | WoS | Scopus | PubMed by Author "Abar, Orhan"
Now showing 1 - 3 of 3

Article
Citation - WoS: 4; Citation - Scopus: 4
Combining N-Grams and Graph Convolution for Text Classification (Elsevier, 2025)
Sen, Tarik Uveys; Yakit, Mehmet Can; Gumus, Mehmet Semih; Abar, Orhan; Bakal, Gokhan
01. Abdullah Gül University; 02. 04. Computer Engineering; 02. Faculty of Engineering
Text classification, a cornerstone of natural language processing (NLP), finds applications in diverse areas, from sentiment analysis to topic categorization. While deep learning models have recently dominated the field, traditional n-gram-driven approaches often struggle to achieve comparable performance, particularly on large datasets. This gap largely stems from deep learning's superior ability to capture contextual information through word embeddings. This paper explores a novel approach to leverage the often-overlooked power of n-gram features for enriching word representations and boosting text classification accuracy. We propose a method that transforms textual data into graph structures, utilizing discriminative n-gram series to establish long-range relationships between words. By training a graph convolution network on these graphs, we derive contextually enhanced word embeddings that encapsulate dependencies extending beyond local contexts. Our experiments demonstrate that integrating these enriched embeddings into a long short-term memory (LSTM) model for text classification leads to improvements of around 2% in classification performance across diverse datasets. This achievement highlights the synergy of combining traditional n-gram features with graph-based deep learning techniques for building more powerful text classifiers.

Conference Object
Citation - Scopus: 8
On Comparative Classification of Relevant COVID-19 Tweets (Institute of Electrical and Electronics Engineers Inc., 2021)
Bakal, Gokhan; Abar, Orhan
01. Abdullah Gül University; 02. 04. Computer Engineering; 02. Faculty of Engineering
Due to the impressive information dissemination power of social networks such as Twitter, people tend to check social networks and Web pages more than traditional news sources, including newspapers, TV news programs, or radio channels. In that sense, the information carried by shared social media posts becomes much more significant. However, most posts are either irrelevant or inaccurate. Moreover, even more critical than the correctness of the information is the speed at which it diffuses on Twitter through reply and retweet actions. These activities complicate the situation further because social networks are unregulated and lack an immediate mechanism for verifying the correctness of posts. During the current COVID-19 pandemic (caused by the coronavirus disease), Twitter is one of the most utilized information resources apart from the official health administration institutions. Consequently, examining the correctness of information related to the COVID-19 pandemic with computational techniques (e.g., Data Mining, Machine Learning, and Deep Learning) has been gaining popularity and remains a substantial task. Hence, we mainly focused on analyzing the correctness of the posts related to the current pandemic shared on the Twitter platform. The overall goal of this work is to classify the relevant tweets using linear and non-linear machine learning models. Among all model configurations, we achieved the best F1 score (99%) with the neural network model using unigram features and a threshold value of 50. © 2022 Elsevier B.V., All rights reserved.

Conference Object
Text Classification Experiments on Contextual Graphs Built by N-Gram Series (Springer International Publishing AG, 2025)
Sen, Tarik Uveys; Yakit, Mehmet Can; Gumus, Mehmet Semih; Abar, Orhan; Bakal, Gokhan
01. Abdullah Gül University; 02. 04. Computer Engineering; 02. Faculty of Engineering
Traditional n-gram textual features, commonly employed in conventional machine learning models, offer lower performance on high-volume datasets than modern deep learning algorithms, which have been intensively studied for the past decade. The main reason for this performance disparity is that deep learning approaches handle textual data through a word vector space representation that better captures contextually hidden information. Nonetheless, the potential of the n-gram feature set to reflect context is open to further investigation. In this sense, creating graphs using discriminative n-gram series with high classification power has never been fully exploited by researchers. Hence, the main goal of this study is to contribute to classification power by including the long-range neighborhood relationships for each word in the word embedding representations. To achieve this goal, we transformed the textual data into a graph structure by employing n-gram series and then trained a graph convolution network model. Consequently, we obtained contextually enriched word embeddings and observed an F1-score improvement from 0.78 to 0.80 when we integrated those convolution-based word embeddings into an LSTM model. This research contributes to improving classification capabilities by leveraging graph structures derived from discriminative n-gram series.
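
The abstract above describes turning text into a graph via n-grams and then applying graph convolution, but gives no construction details. The following is only a minimal sketch of the general idea under stated assumptions, not the authors' method: it assumes that words sharing a selected n-gram are linked by an edge, that the set of "discriminative" n-grams is already given (`selected_ngrams` is a hypothetical input), and it performs a single untrained, weight-free propagation step of the standard form H' = D^-1/2 (A + I) D^-1/2 H. All function names are illustrative.

```python
from collections import defaultdict
from itertools import combinations
import math


def extract_ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_word_graph(documents, selected_ngrams, n=2):
    """Link words that share a selected n-gram (assumed construction).

    Edge keys are alphabetically ordered word pairs; the value counts
    how often the pair occurred together in a selected n-gram.
    """
    edges = defaultdict(int)
    for doc in documents:
        for gram in extract_ngrams(doc.lower().split(), n):
            if gram in selected_ngrams:
                for u, v in combinations(sorted(set(gram)), 2):
                    edges[(u, v)] += 1
    return dict(edges)


def gcn_propagate(edges, features):
    """One untrained propagation step: H' = D^-1/2 (A + I) D^-1/2 H."""
    nodes = sorted(features)
    adj = {u: {u: 1.0} for u in nodes}  # self-loops (the "+ I" term)
    for (u, v), weight in edges.items():
        if u in adj and v in adj:  # ignore edges without node features
            adj[u][v] = adj[v][u] = float(weight)
    degree = {u: sum(adj[u].values()) for u in nodes}
    dim = len(next(iter(features.values())))
    out = {}
    for u in nodes:
        vec = [0.0] * dim
        for v, weight in adj[u].items():
            norm = weight / math.sqrt(degree[u] * degree[v])
            for k in range(dim):
                vec[k] += norm * features[v][k]
        out[u] = vec
    return out


# Toy usage: the documents and the selected bigrams are made up.
docs = ["the movie was not good", "the plot was not bad"]
selected = {("not", "good"), ("not", "bad")}
edges = build_word_graph(docs, selected)
words = sorted({w for pair in edges for w in pair})
one_hot = {w: [1.0 if i == j else 0.0 for j in range(len(words))]
           for i, w in enumerate(words)}
enriched = gcn_propagate(edges, one_hot)
```

In this toy run, "good" and "bad" never share an n-gram, yet both receive a contribution from "not" after one step; stacking further steps would let information travel across longer paths. In a full pipeline the propagated vectors would pass through learned weights and a nonlinearity before being handed to a downstream classifier such as an LSTM.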
