Text Classification Experiments on Contextual Graphs Built by N-Gram Series


Date

2025

Publisher

Springer International Publishing AG

Green Open Access

No

Publicly Funded

No

Abstract

Traditional n-gram textual features, commonly used in conventional machine learning models, perform worse on high-volume datasets than modern deep learning algorithms, which have been studied intensively over the past decade. The main reason for this disparity is that deep learning approaches represent text in a word vector space, which captures contextually hidden information more effectively. Nonetheless, the potential of n-gram feature sets to reflect context remains open to further investigation. In particular, building graphs from discriminative n-gram series with high classification power has not been fully explored. The main goal of this study is therefore to improve classification power by incorporating the long-range neighborhood relationships of each word into the word embedding representations. To this end, we transformed the textual data into a graph structure using n-gram series and then trained a graph convolution network model. As a result, we obtained contextually enriched word embeddings and observed the F1-score improve from 0.78 to 0.80 when we integrated those convolution-based word embeddings into an LSTM model. This research contributes to improving classification capabilities by leveraging graph structures derived from discriminative n-gram series.
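
The text-to-graph step and the graph convolution described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the n-gram series are selected or how edges are defined, so everything here is an assumption made for illustration: the function names `build_ngram_graph` and `gcn_layer` are hypothetical, words are linked whenever they share an n-gram window, the input features are random stand-ins for pretrained embeddings, and the layer uses the common symmetric-normalization form of graph convolution with untrained weights. It is not the authors' implementation.

```python
import numpy as np

def build_ngram_graph(docs, n=3):
    """Link words that co-occur inside the same n-gram window (assumed edge rule)."""
    vocab = sorted({w for doc in docs for w in doc.split()})
    index = {w: i for i, w in enumerate(vocab)}
    adj = np.zeros((len(vocab), len(vocab)))
    for doc in docs:
        tokens = doc.split()
        # Slide an n-token window over the document; short documents get one window.
        for start in range(max(len(tokens) - n + 1, 1)):
            window = tokens[start:start + n]
            for a in window:
                for b in window:
                    if a != b:
                        adj[index[a], index[b]] = 1.0
    return vocab, adj

def gcn_layer(adj, features, weights):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    deg_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weights, 0.0)

# Toy corpus, purely illustrative.
docs = ["the movie was great", "the plot was weak"]
vocab, adj = build_ngram_graph(docs, n=3)

rng = np.random.default_rng(0)
features = rng.normal(size=(len(vocab), 8))   # stand-in for pretrained word embeddings
weights = rng.normal(size=(8, 4))             # random, untrained layer weights
embeddings = gcn_layer(adj, features, weights)
```

Each row of `embeddings` mixes a word's own features with those of its graph neighbors. In a pipeline like the one the abstract outlines, such rows would be the candidates for initializing the embedding layer of an LSTM classifier, and stacking several layers would let information travel across longer paths in the graph.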

Description

Bakal, Mehmet/0000-0003-2897-3894

Keywords

Text-Graph Transformation, Graph Convolution Network, Deep Learning, Text Mining

WoS Q

N/A

Scopus Q

Q4

Source

Communications in Computer and Information Science

Volume

2303

Start Page

312

End Page

326