Combining N-Grams and Graph Convolution for Text Classification

dc.contributor.author Sen, Tarik Uveys
dc.contributor.author Yakit, Mehmet Can
dc.contributor.author Gumus, Mehmet Semih
dc.contributor.author Abar, Orhan
dc.contributor.author Bakal, Gokhan
dc.date.accessioned 2025-09-25T10:42:45Z
dc.date.available 2025-09-25T10:42:45Z
dc.date.issued 2025
dc.description Bakal, Mehmet/0000-0003-2897-3894 en_US
dc.description.abstract Text classification, a cornerstone of natural language processing (NLP), finds applications in diverse areas, from sentiment analysis to topic categorization. While deep learning models have recently dominated the field, traditional n-gram-driven approaches often struggle to achieve comparable performance, particularly on large datasets. This gap largely stems from deep learning's superior ability to capture contextual information through word embeddings. This paper explores a novel approach that leverages the often-overlooked power of n-gram features to enrich word representations and boost text classification accuracy. We propose a method that transforms textual data into graph structures, utilizing discriminative n-gram series to establish long-range relationships between words. By training a graph convolution network on these graphs, we derive contextually enhanced word embeddings that encapsulate dependencies extending beyond local contexts. Our experiments demonstrate that integrating these enriched embeddings into a long short-term memory (LSTM) model for text classification improves classification performance by around 2% across diverse datasets. This achievement highlights the synergy of combining traditional n-gram features with graph-based deep learning techniques for building more powerful text classifiers. en_US
dc.description.sponsorship Scientific and Technological Research Council of Turkiye (TUBITAK) [122E103] en_US
dc.description.sponsorship This research is funded by the Scientific and Technological Research Council of Turkiye (TUBITAK) through 3501 Career Development Program with grant number 122E103. The authors also express their gratitude to Google Cloud Services for providing academic credit support that facilitated portions of this work. en_US
dc.identifier.doi 10.1016/j.asoc.2025.113092
dc.identifier.issn 1568-4946
dc.identifier.issn 1872-9681
dc.identifier.scopus 2-s2.0-105001822010
dc.identifier.uri https://doi.org/10.1016/j.asoc.2025.113092
dc.identifier.uri https://hdl.handle.net/20.500.12573/3479
dc.language.iso en en_US
dc.publisher Elsevier en_US
dc.relation.ispartof Applied Soft Computing en_US
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.subject Text-Graph Transformation en_US
dc.subject Graph Convolution Network en_US
dc.subject Deep Learning en_US
dc.subject Text Mining en_US
dc.subject Graph Mining en_US
dc.title Combining N-Grams and Graph Convolution for Text Classification en_US
dc.type Article en_US
dspace.entity.type Publication
gdc.author.id Bakal, Mehmet/0000-0003-2897-3894
gdc.author.scopusid 58572643100
gdc.author.scopusid 59702905000
gdc.author.scopusid 59703094600
gdc.author.scopusid 57192980580
gdc.author.scopusid 57074041500
gdc.author.wosid Abar, Orhan/Lrt-9029-2024
gdc.author.wosid Bakal, Mehmet Gokhan/Aat-2797-2020
gdc.bip.impulseclass C4
gdc.bip.influenceclass C5
gdc.bip.popularityclass C4
gdc.coar.access metadata only access
gdc.coar.type text::journal::journal article
gdc.collaboration.industrial false
gdc.description.department Abdullah Gül University en_US
gdc.description.departmenttemp [Sen, Tarik Uveys; Yakit, Mehmet Can; Bakal, Gokhan] Abdullah Gul Univ, Dept Comp Engn, Erkilet Blvd Sumer Campus, TR-38080 Kayseri, Turkiye; [Gumus, Mehmet Semih; Abar, Orhan] Osmaniye Korkut Ata Univ, Dept Comp Engn, Karacaoglan Campus, TR-80000 Osmaniye, Turkiye en_US
gdc.description.publicationcategory Article - International Refereed Journal - Institutional Faculty Member en_US
gdc.description.scopusquality Q1
gdc.description.volume 175 en_US
gdc.description.woscitationindex Science Citation Index Expanded
gdc.description.wosquality Q1
gdc.identifier.openalex W4409202140
gdc.identifier.wos WOS:001464755900001
gdc.index.type WoS
gdc.index.type Scopus
gdc.oaire.diamondjournal false
gdc.oaire.impulse 5.0
gdc.oaire.influence 2.7670268E-9
gdc.oaire.isgreen false
gdc.oaire.popularity 6.558155E-9
gdc.oaire.publicfunded false
gdc.openalex.collaboration National
gdc.openalex.fwci 14.4213
gdc.openalex.normalizedpercentile 0.99
gdc.openalex.toppercent TOP 10%
gdc.opencitations.count 3
gdc.plumx.crossrefcites 4
gdc.plumx.mendeley 12
gdc.plumx.scopuscites 4
gdc.scopus.citedcount 4
gdc.virtual.author Bakal, Mehmet Gökhan
gdc.wos.citedcount 3
relation.isAuthorOfPublication 53ed538c-20d9-45c8-af59-7fa4d1b90cf7
relation.isAuthorOfPublication.latestForDiscovery 53ed538c-20d9-45c8-af59-7fa4d1b90cf7
relation.isOrgUnitOfPublication 665d3039-05f8-4a25-9a3c-b9550bffecef
relation.isOrgUnitOfPublication 52f507ab-f278-4a1f-824c-44da2a86bd51
relation.isOrgUnitOfPublication ef13a800-4c99-4124-81e0-3e25b33c0c2b
relation.isOrgUnitOfPublication.latestForDiscovery 665d3039-05f8-4a25-9a3c-b9550bffecef
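
Illustrative sketch (not part of the record, and not the authors' implementation): the abstract describes turning text into a word graph through n-grams and then applying graph convolution to obtain enriched word representations. The Python below is a minimal, dependency-free sketch of that general idea under stated assumptions. It links words that share any n-gram window, whereas the paper selects discriminative n-gram series, which is not modeled here. The function names `build_ngram_graph`, `matmul`, and `gcn_layer` are hypothetical, and the propagation rule is the commonly used symmetric-normalized form with self-loops, which may differ from the one used in the paper.

```python
from math import sqrt


def build_ngram_graph(docs, n=3):
    """Build an undirected word graph: two words are linked if they
    appear together in at least one n-gram window (assumption: every
    n-gram is used; no discriminative selection is performed)."""
    vocab = sorted({word for doc in docs for word in doc})
    index = {word: i for i, word in enumerate(vocab)}
    size = len(vocab)
    adj = [[0.0] * size for _ in range(size)]
    for doc in docs:
        # Documents shorter than n contribute a single, shorter window.
        for start in range(max(len(doc) - n + 1, 1)):
            window = doc[start:start + n]
            for a in window:
                for b in window:
                    if a != b:
                        adj[index[a]][index[b]] = 1.0
    return vocab, adj


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    size = len(adj)
    # Add self-loops so each word keeps part of its own representation.
    a_hat = [
        [adj[i][j] + (1.0 if i == j else 0.0) for j in range(size)]
        for i in range(size)
    ]
    deg = [sum(row) for row in a_hat]
    norm = [
        [a_hat[i][j] / sqrt(deg[i] * deg[j]) for j in range(size)]
        for i in range(size)
    ]
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, value) for value in row] for row in out]


if __name__ == "__main__":
    docs = [["graph", "convolution", "enriches", "word", "embeddings"]]
    vocab, adj = build_ngram_graph(docs, n=2)
    identity = [[1.0 if i == j else 0.0 for j in range(len(vocab))]
                for i in range(len(vocab))]
    # One-hot input features and an identity weight matrix (placeholders
    # for learned parameters) expose the normalized neighborhood mixing.
    for word, row in zip(vocab, gcn_layer(adj, identity, identity)):
        print(word, [round(value, 3) for value in row])
```

In a full pipeline, the rows returned by `gcn_layer` (with learned weights) would serve as the word embeddings passed to a downstream classifier such as an LSTM; that training step is outside the scope of this sketch.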
