Document Classification with Contextually Enriched Word Embeddings

Mahmood, Raad Saadi; Bakal, Gokhan; Akbas, Ayhan

Date

2024

Authors

Mahmood, Raad Saadi

Bakal, Gokhan

Akbas, Ayhan

Publisher

Bajece (İstanbul Teknik Ünv)

Abstract

Text classification has a wide range of application domains, such as the classification of articles, social media posts, and sentiments. As it is a natural language processing task, machine learning and deep learning techniques are used intensively to address it. One common approach employs discriminative word features, such as Bag-of-Words and n-grams, to conduct text classification experiments. Another powerful approach exploits neural network-based models (specifically deep learning models) at the sentence, word, or character level. In this study, we propose a novel approach to classifying documents with contextually enriched word embeddings, in which each word's embedding is enriched by its neighboring words as captured by word trigrams. In the experiments, the well-known Web of Science dataset is used to demonstrate the novelty of the models. To that end, we built various models with and without the proposed approach and compared their performance. The experimental results show that the proposed neighborhood-based word embedding enrichment has decent potential for use in further studies.
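
The abstract does not say how neighboring words are combined with a word's own embedding, so the following is only a minimal sketch of one plausible reading: each token's vector is averaged with the vectors of its left and right neighbors, that is, the trigram centered on that token. The function name `enrich_with_trigram_neighbors`, the averaging rule, the zero vector for out-of-vocabulary words, and the toy embeddings are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def enrich_with_trigram_neighbors(
    tokens: Sequence[str],
    embeddings: Dict[str, Vector],
    dim: int,
) -> List[Vector]:
    """Return one enriched vector per token.

    Assumption: a token's vector is averaged with the vectors of its
    immediate left and right neighbors (the trigram centered on it).
    The abstract does not state the actual combination rule.
    """
    zero = [0.0] * dim  # assumption: unknown words map to a zero vector
    enriched = []
    for i in range(len(tokens)):
        # Trigram window centered on position i, clipped at document edges.
        window = tokens[max(0, i - 1): i + 2]
        vectors = [embeddings.get(word, zero) for word in window]
        # Element-wise mean of the vectors in the window.
        enriched.append(
            [sum(component) / len(vectors) for component in zip(*vectors)]
        )
    return enriched


# Hypothetical 2-dimensional embeddings, for illustration only.
toy_embeddings = {
    "deep": [1.0, 0.0],
    "learning": [0.0, 1.0],
    "models": [1.0, 1.0],
}
doc = ["deep", "learning", "models"]
enriched_doc = enrich_with_trigram_neighbors(doc, toy_embeddings, dim=2)
```

In a pipeline like the one the abstract outlines, the enriched vectors would stand in for the plain word embeddings as classifier input, which allows models with and without the enrichment to be compared on the same data.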

Keywords

Text classification, Deep Learning, LSTM, Word2Vec, N-grams

Volume

12

Issue

1

Start Page

90

End Page

97

URI

https://doi.org/10.17694/bajece.1366812
https://hdl.handle.net/20.500.12573/2435

Collections

Computer Engineering Department Collection
TR-Dizin Indexed Publications Collection
