Document Classification With Contextually Enriched Word Embeddings

Akbaş, Ayhan; Mahmood, Raad; Bakal, Mehmet

doi:10.17694/bajece.1366812

Date

2024-03-01

Authors

Akbaş, Ayhan

Mahmood, Raad

Bakal, Mehmet

Open Access Color

GOLD

Green Open Access

Yes

Publicly Funded

No

Impulse

Average

Influence

Average

Popularity

Average

Abstract

Text classification has a wide range of application domains and purposes, such as the classification of articles, social media posts, and sentiments. Machine learning and deep learning techniques are used intensively for this natural language processing task. One common approach employs discriminative word features, comprising Bag-of-Words and n-grams, to conduct text classification experiments. The other powerful approach exploits neural network-based models (specifically deep learning models) at the sentence, word, or character level. In this study, we propose a novel approach that classifies documents with contextually enriched word embeddings, where each word embedding is enriched by the neighboring words accessible through word trigrams. In the experiments, a well-known Web of Science dataset is used to demonstrate the novelty of the models. We built various models with and without the proposed approach to compare their performance. The experiments showed that the proposed neighborhood-based word embedding enrichment has decent potential for use in further studies.
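The abstract does not spell out how the neighbor words are combined with each word's embedding, so the following is only a minimal sketch of one plausible reading of trigram-based enrichment: each token's vector is blended with the mean of its immediate left and right neighbors. The function name `enrich_with_trigram_context`, the `center_weight` parameter, and the toy vectors are all hypothetical and are not taken from the paper.

```python
import numpy as np

def enrich_with_trigram_context(tokens, embeddings, dim, center_weight=0.5):
    """Blend each token's vector with the mean of its trigram neighbors.

    Assumption: enrichment is a weighted average. The paper may instead
    concatenate vectors or use another combination scheme.
    """
    if not tokens:
        return np.zeros((0, dim))
    zero = np.zeros(dim)
    # Out-of-vocabulary tokens fall back to a zero vector.
    vectors = [embeddings.get(t, zero) for t in tokens]
    enriched = []
    for i, center in enumerate(vectors):
        # Previous and next word inside the trigram window, clipped at the edges.
        neighbors = vectors[max(i - 1, 0):i] + vectors[i + 1:i + 2]
        if neighbors:
            context = np.mean(neighbors, axis=0)
            enriched.append(center_weight * center + (1 - center_weight) * context)
        else:
            # A single-token document has no neighbors to draw on.
            enriched.append(center.copy())
    return np.stack(enriched)

# Toy 2-dimensional embeddings, made up for illustration only.
toy = {
    "deep": np.array([1.0, 0.0]),
    "learning": np.array([0.0, 1.0]),
    "models": np.array([1.0, 1.0]),
}
result = enrich_with_trigram_context(["deep", "learning", "models"], toy, dim=2)
```

In a full pipeline, the base vectors would presumably come from a pretrained model such as Word2Vec (listed in the keywords), and the enriched sequence would feed a sequence classifier such as an LSTM. Both steps are omitted here.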

ORCID

0000-0002-6425-104X

0000-0003-0879-1989

0000-0003-2897-3894

Keywords

Computer Science, Artificial Intelligence, Deep Learning, Text classification, N-grams, Word2Vec, LSTM

Fields of Science

05 social sciences, 0202 electrical engineering, electronic engineering, information engineering, 0501 psychology and cognitive sciences, 02 engineering and technology

WoS Q

N/A

Scopus Q

N/A

OpenCitations Citation Count

N/A

Source

Balkan Journal of Electrical and Computer Engineering

Volume

12

Issue

1

Start Page

90

End Page

97

URI

https://doi.org/10.17694/bajece.1366812
https://search.trdizin.gov.tr/en/yayin/detay/1254159/document-classification-with-contextually-enriched-word-embeddings
https://hdl.handle.net/20.500.12573/3649
https://search.trdizin.gov.tr/en/yayin/detay/1254159

Collections

TR-Dizin Indexed Publications Collection

PlumX Metrics

Captures

Mendeley Readers : 5

OpenAlex FWCI

0.00
