SEMANT - Feature Group Selection Utilizing FastText-Based Semantic Word Grouping, Scoring, and Modeling Approach for Text Classification

Date

2024

Authors

Voskergian, Daniel
Bakir-Gungor, Burcu
Yousef, Malik

Publisher

Springer International Publishing AG

Green Open Access

No

Publicly Funded

No

Impulse

Average

Influence

Average

Popularity

Average

Abstract

Text classification presents a challenge due to its high-dimensional feature space. As such, devising an effective feature selection scheme is essential. In this study, we present SEMANT, a novel hybrid filter-wrapper feature selection method that utilizes filter-based Chi-Square and the wrapper-based G-S-M approach. SEMANT incorporates fastText neural word embedding similarities to promote greater semantic inclusion in the selection of features for text classification tasks. The performance of the proposed method was investigated on the WOS-5736 and LitCovid datasets and compared with TextNetTopics, a topic modeling-based topic selection algorithm for text classification. Experimental results confirm that the proposed approach outperforms its alternative.
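The hybrid filter-wrapper idea in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation: the toy corpus, the 2-d vectors standing in for fastText embeddings, the greedy grouping rule, the similarity threshold, and the one-feature "stump" used to score groups are all assumptions made for illustration. It only shows the general shape of the pipeline: a Chi-Square filter ranks terms, surviving terms are grouped by embedding similarity, and each group is scored by how well it separates the classes.

```python
import math

# Toy corpus of (tokens, label) pairs; entirely made up for illustration.
DOCS = [
    (["virus", "infection", "vaccine"], 1),
    (["vaccine", "virus", "trial"], 1),
    (["infection", "patient", "trial"], 1),
    (["circuit", "voltage", "trial"], 0),
    (["voltage", "current", "circuit"], 0),
    (["current", "patient", "circuit"], 0),
]

# Hypothetical 2-d vectors standing in for real fastText word embeddings.
VECTORS = {
    "virus": (0.9, 0.1), "infection": (0.8, 0.2), "vaccine": (0.85, 0.15),
    "circuit": (0.1, 0.9), "voltage": (0.2, 0.8), "current": (0.15, 0.85),
}


def chi_square(term, docs):
    """Chi-square statistic of term presence vs. a binary label (2x2 table)."""
    a = sum(1 for toks, y in docs if term in toks and y == 1)
    b = sum(1 for toks, y in docs if term in toks and y == 0)
    c = sum(1 for toks, y in docs if term not in toks and y == 1)
    d = sum(1 for toks, y in docs if term not in toks and y == 0)
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def group_terms(terms, vectors, threshold=0.9):
    """Greedy grouping: each unassigned term seeds a group of similar terms."""
    groups, assigned = [], set()
    for seed in terms:
        if seed in assigned:
            continue
        group = [t for t in terms if t not in assigned
                 and cosine(vectors[seed], vectors[t]) >= threshold]
        assigned.update(group)
        groups.append(group)
    return groups


def stump_accuracy(group, docs):
    """Stand-in for the wrapper step: accuracy of the rule
    'document contains any group word' as a predictor of the label,
    taking whichever polarity of the rule fits better."""
    hits = sum(1 for toks, y in docs
               if any(t in toks for t in group) == (y == 1))
    acc = hits / len(docs)
    return max(acc, 1 - acc)


# Filter step: rank the vocabulary by chi-square and keep the top 6 terms.
vocab = sorted({t for toks, _ in DOCS for t in toks})
ranked = sorted(vocab, key=lambda t: (-chi_square(t, DOCS), t))
kept = ranked[:6]

# Grouping and scoring steps on the surviving terms.
groups = group_terms(kept, VECTORS)
scores = {tuple(g): stump_accuracy(g, DOCS) for g in groups}

for g, s in scores.items():
    print(g, round(s, 2))
```

In a realistic setting the vectors would come from a trained fastText model and the group score from a cross-validated classifier rather than a single-rule stump; both substitutions here are simplifications to keep the sketch dependency-free.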

Description

Voskergian, Daniel/0009-0005-7544-9210; Bakir-Gungor, Burcu/0000-0002-2272-6270; Yousef, Malik/0000-0001-8780-6303

Keywords

Feature Grouping, Feature Group Selection, Hybrid Feature Selection, Machine Learning, Text Classification, Word Embedding, Semantics

WoS Q

N/A

Scopus Q

Q3

OpenCitations Citation Count

N/A

Source

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume

14911

Start Page

69

End Page

75

PlumX Metrics

Citations (Scopus): 1

Captures (Mendeley readers): 1

OpenAlex FWCI
0.0