SEMANT - Feature Group Selection Utilizing FastText-Based Semantic Word Grouping, Scoring, and Modeling Approach for Text Classification

Date
2024
Authors
Voskergian, Daniel
Bakir-Gungor, Burcu
Yousef, Malik
Publisher
Springer International Publishing AG
Green Open Access
No
Publicly Funded
No
Abstract
Text classification is challenging because of its high-dimensional feature space, so an effective feature selection scheme is essential. In this study, we present SEMANT, a novel hybrid filter-wrapper feature selection method that combines the filter-based Chi-Square test with the wrapper-based Grouping, Scoring, and Modeling (G-S-M) approach. SEMANT incorporates fastText neural word embedding similarities to promote greater semantic inclusion in the selection of features for text classification tasks. The performance of the proposed method was investigated on the WOS-5736 and LitCovid datasets and compared with TextNetTopics, a topic modeling-based topic selection algorithm for text classification. Experimental results show that the proposed approach outperforms TextNetTopics.
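The filter and grouping ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy documents, the two-dimensional vectors standing in for fastText embeddings, the similarity threshold, and the helper names `chi_square`, `cosine`, and `group_by_similarity` are all assumptions made for illustration. The wrapper stage, which would score each group by training a classifier on it, is not shown.

```python
import math

def chi_square(docs, labels, term):
    """Chi-square statistic of the 2x2 table: term present/absent vs. class 1/0."""
    n11 = n10 = n01 = n00 = 0
    for words, label in zip(docs, labels):
        present = term in words
        if present and label == 1:
            n11 += 1
        elif present:
            n10 += 1
        elif label == 1:
            n01 += 1
        else:
            n00 += 1
    n = n11 + n10 + n01 + n00
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    return n * (n11 * n00 - n10 * n01) ** 2 / denom if denom else 0.0

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def group_by_similarity(seeds, vectors, threshold):
    """For each seed word, collect the words whose vectors are similar enough."""
    return {
        seed: [w for w in vectors if cosine(vectors[seed], vectors[w]) >= threshold]
        for seed in seeds
    }

# Toy corpus (assumption): each document is a set of words with a binary label.
docs = [{"virus", "vaccine"}, {"virus", "infection"},
        {"market", "stock"}, {"stock", "trade"}]
labels = [1, 1, 0, 0]

# Toy 2-D vectors standing in for fastText word embeddings (assumption).
vectors = {
    "virus": (1.0, 0.1), "vaccine": (0.9, 0.2), "infection": (0.95, 0.05),
    "market": (0.1, 1.0), "stock": (0.05, 0.9), "trade": (0.2, 0.95),
}

# Filter step: rank the vocabulary by chi-square and keep the top two as seeds.
vocabulary = sorted(set().union(*docs))
ranked = sorted(vocabulary, key=lambda t: (-chi_square(docs, labels, t), t))
seeds = ranked[:2]

# Grouping step: build one semantic word group around each seed.
groups = group_by_similarity(seeds, vectors, threshold=0.9)
for seed, members in groups.items():
    print(seed, sorted(members))
```

In a fuller pipeline, each group would then be scored by the performance of a classifier trained only on that group's words, and the best-scoring groups would be accumulated into the final model.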
Description
Voskergian, Daniel/0009-0005-7544-9210; Bakir-Gungor, Burcu/0000-0002-2272-6270; Yousef, Malik/0000-0001-8780-6303
Keywords
Feature Grouping, Feature Group Selection, Hybrid Feature Selection, Machine Learning, Text Classification, Word Embedding, Semantics
WoS Q
N/A
Scopus Q
Q3

OpenCitations Citation Count
N/A
Source
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume
14911
Start Page
69
End Page
75
PlumX Metrics
Citations
Scopus : 1
Captures
Mendeley Readers : 1