eTNT: Enhanced TextNetTopics with Filtered LDA Topics and Sequential Forward / Backward Topic Scoring Approaches

dc.contributor.author Voskergian, Daniel
dc.contributor.author Jayousi, Rashid
dc.contributor.author Bakir-Gungor, Burcu
dc.contributor.authorID 0000-0002-2272-6270 en_US
dc.contributor.department AGÜ, Faculty of Engineering, Department of Computer Engineering en_US
dc.contributor.institutionauthor Bakir-Gungor, Burcu
dc.date.accessioned 2025-04-28T06:48:57Z
dc.date.available 2025-04-28T06:48:57Z
dc.date.issued 2024 en_US
dc.description.abstract TextNetTopics is a novel text classification-based topic modelling approach that focuses on topic selection rather than individual word selection to train a machine learning algorithm. However, one key limitation of TextNetTopics is its scoring component, which evaluates each topic in isolation and ranks the topics accordingly, ignoring potential relationships between them. In addition, the chosen topics may contain redundant or irrelevant features, which can enlarge the feature set and introduce noise that degrades overall model performance. To address these limitations and improve classification performance, this study introduces eTNT, an enhancement to TextNetTopics. eTNT integrates two novel scoring approaches, Sequential Forward Topic Scoring (SFTS) and Sequential Backward Topic Scoring (SBTS), which account for topic interactions by assessing sets of topics simultaneously. Moreover, it incorporates a filtering component that aims to enhance the quality and discriminative power of topics by removing non-informative features from each topic using Random Forest feature importance values. These integrations aim to streamline the topic selection process and enhance classifier efficiency for text classification. Results on the WOS-5736, LitCovid, and MultiLabel datasets indicate that eTNT is more effective than its counterpart, TextNetTopics. en_US
dc.identifier.endpage 1144 en_US
dc.identifier.issn 2158-107X
dc.identifier.issn 2156-5570
dc.identifier.issue 7 en_US
dc.identifier.startpage 1135 en_US
dc.identifier.uri https://doi.org/10.14569/ijacsa.2024.01507110
dc.identifier.uri https://hdl.handle.net/20.500.12573/2511
dc.identifier.volume 15 en_US
dc.language.iso eng en_US
dc.publisher SCIENCE & INFORMATION-SAI ORGANIZATION LTD en_US
dc.relation.isversionof 10.14569/ijacsa.2024.01507110 en_US
dc.relation.journal INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS en_US
dc.relation.publicationcategory Article - International Peer-Reviewed Journal - Institutional Faculty Member en_US
dc.rights info:eu-repo/semantics/openAccess en_US
dc.subject Topic scoring en_US
dc.subject Topic modeling en_US
dc.subject Text classification en_US
dc.subject Machine learning en_US
dc.title eTNT: Enhanced TextNetTopics with Filtered LDA Topics and Sequential Forward / Backward Topic Scoring Approaches en_US
dc.type article en_US

Files

Original bundle

Name: Paper_110-eTNT_Enhanced_TextNetTopics_with_Filtered_LDA_Topics.pdf
Size: 1.3 MB
Format: Adobe Portable Document Format
Description: Article File
