Beyond visual cues: Emotion recognition in images with text-aware fusion

dc.contributor.author Sungur, Kerim Serdar
dc.contributor.author Bakal, Gokhan
dc.contributor.authorID 0000-0003-2897-3894 en_US
dc.contributor.department AGÜ, Faculty of Engineering, Department of Computer Engineering en_US
dc.contributor.institutionauthor Sungur, Kerim Serdar
dc.contributor.institutionauthor Bakal, Gokhan
dc.date.accessioned 2025-03-10T09:05:24Z
dc.date.available 2025-03-10T09:05:24Z
dc.date.issued 2025 en_US
dc.description.abstract Sentiment analysis is a widely studied approach to understanding human emotions and their potential outcomes. Although it is most often performed on textual data, analyzing visual data is also important for assessing a person's current emotional state. This work investigates whether sentiment predictions for images can be improved by integrating textual data as additional knowledge that reflects the images' contextual information. To this end, two separate models were developed, an image-processing model and a text-processing model, each trained on a distinct dataset covering the same five human emotions. The outputs of the two models' last dense layers are then combined to build a hybrid multimodal model that draws on both visual and textual components. The main focus is to evaluate the performance of this hybrid model, in which the textual knowledge is concatenated with the visual data. The hybrid model achieved nearly a 3% F1-score improvement over the plain image classification model, which uses a convolutional neural network architecture. These results underscore the value of fusing textual context with visual information to refine sentiment analysis predictions, and they point to multimodal approaches as a promising direction for future work in emotion analysis. en_US
dc.identifier.endpage 8 en_US
dc.identifier.issn 0141-9382
dc.identifier.issn 1872-7387
dc.identifier.startpage 1 en_US
dc.identifier.uri https://doi.org/10.1016/j.displa.2024.102958
dc.identifier.uri https://hdl.handle.net/20.500.12573/2442
dc.identifier.volume 87 en_US
dc.language.iso eng en_US
dc.publisher ELSEVIER en_US
dc.relation.isversionof 10.1016/j.displa.2024.102958 en_US
dc.relation.journal DISPLAYS en_US
dc.relation.publicationcategory Article - International Refereed Journal - Institutional Faculty Member en_US
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.subject Sentiment analysis en_US
dc.subject Hybrid model en_US
dc.subject Image & text processing en_US
dc.subject Deep learning en_US
dc.title Beyond visual cues: Emotion recognition in images with text-aware fusion en_US
dc.type article en_US
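
The abstract describes combining the outputs of the image and text models' last dense layers by concatenation before a final prediction over five emotions. The following is a minimal sketch of that general late-fusion idea, not the authors' implementation. The feature sizes, the placeholder emotion labels, the randomly initialized classification head, and the helper names (`dense`, `softmax`, `fuse_and_classify`, `random_head`) are all assumptions made for illustration; the record does not specify them.

```python
import math
import random

# Placeholder labels: the abstract says "five human emotions" but does not name them.
EMOTIONS = ["emotion_0", "emotion_1", "emotion_2", "emotion_3", "emotion_4"]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dense(features, weights, bias):
    """Apply one fully connected layer: one weight row per output class."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]


def fuse_and_classify(image_features, text_features, weights, bias):
    """Concatenate the two feature vectors, then classify the fused vector."""
    fused = list(image_features) + list(text_features)
    return softmax(dense(fused, weights, bias))


def random_head(n_inputs, n_classes, seed=0):
    """Hypothetical untrained head, standing in for a trained fusion layer."""
    rng = random.Random(seed)
    weights = [
        [rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
        for _ in range(n_classes)
    ]
    bias = [0.0] * n_classes
    return weights, bias


if __name__ == "__main__":
    # Assumed sizes: 4 image features and 2 text features (illustrative only).
    image_features = [0.2, 0.5, 0.1, 0.9]
    text_features = [0.3, 0.7]
    weights, bias = random_head(len(image_features) + len(text_features), len(EMOTIONS))
    probabilities = fuse_and_classify(image_features, text_features, weights, bias)
    for label, p in zip(EMOTIONS, probabilities):
        print(f"{label}: {p:.3f}")
```

In a real system the two feature vectors would come from trained image and text networks and the head would be trained on the fused vectors; here the head is random, so the printed probabilities carry no meaning beyond showing the data flow.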

Files

Original bundle

Name:
1-s2.0-S0141938224003226-main.pdf
Size:
1.26 MB
Format:
Adobe Portable Document Format
Description:
Article File
