A Framework for Sarcasm Detection Incorporating Roman Sindhi and Roman Urdu Scripts in Multilingual Dataset Analysis

Authors

  • Majdah Alvi, Department of Computer Systems Engineering, The Islamia University of Bahawalpur, Bahawalpur, 63100, Pakistan.
  • Dr. Muhammad Bux Alvi, Department of Computer Systems Engineering, The Islamia University of Bahawalpur, Bahawalpur, 63100, Pakistan. https://orcid.org/0000-0003-1688-9775
  • Noor Fatima, Department of Computer Systems Engineering, The Islamia University of Bahawalpur, Bahawalpur, 63100, Pakistan. https://orcid.org/0009-0002-6056-8987

Keywords:

multilingual data, sarcasm detection, sentiment analysis, Roman Sindhi, Roman Urdu

Abstract

Sarcasm detection is imperative for successful real-time sentiment analysis on the pervasive social web. Sarcastic text, which conveys bitter, satirical, or mocking remarks or derision, is difficult even for humans to recognize; automating its detection in Natural Language Processing (NLP) is more arduous still. This work proposes a sarcasm detection framework that optimizes a sentiment analysis system by correctly detecting sarcastic text messages for resource-poor languages in multilingual datasets. The techniques developed to date are inadequate and require precise training data. Therefore, we propose neural network and deep learning-based models that focus on contextual information using different word embedding techniques, and we further propose a framework for multilingual sarcasm detection resources for low-resource languages such as Roman Sindhi and Roman Urdu. With this sarcasm-aware framework, individuals with limited English proficiency will be better equipped to engage on social media using sarcastic tones, emojis, and creative linguistic variations in multilingual textual data analysis.

Published

2025-03-01

How to Cite

Alvi, M., Alvi, M. B., & Fatima, N. (2025). A Framework for Sarcasm Detection Incorporating Roman Sindhi and Roman Urdu Scripts in Multilingual Dataset Analysis. Journal of Computing & Biomedical Informatics, 8(02). Retrieved from https://www.jcbi.org/index.php/Main/article/view/947