The limitations of irony detection in Dutch social media


Authors: Maladry, Aaron
Lefever, Els
Van Hee, Cynthia
Hoste, Véronique
Document type: Article
Publication date: 2023
Series/Periodical: Language Resources and Evaluation; ISSN 1574-020X, 1574-0218
Publisher: Springer Science and Business Media LLC
Keywords: Library and Information Sciences / Linguistics and Language / Education / Language and Linguistics
Language: English
Permalink: https://search.fid-benelux.de/Record/base-26677474
Data source: BASE; original catalog
Link(s) : http://dx.doi.org/10.1007/s10579-023-09656-1

Abstract: In this paper, we explore the feasibility of irony detection in Dutch social media. To this end, we investigate both transformer models with embedding representations and traditional machine learning classifiers with extensive feature sets. Our feature-based methodology draws on a variety of information sources, including lexical, semantic, syntactic, and sentiment features, as well as two new data-driven features to model common sense. Based on patterns in the syntactic structure of tweets, we aim to model the presence of contrasting sentiments, a phenomenon that is known to be indicative of verbal irony and sarcasm. Feature selection and voting ensemble techniques were implemented to enhance classification performance. The final systems reach F1-scores of up to 0.79, which is a promising result for a task as difficult as irony detection. Besides a quantitative analysis, this paper also describes a thorough qualitative analysis of the system output. Although lexical cues appear to be very important for expressing irony, our analysis also revealed the need for more advanced modeling of common-sense knowledge to detect more subtle examples of irony.
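
Note: for readers unfamiliar with the two evaluation terms in the abstract, the following minimal Python sketch illustrates one common form of voting ensemble (hard majority voting) and how an F1-score is computed for the positive class. It is not the authors' implementation: the abstract does not say which voting scheme or F1 variant was used, and the function names, classifier names, and toy labels below are all invented for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label lists from several classifiers by hard majority voting.

    `predictions` is a list of equal-length label lists, one per classifier.
    """
    # zip(*predictions) groups the labels each classifier gave to the same tweet.
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def f1_score(gold, predicted, positive=1):
    """F1 for the `positive` class: harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration (1 = ironic, 0 = not ironic).
gold = [1, 0, 1, 1, 0, 0]
clf_a = [1, 0, 0, 1, 0, 1]  # hypothetical classifier outputs
clf_b = [1, 1, 1, 1, 0, 0]
clf_c = [0, 0, 1, 1, 0, 1]

# An odd number of binary classifiers guarantees there are no tied votes.
ensemble = majority_vote([clf_a, clf_b, clf_c])
score = f1_score(gold, ensemble)
print(ensemble, round(score, 3))
```

In this toy setup each classifier makes different errors, and the vote corrects some of them; whether an ensemble helps on real data depends on how correlated the individual classifiers' mistakes are.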