Publication:

LEIA : Linguistic Embeddings for the Identification of Affect

Files

Aroyehun_2-q2g8w6w3lt031.pdf (Size: 2.04 MB, Downloads: 18)

Date

2023

Authors

Aroyehun, Segun Toafeek
Malik, Lukas
Metzler, Hannah
Haimerl, Nikolas
Di Natale, Anna
Garcia, David

Funding information

European Union (EU): 101020961

Open access publication
Open Access Gold

Publication type
Journal article
Publication status
Published

Published in

EPJ Data Science. Springer. 2023, 12, 52. eISSN 2193-1127. Available under: doi: 10.1140/epjds/s13688-023-00427-0

Abstract

The wealth of text data generated by social media has enabled new kinds of analysis of emotions with language models. These models are often trained on small and costly datasets of text annotations produced by readers who guess the emotions expressed by others in social media posts. This affects the quality of emotion identification methods due to training data size limitations and noise in the production of labels used in model development. We present LEIA, a model for emotion identification in text that has been trained on a dataset of more than 6 million posts with self-annotated emotion labels for happiness, affection, sadness, anger, and fear. LEIA is based on a word masking method that enhances the learning of emotion words during model pre-training. LEIA achieves macro-F1 values of approximately 73 on three in-domain test datasets, outperforming other supervised and unsupervised methods in a strong benchmark that shows that LEIA generalizes across posts, users, and time periods. We further perform an out-of-domain evaluation on five different datasets of social media and other sources, showing LEIA’s robust performance across media, data collection methods, and annotation schemes. Our results show that LEIA generalizes its classification of anger, happiness, and sadness beyond the domain it was trained on. LEIA can be applied in future research to provide better identification of emotions in text from the perspective of the writer.
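
The macro-F1 metric reported in the abstract is the unweighted mean of the per-class F1 scores, so each of the five emotion labels counts equally regardless of how often it occurs. The following is a minimal sketch of that computation, not the authors' evaluation code: the function name `macro_f1` and the toy labels are illustrative assumptions. It returns a fraction in [0, 1]; the value of approximately 73 in the abstract is presumably the same quantity on a 0 to 100 scale.

```python
from collections import Counter


def macro_f1(gold, predicted, labels):
    """Return the unweighted mean of per-label F1 scores as a fraction in [0, 1]."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # correct prediction for label g
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[g] += 1   # g was the true label but was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined here as 0 when the label never occurs.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example (invented data, for illustration only).
LABELS = ["happiness", "affection", "sadness", "anger", "fear"]
gold = ["happiness", "happiness", "sadness", "anger", "fear", "affection"]
pred = ["happiness", "sadness", "sadness", "anger", "anger", "affection"]

score = macro_f1(gold, pred, LABELS)
print(f"{score:.2f}")  # 0.60
```

Because every label contributes equally, the single missed `fear` post in the toy data pulls the score down as much as an error on a frequent label would.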

Subject (DDC)
320 Politics

Cite

ISO 690
AROYEHUN, Segun Toafeek, Lukas MALIK, Hannah METZLER, Nikolas HAIMERL, Anna DI NATALE, David GARCIA, 2023. LEIA : Linguistic Embeddings for the Identification of Affect. In: EPJ Data Science. Springer. 2023, 12, 52. eISSN 2193-1127. Available under: doi: 10.1140/epjds/s13688-023-00427-0
BibTeX
@article{Aroyehun2023Lingu-68948,
  year={2023},
  doi={10.1140/epjds/s13688-023-00427-0},
  title={LEIA : Linguistic Embeddings for the Identification of Affect},
  volume={12},
  journal={EPJ Data Science},
  author={Aroyehun, Segun Toafeek and Malik, Lukas and Metzler, Hannah and Haimerl, Nikolas and Di Natale, Anna and Garcia, David},
  note={Article Number: 52}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/68948">
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/42"/>
    <dc:contributor>Di Natale, Anna</dc:contributor>
    <dc:creator>Aroyehun, Segun Toafeek</dc:creator>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dcterms:abstract>The wealth of text data generated by social media has enabled new kinds of analysis of emotions with language models. These models are often trained on small and costly datasets of text annotations produced by readers who guess the emotions expressed by others in social media posts. This affects the quality of emotion identification methods due to training data size limitations and noise in the production of labels used in model development. We present LEIA, a model for emotion identification in text that has been trained on a dataset of more than 6 million posts with self-annotated emotion labels for happiness, affection, sadness, anger, and fear. LEIA is based on a word masking method that enhances the learning of emotion words during model pre-training. LEIA achieves macro-F1 values of approximately 73 on three in-domain test datasets, outperforming other supervised and unsupervised methods in a strong benchmark that shows that LEIA generalizes across posts, users, and time periods. We further perform an out-of-domain evaluation on five different datasets of social media and other sources, showing LEIA’s robust performance across media, data collection methods, and annotation schemes. Our results show that LEIA generalizes its classification of anger, happiness, and sadness beyond the domain it was trained on. LEIA can be applied in future research to provide better identification of emotions in text from the perspective of the writer.</dcterms:abstract>
    <dc:contributor>Aroyehun, Segun Toafeek</dc:contributor>
    <dc:creator>Haimerl, Nikolas</dc:creator>
    <dc:creator>Garcia, David</dc:creator>
    <dc:contributor>Metzler, Hannah</dc:contributor>
    <dc:language>eng</dc:language>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/68948/1/Aroyehun_2-q2g8w6w3lt031.pdf"/>
    <dcterms:title>LEIA : Linguistic Embeddings for the Identification of Affect</dcterms:title>
    <dc:rights>Attribution 4.0 International</dc:rights>
    <dc:creator>Di Natale, Anna</dc:creator>
    <dcterms:rights rdf:resource="http://creativecommons.org/licenses/by/4.0/"/>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2024-01-05T10:25:42Z</dc:date>
    <dc:contributor>Garcia, David</dc:contributor>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/42"/>
    <dc:contributor>Malik, Lukas</dc:contributor>
    <dc:creator>Malik, Lukas</dc:creator>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/68948"/>
    <dcterms:issued>2023</dcterms:issued>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/68948/1/Aroyehun_2-q2g8w6w3lt031.pdf"/>
    <dc:creator>Metzler, Hannah</dc:creator>
    <dc:contributor>Haimerl, Nikolas</dc:contributor>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2024-01-05T10:25:42Z</dcterms:available>
  </rdf:Description>
</rdf:RDF>

University bibliography
Yes
Peer-reviewed
Yes