Publication: Linguistic Features in German BERT : The Role of Morphology, Syntax, and Semantics in Multi-Class Text Classification
Abstract
Most studies on the linguistic information encoded by BERT primarily focus on English. Our study examines a monolingual German BERT model using a semantic classification task on newspaper articles, analysing the linguistic features influencing classification decisions through SHAP values. We use the TüBa-D/Z corpus, a resource with gold-standard annotations for a set of linguistic features, including POS, inflectional morphology, phrasal, clausal, and dependency structures. Semantic features of nouns are evaluated via the GermaNet ontology using shared hypernyms. Our results indicate that the features identified in English also affect classification in German but suggest important language- and task-specific features as well.
Cite
ISO 690
BEYER, Henrike, Diego FRASSINELLI, 2025. Linguistic Features in German BERT : The Role of Morphology, Syntax, and Semantics in Multi-Class Text Classification. The 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL 2025). Albuquerque, New Mexico, 29 Apr. 2025 - 4 May 2025. In: EBRAHIMI, Abteen, ed., Samar HAIDER, ed., Emmy LIU, ed. and others. Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics : Human Language Technologies (Volume 4: Student Research Workshop). Kerrville, TX: Association for Computational Linguistics, 2025, pp. 28-39. ISBN 979-8-89176-192-6
BibTeX
@inproceedings{Beyer2025Lingu-73331,
  title={Linguistic Features in German BERT : The Role of Morphology, Syntax, and Semantics in Multi-Class Text Classification},
  url={https://aclanthology.org/2025.naacl-srw.3/},
  year={2025},
  isbn={979-8-89176-192-6},
  address={Kerrville, TX},
  publisher={Association for Computational Linguistics},
  booktitle={Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics : Human Language Technologies (Volume 4: Student Research Workshop)},
  pages={28--39},
  editor={Ebrahimi, Abteen and Haider, Samar and Liu, Emmy},
  author={Beyer, Henrike and Frassinelli, Diego}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" >
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/73331">
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2025-05-14T10:09:22Z</dc:date>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dcterms:abstract>Most studies on the linguistic information encoded by BERT primarily focus on English. Our study examines a monolingual German BERT model using a semantic classification task on newspaper articles, analysing the linguistic features influencing classification decisions through SHAP values. We use the TüBa-D/Z corpus, a resource with gold-standard annotations for a set of linguistic features, including POS, inflectional morphology, phrasal, clausal, and dependency structures. Semantic features of nouns are evaluated via the GermaNet ontology using shared hypernyms. Our results indicate that the features identified in English also affect classification in German but suggest important language- and task-specific features as well.</dcterms:abstract>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2025-05-14T10:09:22Z</dcterms:available>
    <dc:language>eng</dc:language>
    <dcterms:title>Linguistic Features in German BERT : The Role of Morphology, Syntax, and Semantics in Multi-Class Text Classification</dcterms:title>
    <dc:creator>Frassinelli, Diego</dc:creator>
    <dc:creator>Beyer, Henrike</dc:creator>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/45"/>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/45"/>
    <dc:contributor>Beyer, Henrike</dc:contributor>
    <dcterms:issued>2025</dcterms:issued>
    <dc:contributor>Frassinelli, Diego</dc:contributor>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/73331"/>
  </rdf:Description>
</rdf:RDF>