Publication: Strong Heuristics for Named Entity Linking
Abstract
Named entity linking (NEL) in news is a challenging endeavour due to the frequency of unseen and emerging entities, which necessitates the use of unsupervised or zero-shot methods. However, such methods tend to come with caveats, such as no integration of suitable knowledge bases (like Wikidata) for emerging entities, a lack of scalability, and poor interpretability. Here, we consider person disambiguation in Quotebank, a massive corpus of speaker-attributed quotations from the news, and investigate the suitability of intuitive, lightweight, and scalable heuristics for NEL in web-scale corpora. Our best performing heuristic disambiguates 94% and 63% of the mentions on Quotebank and the AIDA-CoNLL benchmark, respectively. Additionally, the proposed heuristics compare favourably to the state-of-the-art unsupervised and zero-shot methods, Eigenthemes and mGENRE, respectively, thereby serving as strong baselines for unsupervised and zero-shot entity linking.
Cite
ISO 690
CULJAK, Marko, Andreas SPITZ, Robert WEST, Akhil ARORA, 2022. Strong Heuristics for Named Entity Linking. NAACL 2022 : Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Student Research Workshop. Seattle, Washington, 10 July 2022 - 15 July 2022. In: IPPOLITO, Daphne, ed., Liunian Harold LI, ed., Maria Leonor PACHECO, ed. and others. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics : Human Language Technologies : Student Research Workshop. Stroudsburg, PA: ACL, 2022, pp. 235-246. Available under: doi: 10.18653/v1/2022.naacl-srw.30
BibTeX
@inproceedings{Culjak2022Stron-59235,
  year={2022},
  doi={10.18653/v1/2022.naacl-srw.30},
  title={Strong Heuristics for Named Entity Linking},
  publisher={ACL},
  address={Stroudsburg, PA},
  booktitle={Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics : Human Language Technologies : Student Research Workshop},
  pages={235--246},
  editor={Ippolito, Daphne and Li, Liunian Harold and Pacheco, Maria Leonor},
  author={Culjak, Marko and Spitz, Andreas and West, Robert and Arora, Akhil}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" >
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/59235">
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dc:contributor>Culjak, Marko</dc:contributor>
    <dcterms:issued>2022</dcterms:issued>
    <dc:creator>West, Robert</dc:creator>
    <dc:contributor>Spitz, Andreas</dc:contributor>
    <dc:contributor>Arora, Akhil</dc:contributor>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2022-11-22T09:12:16Z</dc:date>
    <dcterms:title>Strong Heuristics for Named Entity Linking</dcterms:title>
    <dc:creator>Spitz, Andreas</dc:creator>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/59235"/>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dc:creator>Culjak, Marko</dc:creator>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2022-11-22T09:12:16Z</dcterms:available>
    <dc:language>eng</dc:language>
    <dcterms:abstract xml:lang="eng">Named entity linking (NEL) in news is a challenging endeavour due to the frequency of unseen and emerging entities, which necessitates the use of unsupervised or zero-shot methods. However, such methods tend to come with caveats, such as no integration of suitable knowledge bases (like Wikidata) for emerging entities, a lack of scalability, and poor interpretability. 
Here, we consider person disambiguation in Quotebank, a massive corpus of speaker-attributed quotations from the news, and investigate the suitability of intuitive, lightweight, and scalable heuristics for NEL in web-scale corpora. Our best performing heuristic disambiguates 94% and 63% of the mentions on Quotebank and the AIDA-CoNLL benchmark, respectively. Additionally, the proposed heuristics compare favourably to the state-of-the-art unsupervised and zero-shot methods, Eigenthemes and mGENRE, respectively, thereby serving as strong baselines for unsupervised and zero-shot entity linking.</dcterms:abstract>
    <dc:contributor>West, Robert</dc:contributor>
    <dc:creator>Arora, Akhil</dc:creator>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
  </rdf:Description>
</rdf:RDF>