Publication: Terms over LOAD : Leveraging Named Entities for Cross-Document Extraction and Summarization of Events
Abstract
Real world events, such as historic incidents, typically contain both spatial and temporal aspects and involve a specific group of persons. This is reflected in the descriptions of events in textual sources, which contain mentions of named entities and dates. Given a large collection of documents, however, such descriptions may be incomplete in a single document, or spread across multiple documents. In these cases, it is beneficial to leverage partial information about the entities that are involved in an event to extract missing information. In this paper, we introduce the LOAD model for cross-document event extraction in large-scale document collections. The graph-based model relies on co-occurrences of named entities belonging to the classes locations, organizations, actors, and dates and puts them in the context of surrounding terms. As such, the model allows for efficient queries and can be updated incrementally in negligible time to reflect changes to the underlying document collection. We discuss the versatility of this approach for event summarization, the completion of partial event information, and the extraction of descriptions for named entities and dates. We create and provide a LOAD graph for the documents in the English Wikipedia from named entities extracted by state-of-the-art NER tools. Based on an evaluation set of historic data that include summaries of diverse events, we evaluate the resulting graph. We find that the model not only allows for near real-time retrieval of information from the underlying document collection, but also provides a comprehensive framework for browsing and summarizing event data.
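The co-occurrence idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `CooccurrenceGraph`, the labels `LOC`, `ORG`, `ACT`, `DAT` and `TER`, the use of raw counts as edge weights, the sentence-level co-occurrence window, and the choice to omit term-term edges are all assumptions made for illustration. The toy input is hand-written, not NER output.

```python
from collections import defaultdict
from itertools import combinations

# Assumed labels for the four entity classes named in the abstract,
# plus an assumed "TER" label for surrounding terms.
ENTITY_CLASSES = {"LOC", "ORG", "ACT", "DAT"}
TERM = "TER"


class CooccurrenceGraph:
    """Toy co-occurrence graph; edge weights are raw counts (an assumption)."""

    def __init__(self):
        # node -> neighbour -> number of sentences in which both occur
        self.edges = defaultdict(lambda: defaultdict(int))

    def add_sentence(self, entities, terms):
        """Incrementally add one sentence.

        entities: iterable of (class_label, name) pairs
        terms:    iterable of plain term strings
        """
        nodes = set(entities) | {(TERM, t) for t in terms}
        for a, b in combinations(sorted(nodes), 2):
            # Assumption of this sketch: terms only connect to entities.
            if a[0] == TERM and b[0] == TERM:
                continue
            self.edges[a][b] += 1
            self.edges[b][a] += 1

    def neighbours(self, node, target_class, k=3):
        """Return up to k neighbours of `node` in `target_class`, by count."""
        candidates = self.edges.get(node, {})
        ranked = sorted(
            ((n, c) for n, c in candidates.items() if n[0] == target_class),
            key=lambda item: (-item[1], item[0]),
        )
        return ranked[:k]


graph = CooccurrenceGraph()
graph.add_sentence(
    [("ACT", "Neil Armstrong"), ("LOC", "Moon"), ("DAT", "1969-07-20")],
    ["landing"],
)
graph.add_sentence(
    [("ACT", "Neil Armstrong"), ("DAT", "1969-07-20")],
    ["walked"],
)

# Complete partial event information: which dates co-occur with this actor?
dates = graph.neighbours(("ACT", "Neil Armstrong"), "DAT")
```

Because each sentence only touches the edges between its own nodes, adding documents is a local update, which is one plausible reading of the incremental-update property the abstract mentions.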
Cite
ISO 690
SPITZ, Andreas, Michael GERTZ, 2016. Terms over LOAD : Leveraging Named Entities for Cross-Document Extraction and Summarization of Events. SIGIR '16 : 39th International ACM SIGIR conference on Research and Development in Information Retrieval. Pisa, Italy, 17 July 2016 - 21 July 2016. In: SIGIR '16 : Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval. New York, NY: ACM, 2016, pp. 503-512. ISBN 978-1-4503-4069-4. Available under: doi: 10.1145/2911451.2911529
BibTeX
@inproceedings{Spitz2016Terms-55836,
  year={2016},
  doi={10.1145/2911451.2911529},
  title={Terms over LOAD : Leveraging Named Entities for Cross-Document Extraction and Summarization of Events},
  isbn={978-1-4503-4069-4},
  publisher={ACM},
  address={New York, NY},
  booktitle={SIGIR '16 : Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval},
  pages={503--512},
  author={Spitz, Andreas and Gertz, Michael}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" >
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/55836">
    <dc:creator>Spitz, Andreas</dc:creator>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dc:rights>terms-of-use</dc:rights>
    <dcterms:title>Terms over LOAD : Leveraging Named Entities for Cross-Document Extraction and Summarization of Events</dcterms:title>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/55836"/>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:abstract xml:lang="eng">Real world events, such as historic incidents, typically contain both spatial and temporal aspects and involve a specific group of persons. This is reflected in the descriptions of events in textual sources, which contain mentions of named entities and dates. Given a large collection of documents, however, such descriptions may be incomplete in a single document, or spread across multiple documents. In these cases, it is beneficial to leverage partial information about the entities that are involved in an event to extract missing information. In this paper, we introduce the LOAD model for cross-document event extraction in large-scale document collections. The graph-based model relies on co-occurrences of named entities belonging to the classes locations, organizations, actors, and dates and puts them in the context of surrounding terms. As such, the model allows for efficient queries and can be updated incrementally in negligible time to reflect changes to the underlying document collection. We discuss the versatility of this approach for event summarization, the completion of partial event information, and the extraction of descriptions for named entities and dates. We create and provide a LOAD graph for the documents in the English Wikipedia from named entities extracted by state-of-the-art NER tools. Based on an evaluation set of historic data that include summaries of diverse events, we evaluate the resulting graph. We find that the model not only allows for near real-time retrieval of information from the underlying document collection, but also provides a comprehensive framework for browsing and summarizing event data.</dcterms:abstract>
    <dcterms:issued>2016</dcterms:issued>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dc:creator>Gertz, Michael</dc:creator>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2021-12-10T08:54:10Z</dc:date>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2021-12-10T08:54:10Z</dcterms:available>
    <dc:contributor>Spitz, Andreas</dc:contributor>
    <dc:contributor>Gertz, Michael</dc:contributor>
    <dc:language>eng</dc:language>
  </rdf:Description>
</rdf:RDF>