Publication:

Terms over LOAD : Leveraging Named Entities for Cross-Document Extraction and Summarization of Events

Files

There are no files associated with this document.

Date

2016

Authors

Spitz, Andreas
Gertz, Michael

Publication type
Contribution to conference proceedings
Publication status
Published

Published in

SIGIR '16 : Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval. New York, NY: ACM, 2016, pp. 503-512. ISBN 978-1-4503-4069-4. Available under: doi: 10.1145/2911451.2911529

Abstract

Real world events, such as historic incidents, typically contain both spatial and temporal aspects and involve a specific group of persons. This is reflected in the descriptions of events in textual sources, which contain mentions of named entities and dates. Given a large collection of documents, however, such descriptions may be incomplete in a single document, or spread across multiple documents. In these cases, it is beneficial to leverage partial information about the entities that are involved in an event to extract missing information. In this paper, we introduce the LOAD model for cross-document event extraction in large-scale document collections. The graph-based model relies on co-occurrences of named entities belonging to the classes locations, organizations, actors, and dates and puts them in the context of surrounding terms. As such, the model allows for efficient queries and can be updated incrementally in negligible time to reflect changes to the underlying document collection. We discuss the versatility of this approach for event summarization, the completion of partial event information, and the extraction of descriptions for named entities and dates. We create and provide a LOAD graph for the documents in the English Wikipedia from named entities extracted by state-of-the-art NER tools. Based on an evaluation set of historic data that include summaries of diverse events, we evaluate the resulting graph. We find that the model not only allows for near real-time retrieval of information from the underlying document collection, but also provides a comprehensive framework for browsing and summarizing event data.
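
The co-occurrence idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes sentence-level co-occurrence counts as edge weights, a hypothetical class label `TER` for plain terms, and hypothetical helper names (`add_sentences`, `top_neighbors`). The toy sentences are invented for illustration only.

```python
from collections import defaultdict
from itertools import combinations

TERM = "TER"  # hypothetical label for terms; entities use LOC, ORG, ACT, DAT


def add_sentences(graph, sentences):
    """Update the co-occurrence graph in place with new sentences.

    Each sentence is a list of (text, class) pairs. Edge weights are plain
    counts, so adding documents later only touches the affected edges.
    """
    for sentence in sentences:
        nodes = sorted(set(sentence))
        for a, b in combinations(nodes, 2):
            # Assumption of this sketch: terms attach to entities, not to each other.
            if a[1] == TERM and b[1] == TERM:
                continue
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def top_neighbors(graph, node, target_class, k=5):
    """Rank neighbors of `node` that belong to `target_class` by edge weight."""
    neighbors = graph.get(node, {})
    candidates = [(n, w) for n, w in neighbors.items() if n[1] == target_class]
    return sorted(candidates, key=lambda item: (-item[1], item[0]))[:k]


# Toy input (invented): two sentences that mention the same place and date.
graph = defaultdict(lambda: defaultdict(int))
add_sentences(graph, [
    [("Pisa", "LOC"), ("ACM", "ORG"), ("2016", "DAT"),
     ("conference", "TER"), ("retrieval", "TER")],
    [("Pisa", "LOC"), ("2016", "DAT"), ("conference", "TER")],
])

dates_for_pisa = top_neighbors(graph, ("Pisa", "LOC"), "DAT")
terms_for_pisa = top_neighbors(graph, ("Pisa", "LOC"), "TER")
```

Querying a known entity for neighbors of another class is one simple way to complete partial event information, for example finding the dates most strongly associated with a location. Calling `add_sentences` again with further sentences updates the same graph incrementally.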

Subject (DDC)
004 Computer science

Keywords

Event extraction; event representation; document indexing; named entities; entity linking; summarization; ranking

Conference

SIGIR '16 : 39th International ACM SIGIR conference on Research and Development in Information Retrieval, July 17-21, 2016, Pisa, Italy

Cite

ISO 690
SPITZ, Andreas, Michael GERTZ, 2016. Terms over LOAD : Leveraging Named Entities for Cross-Document Extraction and Summarization of Events. SIGIR '16 : 39th International ACM SIGIR conference on Research and Development in Information Retrieval. Pisa, Italy, July 17-21, 2016. In: SIGIR '16 : Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval. New York, NY: ACM, 2016, pp. 503-512. ISBN 978-1-4503-4069-4. Available under: doi: 10.1145/2911451.2911529
BibTeX
@inproceedings{Spitz2016Terms-55836,
  year={2016},
  doi={10.1145/2911451.2911529},
  title={Terms over LOAD : Leveraging Named Entities for Cross-Document Extraction and Summarization of Events},
  isbn={978-1-4503-4069-4},
  publisher={ACM},
  address={New York, NY},
  booktitle={SIGIR '16 : Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval},
  pages={503--512},
  author={Spitz, Andreas and Gertz, Michael}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/55836">
    <dc:creator>Spitz, Andreas</dc:creator>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dc:rights>terms-of-use</dc:rights>
    <dcterms:title>Terms over LOAD : Leveraging Named Entities for Cross-Document Extraction and Summarization of Events</dcterms:title>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/55836"/>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:abstract xml:lang="eng">Real world events, such as historic incidents, typically contain both spatial and temporal aspects and involve a specific group of persons. This is reflected in the descriptions of events in textual sources, which contain mentions of named entities and dates. Given a large collection of documents, however, such descriptions may be incomplete in a single document, or spread across multiple documents. In these cases, it is beneficial to leverage partial information about the entities that are involved in an event to extract missing information. In this paper, we introduce the LOAD model for cross-document event extraction in large-scale document collections. The graph-based model relies on co-occurrences of named entities belonging to the classes locations, organizations, actors, and dates and puts them in the context of surrounding terms. As such, the model allows for efficient queries and can be updated incrementally in negligible time to reflect changes to the underlying document collection. We discuss the versatility of this approach for event summarization, the completion of partial event information, and the extraction of descriptions for named entities and dates. We create and provide a LOAD graph for the documents in the English Wikipedia from named entities extracted by state-of-the-art NER tools. Based on an evaluation set of historic data that include summaries of diverse events, we evaluate the resulting graph. We find that the model not only allows for near real-time retrieval of information from the underlying document collection, but also provides a comprehensive framework for browsing and summarizing event data.</dcterms:abstract>
    <dcterms:issued>2016</dcterms:issued>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dc:creator>Gertz, Michael</dc:creator>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2021-12-10T08:54:10Z</dc:date>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2021-12-10T08:54:10Z</dcterms:available>
    <dc:contributor>Spitz, Andreas</dc:contributor>
    <dc:contributor>Gertz, Michael</dc:contributor>
    <dc:language>eng</dc:language>
  </rdf:Description>
</rdf:RDF>

University bibliography
No