Entity-Centric Topic Extraction and Exploration : A Network-Based Approach

Files
There are no files associated with this document.
Date
2018
Authors
Spitz, Andreas
Gertz, Michael
Publication type
Contribution to a conference proceedings
Publication status
Published
Published in
PASI, Gabriella, ed., Benjamin PIWOWARSKI, ed., Leif AZZOPARDI, ed., Allan HANBURY, ed. Advances in Information Retrieval : 40th European Conference on IR Research, ECIR 2018, Proceedings. Cham: Springer, 2018, pp. 3-15. Lecture Notes in Computer Science. 10772. ISBN 978-3-319-76940-0. Available under: doi: 10.1007/978-3-319-76941-7_1
Abstract

Topic modeling is an important tool in the analysis of corpora and the classification and clustering of documents. Various extensions of the underlying graphical models have been proposed to address hierarchical or dynamical topics. However, despite their popularity, topic models face problems in the exploration and correlation of the (often unknown number of) topics extracted from a document collection, and rely on compute-intensive graphical models. In this paper, we present a novel framework for exploring evolving corpora of news articles in terms of topics covered over time. Our approach is based on implicit networks representing the cooccurrences of entities and terms in the documents as weighted edges. Edges with high weight between entities are indicative of topics, allowing the context of a topic to be explored incrementally by growing network sub-structures. Since the exploration of topics corresponds to local operations in the network, it is efficient and interactive. Adding new news articles to the collection simply updates the network, thus avoiding expensive recomputations of term and topic distributions.
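
The network idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `CooccurrenceNetwork`, its method names, the document-level co-occurrence count used as edge weight, the restriction to edges with at least one entity endpoint, and the toy documents are all assumptions made for illustration. It shows how adding a document only updates edge weights, how heavy entity-entity edges can act as topic seeds, and how a seed can be grown with a purely local lookup.

```python
from collections import defaultdict
from itertools import combinations


class CooccurrenceNetwork:
    """Toy implicit network: nodes are entities or terms, edges count co-occurrences."""

    def __init__(self):
        self.weights = defaultdict(int)    # frozenset({u, v}) -> co-occurrence count
        self.neighbors = defaultdict(set)  # node -> adjacent nodes
        self.entities = set()

    def add_document(self, entities, terms):
        """Update edge weights in place for one document (no global recomputation)."""
        entities = set(entities)
        terms = set(terms) - entities
        self.entities |= entities
        for u, v in combinations(sorted(entities | terms), 2):
            # Assumption: keep only edges that touch at least one entity.
            if u in entities or v in entities:
                self.weights[frozenset((u, v))] += 1
                self.neighbors[u].add(v)
                self.neighbors[v].add(u)

    def top_entity_edges(self, k=1):
        """Return the k heaviest entity-entity edges, used here as topic seeds."""
        edges = [(w, tuple(sorted(e))) for e, w in self.weights.items()
                 if e <= self.entities]
        edges.sort(key=lambda x: (-x[0], x[1]))
        return [(pair, w) for w, pair in edges[:k]]

    def grow(self, seed, k=3):
        """Rank terms adjacent to every seed node by their weakest connecting edge."""
        seed = set(seed)
        common = set.intersection(*(self.neighbors[n] for n in seed))
        common -= self.entities | seed
        scored = [(min(self.weights[frozenset((n, t))] for n in seed), t)
                  for t in common]
        scored.sort(key=lambda x: (-x[0], x[1]))
        return [t for _, t in scored[:k]]


# Made-up documents, each given as (entities, terms).
net = CooccurrenceNetwork()
net.add_document(["Alice", "Bob"], ["summit", "budget"])
net.add_document(["Alice", "Bob"], ["summit", "reform"])
net.add_document(["Alice", "Carol"], ["tariffs"])

seed, weight = net.top_entity_edges(1)[0]
print(seed, weight, net.grow(seed, k=2))
```

Because `grow` only inspects the neighborhoods of the seed nodes, its cost depends on the local degree rather than on the size of the whole collection, which is the property the abstract refers to as local operations.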

Subject (DDC)
004 Computer science
Keywords
Networks, Topic models, Evolving networks
Conference
European Conference on Information Retrieval : ECIR 2018, March 26, 2018 - March 29, 2018, Grenoble, France
Cite
ISO 690
SPITZ, Andreas, Michael GERTZ, 2018. Entity-Centric Topic Extraction and Exploration : A Network-Based Approach. European Conference on Information Retrieval : ECIR 2018. Grenoble, France, March 26, 2018 - March 29, 2018. In: PASI, Gabriella, ed., Benjamin PIWOWARSKI, ed., Leif AZZOPARDI, ed., Allan HANBURY, ed. Advances in Information Retrieval : 40th European Conference on IR Research, ECIR 2018, Proceedings. Cham: Springer, 2018, pp. 3-15. Lecture Notes in Computer Science. 10772. ISBN 978-3-319-76940-0. Available under: doi: 10.1007/978-3-319-76941-7_1
BibTeX
@inproceedings{Spitz2018Entit-55797,
  year={2018},
  doi={10.1007/978-3-319-76941-7_1},
  title={Entity-Centric Topic Extraction and Exploration : A Network-Based Approach},
  number={10772},
  isbn={978-3-319-76940-0},
  publisher={Springer},
  address={Cham},
  series={Lecture Notes in Computer Science},
  booktitle={Advances in Information Retrieval : 40th European Conference on IR Research, ECIR 2018, Proceedings},
  pages={3--15},
  editor={Pasi, Gabriella and Piwowarski, Benjamin and Azzopardi, Leif and Hanbury, Allan},
  author={Spitz, Andreas and Gertz, Michael}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/55797">
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:abstract xml:lang="eng">Topic modeling is an important tool in the analysis of corpora and the classification and clustering of documents. Various extensions of the underlying graphical models have been proposed to address hierarchical or dynamical topics. However, despite their popularity, topic models face problems in the exploration and correlation of the (often unknown number of) topics extracted from a document collection, and rely on compute-intensive graphical models. In this paper, we present a novel framework for exploring evolving corpora of news articles in terms of topics covered over time. Our approach is based on implicit networks representing the cooccurrences of entities and terms in the documents as weighted edges. Edges with high weight between entities are indicative of topics, allowing the context of a topic to be explored incrementally by growing network sub-structures. Since the exploration of topics corresponds to local operations in the network, it is efficient and interactive. Adding new news articles to the collection simply updates the network, thus avoiding expensive recomputations of term and topic distributions.</dcterms:abstract>
    <dc:contributor>Spitz, Andreas</dc:contributor>
    <dc:creator>Spitz, Andreas</dc:creator>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dc:rights>terms-of-use</dc:rights>
    <dc:contributor>Gertz, Michael</dc:contributor>
    <dc:language>eng</dc:language>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2021-12-08T12:33:24Z</dcterms:available>
    <dcterms:title>Entity-Centric Topic Extraction and Exploration : A Network-Based Approach</dcterms:title>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2021-12-08T12:33:24Z</dc:date>
    <dc:creator>Gertz, Michael</dc:creator>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:issued>2018</dcterms:issued>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/55797"/>
  </rdf:Description>
</rdf:RDF>
University bibliography
No