Entity-Centric Topic Extraction and Exploration : A Network-Based Approach
Abstract
Topic modeling is an important tool in the analysis of corpora and the classification and clustering of documents. Various extensions of the underlying graphical models have been proposed to address hierarchical or dynamical topics. However, despite their popularity, topic models face problems in the exploration and correlation of the (often unknown number of) topics extracted from a document collection, and rely on compute-intensive graphical models. In this paper, we present a novel framework for exploring evolving corpora of news articles in terms of topics covered over time. Our approach is based on implicit networks representing the cooccurrences of entities and terms in the documents as weighted edges. Edges with high weight between entities are indicative of topics, allowing the context of a topic to be explored incrementally by growing network sub-structures. Since the exploration of topics corresponds to local operations in the network, it is efficient and interactive. Adding new news articles to the collection simply updates the network, thus avoiding expensive recomputations of term and topic distributions.
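The network idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that entities have already been extracted from each article and that edge weights are plain document-level co-occurrence counts, since the abstract does not state the actual weighting scheme. The class name `CooccurrenceNetwork`, its methods, and the placeholder entity names "A" to "E" are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


class CooccurrenceNetwork:
    """Toy entity co-occurrence network with count-based edge weights.

    Assumption: weights are plain co-occurrence counts per document;
    the weighting used in the paper is not given in the abstract.
    """

    def __init__(self):
        # Adjacency map: node -> neighbor -> edge weight.
        self.adj = defaultdict(lambda: defaultdict(int))

    def add_document(self, entities):
        """Update the network in place with the entities of one document."""
        for a, b in combinations(sorted(set(entities)), 2):
            self.adj[a][b] += 1
            self.adj[b][a] += 1

    def top_edges(self, k):
        """Return the k heaviest edges as (weight, a, b) tuples with a < b."""
        edges = [
            (w, a, b)
            for a, nbrs in self.adj.items()
            for b, w in nbrs.items()
            if a < b
        ]
        return sorted(edges, key=lambda e: (-e[0], e[1], e[2]))[:k]

    def grow(self, seed, k):
        """Local exploration: the k strongest neighbors of a seed node."""
        nbrs = self.adj.get(seed, {})
        return sorted(nbrs.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Placeholder documents; the entity names are made up for illustration.
net = CooccurrenceNetwork()
net.add_document(["A", "B", "C"])
net.add_document(["A", "B", "D"])
net.add_document(["B", "E"])

print(net.top_edges(1))   # heaviest edge as a topic candidate
print(net.grow("A", 2))   # context of that candidate around "A"
```

Adding a document touches only the edges between its own entities, and `grow` reads only the neighborhood of one node, which mirrors the abstract's point that updates and exploration are local operations.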
Cite
ISO 690
SPITZ, Andreas, Michael GERTZ, 2018. Entity-Centric Topic Extraction and Exploration : A Network-Based Approach. European Conference on Information Retrieval : ECIR 2018. Grenoble, France, 26 March 2018 - 29 March 2018. In: PASI, Gabriella, ed., Benjamin PIWOWARSKI, ed., Leif AZZOPARDI, ed., Allan HANBURY, ed. Advances in Information Retrieval : 40th European Conference on IR Research, ECIR 2018, Proceedings. Cham: Springer, 2018, pp. 3-15. Lecture Notes in Computer Science. 10772. ISBN 978-3-319-76940-0. Available under: doi: 10.1007/978-3-319-76941-7_1
BibTeX
@inproceedings{Spitz2018Entit-55797,
  year={2018},
  doi={10.1007/978-3-319-76941-7_1},
  title={Entity-Centric Topic Extraction and Exploration : A Network-Based Approach},
  number={10772},
  isbn={978-3-319-76940-0},
  publisher={Springer},
  address={Cham},
  series={Lecture Notes in Computer Science},
  booktitle={Advances in Information Retrieval : 40th European Conference on IR Research, ECIR 2018, Proceedings},
  pages={3--15},
  editor={Pasi, Gabriella and Piwowarski, Benjamin and Azzopardi, Leif and Hanbury, Allan},
  author={Spitz, Andreas and Gertz, Michael}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" >
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/55797">
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:abstract xml:lang="eng">Topic modeling is an important tool in the analysis of corpora and the classification and clustering of documents. Various extensions of the underlying graphical models have been proposed to address hierarchical or dynamical topics. However, despite their popularity, topic models face problems in the exploration and correlation of the (often unknown number of) topics extracted from a document collection, and rely on compute-intensive graphical models. In this paper, we present a novel framework for exploring evolving corpora of news articles in terms of topics covered over time. Our approach is based on implicit networks representing the cooccurrences of entities and terms in the documents as weighted edges. Edges with high weight between entities are indicative of topics, allowing the context of a topic to be explored incrementally by growing network sub-structures. Since the exploration of topics corresponds to local operations in the network, it is efficient and interactive. Adding new news articles to the collection simply updates the network, thus avoiding expensive recomputations of term and topic distributions.</dcterms:abstract>
    <dc:contributor>Spitz, Andreas</dc:contributor>
    <dc:creator>Spitz, Andreas</dc:creator>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dc:rights>terms-of-use</dc:rights>
    <dc:contributor>Gertz, Michael</dc:contributor>
    <dc:language>eng</dc:language>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2021-12-08T12:33:24Z</dcterms:available>
    <dcterms:title>Entity-Centric Topic Extraction and Exploration : A Network-Based Approach</dcterms:title>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2021-12-08T12:33:24Z</dc:date>
    <dc:creator>Gertz, Michael</dc:creator>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:issued>2018</dcterms:issued>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/55797"/>
  </rdf:Description>
</rdf:RDF>