Publication:

Interpretable and Comparative Textual Dataset Exploration Using Near-Identity Mention Relations

Files

No files are associated with this document.

Date

2020

Publication type
Contribution to conference proceedings
Publication status
Published

Published in

HUANG, Ruhua, ed. and others. Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020 (JCDL '20). New York: ACM, 2020, pp. 457-458. ISBN 978-1-4503-7585-6. Available under: doi: 10.1145/3383583.3398562

Abstract

Dataset exploration comprises a set of techniques that are crucial in many research and data science projects. For textual datasets, commonly used techniques include topic modeling, document summarization, and methods related to dimension reduction. Despite their robustness, each of these techniques suffers from at least one of the following drawbacks: document summarization does not explicitly set documents in relation to one another, while the others yield summaries or topics that are often difficult to interpret and perform poorly on topics that consist of context-dependent terms. We propose a method for dataset exploration that employs cross-document near-identity resolution of mentions of semantic concepts, such as persons, other named entity types, events, and actions. The method not only sets documents in relation, and thus allows for comparative dataset exploration, but also yields readily interpretable document representations. Additionally, owing to the underlying approach for cross-document resolution of concept mentions, the method can relate documents by their near-identity terms, e.g., synonyms that are valid only in the given dataset rather than universally.

Subject (DDC)
004 Computer science

Conference

JCDL '20, 1 Aug. 2020 - 5 Aug. 2020, China (Virtual Event)

Cite

ISO 690
ZHUKOVA, Anastasia, Felix HAMBORG, Bela GIPP, 2020. Interpretable and Comparative Textual Dataset Exploration Using Near-Identity Mention Relations. JCDL '20. China (Virtual Event), 1 Aug. 2020 - 5 Aug. 2020. In: HUANG, Ruhua, ed. and others. Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020 (JCDL '20). New York: ACM, 2020, pp. 457-458. ISBN 978-1-4503-7585-6. Available under: doi: 10.1145/3383583.3398562
BibTeX
@inproceedings{Zhukova2020Inter-51923,
  year={2020},
  doi={10.1145/3383583.3398562},
  title={Interpretable and Comparative Textual Dataset Exploration Using Near-Identity Mention Relations},
  isbn={978-1-4503-7585-6},
  publisher={ACM},
  address={New York},
  booktitle={Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020 (JCDL '20)},
  pages={457--458},
  editor={Huang, Ruhua},
  author={Zhukova, Anastasia and Hamborg, Felix and Gipp, Bela}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/51923">
    <dc:creator>Hamborg, Felix</dc:creator>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dc:rights>terms-of-use</dc:rights>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dc:contributor>Zhukova, Anastasia</dc:contributor>
    <dc:contributor>Hamborg, Felix</dc:contributor>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-11-25T14:24:20Z</dcterms:available>
    <dc:creator>Gipp, Bela</dc:creator>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dc:contributor>Gipp, Bela</dc:contributor>
    <dc:creator>Zhukova, Anastasia</dc:creator>
    <dcterms:issued>2020</dcterms:issued>
    <dcterms:title>Interpretable and Comparative Textual Dataset Exploration Using Near-Identity Mention Relations</dcterms:title>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/51923"/>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dcterms:abstract xml:lang="eng">Dataset exploration is a set of techniques crucial in many research and data science projects. For textual datasets, commonly used techniques include topic modeling, document summarization, and methods related to dimension reduction. Despite their robustness, these techniques suffer from at least one of the following drawbacks: document summarization does not explicitly set documents in relation, the others yield summaries or topics that often are difficult to interpret and yield poor results for topics that consist of context-dependent terms. We propose a method for dataset exploration that employs cross-document near-identity resolution of mentions of semantic concepts, such as persons, other named entity types, events, actions. The method not only sets documents in relation and thus allows for comparative dataset exploration, but also yields well interpretable document representations. Additionally, due to the underlying approach for cross-document resolution of concept mentions, the method is able to set documents in relation as to their near-identity terms, e.g., synonyms that are not universally valid but only in the given dataset.</dcterms:abstract>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-11-25T14:24:20Z</dc:date>
    <dc:language>eng</dc:language>
  </rdf:Description>
</rdf:RDF>

University bibliography
No