Publication:

Visualization and Semantic Analysis of Relations in Textual Data

Files

Kehlbeck_2-1878tfo19gie56.pdf (size: 17.94 MB, downloads: 14)

Date

2024

Open access publication
Open Access Green

Publication type
Dissertation
Publication status
Published

Abstract

This thesis presents insights and methods for the extraction, analysis, and visualization of semantic relations in textual data. Using machine learning, it is now possible to extract semantic representations of text tokens. These so-called embeddings are vector representations of individual components of text, e.g., words or tokens. With them, it is possible to gain insight into the semantic content of text and to use them in downstream tasks, such as question classification or nearest-neighbor search. We want to answer the following research questions: What kind of semantic relations can we visualize? And how can we improve traditional visualization methods using semantic relations? Although most of the use cases presented in this thesis are from the domain of text visualization and linguistic insight generation, the tools can be used for any data type from which semantic relations can be extracted.

The first part of this thesis concerns the extraction and analysis of high-dimensional embeddings and their usage in linguistic application use cases. Throughout our long collaboration with the linguistics department, we have investigated suitable semantic representations for the task of question classification and created interfaces that were enriched and tailored to visualize semantic relations within textual data.

The second part proposes tools to propagate and visualize semantic information within hierarchical structures and two general layout techniques for the visualization of relations using diagrams. We created a workspace for the guided generation of text, employing large language models in combination with a tree metaphor to visualize the beam search tree. Each node of the tree is visually anchored with semantic information. Semantic concepts are aggregated and visualized in a Voronoi treemap. To support more linguistic use cases, a set visualization conveys relations across small variations in the input text.

Moving from pure textual data to general use cases, we investigated relation visualization techniques such as Euler diagrams. Here, we propose an extension to previous techniques that guarantees semantic relationships during layout construction. For more highly connected datasets, we propose spEuler, a method that automatically creates set curves using an iterative greedy approach. The resulting diagrams conform to previously established guidelines and are compact and mostly simple. To additionally place data elements inside the diagram automatically and adapt the shape of the curves accordingly, we propose RectEuler. It uses rectangles as primitives and utilizes a linear programming approach. By applying a similarity-based splitting strategy, even complex datasets can be visualized.

Finally, we propose new research directions for the visualization of Euler diagrams and other areas where integrating semantics into existing visualization techniques can offer improvements. One underexplored use case is hierarchical visualizations such as Voronoi treemaps, which traditionally represent only the hierarchical relations of the data and do not integrate semantic relations of the textual representations.
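
As an illustration of the nearest-neighbor search over embeddings mentioned in the abstract, the following is a minimal sketch and not code from the thesis. It assumes cosine similarity as the measure of semantic closeness; the three-dimensional vectors in `TOY_EMBEDDINGS` and the function names `cosine_similarity` and `nearest_neighbors` are hypothetical and chosen only for this example. Real embeddings would come from a trained model and have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, embeddings, k=2):
    """Return the k (token, similarity) pairs most similar to the query vector."""
    scored = [(token, cosine_similarity(query, vec)) for token, vec in embeddings.items()]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy vectors, invented for this sketch only.
TOY_EMBEDDINGS = {
    "who": [0.9, 0.1, 0.0],
    "what": [0.8, 0.2, 0.1],
    "tree": [0.0, 0.9, 0.4],
    "diagram": [0.1, 0.7, 0.6],
}

if __name__ == "__main__":
    for token, score in nearest_neighbors(TOY_EMBEDDINGS["who"], TOY_EMBEDDINGS, k=2):
        print(f"{token}: {score:.3f}")
```

With these toy vectors, the query token itself ranks first and the other question word ranks second, which is the kind of neighborhood structure that an embedding-based interface could surface.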

Subject area (DDC)
004 Computer science

Cite

ISO 690
KEHLBECK, Rebecca, 2024. Visualization and Semantic Analysis of Relations in Textual Data [Dissertation]. Konstanz: Universität Konstanz
BibTeX
@phdthesis{Kehlbeck2024Visua-73198,
  title={Visualization and Semantic Analysis of Relations in Textual Data},
  year={2024},
  author={Kehlbeck, Rebecca},
  address={Konstanz},
  school={Universität Konstanz}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/73198">
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2025-04-30T13:50:53Z</dc:date>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/73198/4/Kehlbeck_2-1878tfo19gie56.pdf"/>
    <dcterms:issued>2024</dcterms:issued>
    <dc:creator>Kehlbeck, Rebecca</dc:creator>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/73198"/>
    <dc:contributor>Kehlbeck, Rebecca</dc:contributor>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/73198/4/Kehlbeck_2-1878tfo19gie56.pdf"/>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2025-04-30T13:50:53Z</dcterms:available>
    <dcterms:title>Visualization and Semantic Analysis of Relations in Textual Data</dcterms:title>
    <dc:rights>terms-of-use</dc:rights>
    <dcterms:abstract>This thesis presents insights and methods for the extraction, analysis, and visualization of semantic relations in textual data. Using machine learning, it is now possible to extract semantic representatives of text tokens. These so-called embeddings are vector representations of individual components of text, e.g., words or tokens. Utilizing them, it is possible to get insights into the semantic content of text and use them in downstream tasks, such as question classification or nearest neighborhood search. We want to answer the following research questions: What kind of semantic relations can we visualize? And how can we improve traditional visualization methods using semantic relations? Although most of the use cases presented in this paper are from the domain of text visualization and linguistic insight generation, the tools can be used for any data type where semantic relations can be extracted. The first part of this thesis concerns the extraction and analysis of high-dimensional embeddings and their usage in linguistic application use cases. Throughout our long collaboration with the linguistics department, we have investigated suitable semantic
representations for the task of question classification and created interfaces that were enriched and tailored to visualize semantic relations within textual data. The second part proposes tools to propagate and visualize semantic information within hierarchical structures and two general layout techniques for the visualization of relations using diagrams. We created a workspace for the guided generation of text, employing large language models in combination with a tree metaphor to visualize the beam search tree. Each node of the tree is visually anchored with semantic information. Semantic concepts are aggregated and visualized in a Voronoi treemap. To support more linguistic use cases, a set visualization conveys relations across small variations in the input text. 
Moving from pure textual data to general use cases, we investigated relation visualization techniques such as Euler diagrams. Here, we propose an extension to previous techniques that guarantees semantic relationships during layout construction. For more highly connected datasets, we propose spEuler, a method that automatically creates set curves using an iterative greedy approach. The created diagrams conform to previously established guidelines and produce compact and mostly simple visualizations. To also automatically
place data elements inside the diagram and adapt the shape of the curves accordingly, we propose RectEuler. It uses rectangles as primitives and utilizes a linear programming approach. By applying a similarity-based splitting strategy, even complex datasets can be visualized.
Finally, we propose new research directions for the visualization of Euler diagrams and other areas where integrating semantics into existing visualization techniques can offer improvements. An underexplored use case is hierarchical visualizations such as Voronoi treemaps, which traditionally only represent the hierarchical relations of the data and do not integrate semantic relations of the textual representations.</dcterms:abstract>
    <dc:language>eng</dc:language>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
  </rdf:Description>
</rdf:RDF>

Date of dissertation examination

May 16, 2024
University thesis note
Konstanz, Univ., Diss., 2024
University bibliography
Yes