Publication: Visualization and Semantic Analysis of Relations in Textual Data
Abstract
This thesis presents insights and methods for the extraction, analysis, and visualization of semantic relations in textual data. Using machine learning, it is now possible to extract semantic representations of text tokens. These so-called embeddings are vector representations of individual components of text, e.g., words or tokens. They provide insight into the semantic content of text and can be used in downstream tasks, such as question classification or nearest neighbor search. We want to answer the following research questions: What kind of semantic relations can we visualize? And how can we improve traditional visualization methods using semantic relations? Although most of the use cases presented in this thesis are from the domain of text visualization and linguistic insight generation, the tools can be used for any data type from which semantic relations can be extracted. The first part of this thesis concerns the extraction and analysis of high-dimensional embeddings and their usage in linguistic application use cases. Throughout our long collaboration with the linguistics department, we have investigated suitable semantic representations for the task of question classification and created interfaces that were enriched and tailored to visualize semantic relations within textual data. The second part proposes tools to propagate and visualize semantic information within hierarchical structures and two general layout techniques for the visualization of relations using diagrams. We created a workspace for the guided generation of text, employing large language models in combination with a tree metaphor to visualize the beam search tree. Each node of the tree is visually anchored with semantic information. Semantic concepts are aggregated and visualized in a Voronoi treemap. To support more linguistic use cases, a set visualization conveys relations across small variations in the input text.
Moving from pure textual data to general use cases, we investigated relation visualization techniques such as Euler diagrams. Here, we propose an extension to previous techniques that guarantees semantic relationships during layout construction. For more highly connected datasets, we propose spEuler, a method that automatically creates set curves using an iterative greedy approach. The created diagrams conform to previously established guidelines and produce compact and mostly simple visualizations. To also automatically place data elements inside the diagram and adapt the shape of the curves accordingly, we propose RectEuler. It uses rectangles as primitives and utilizes a linear programming approach. By applying a similarity-based splitting strategy, even complex datasets can be visualized. Finally, we propose new research directions for the visualization of Euler diagrams and other areas where integrating semantics into existing visualization techniques can offer improvements. An underexplored use case is hierarchical visualizations such as Voronoi treemaps, which traditionally only represent the hierarchical relations of the data and do not integrate semantic relations of the textual representations.
Cite
ISO 690
KEHLBECK, Rebecca, 2024. Visualization and Semantic Analysis of Relations in Textual Data [Dissertation]. Konstanz: Universität Konstanz
BibTeX
@phdthesis{Kehlbeck2024Visua-73198,
  title={Visualization and Semantic Analysis of Relations in Textual Data},
  year={2024},
  author={Kehlbeck, Rebecca},
  address={Konstanz},
  school={Universität Konstanz}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" >
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/73198">
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2025-04-30T13:50:53Z</dc:date>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/73198/4/Kehlbeck_2-1878tfo19gie56.pdf"/>
    <dcterms:issued>2024</dcterms:issued>
    <dc:creator>Kehlbeck, Rebecca</dc:creator>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/73198"/>
    <dc:contributor>Kehlbeck, Rebecca</dc:contributor>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/73198/4/Kehlbeck_2-1878tfo19gie56.pdf"/>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2025-04-30T13:50:53Z</dcterms:available>
    <dcterms:title>Visualization and Semantic Analysis of Relations in Textual Data</dcterms:title>
    <dc:rights>terms-of-use</dc:rights>
    <dcterms:abstract>This thesis presents insights and methods for the extraction, analysis, and visualization of semantic relations in textual data. Using machine learning, it is now possible to extract semantic representatives of text tokens. These so-called embeddings are vector representations of individual components of text, e.g., words or tokens. Utilizing them, it is possible to get insights into the semantic content of text and use them in downstream tasks, such as question classification or nearest neighborhood search. We want to answer the following research questions: What kind of semantic relations can we visualize? And how can we improve traditional visualization methods using semantic relations? Although most of the use cases presented in this paper are from the domain of text visualization and linguistic insight generation, the tools can be used for any data type where semantic relations can be extracted. The first part of this thesis concerns the extraction and analysis of high-dimensional embeddings and their usage in linguistic application use cases. Throughout our long collaboration with the linguistics department, we have investigated suitable semantic representations for the task of question classification and created interfaces that were enriched and tailored to visualize semantic relations within textual data. The second part proposes tools to propagate and visualize semantic information within hierarchical structures and two general layout techniques for the visualization of relations using diagrams. We created a workspace for the guided generation of text, employing large language models in combination with a tree metaphor to visualize the beam search tree. Each node of the tree is visually anchored with semantic information. Semantic concepts are aggregated and visualized in a Voronoi treemap. To support more linguistic use cases, a set visualization conveys relations across small variations in the input text. Moving from pure textual data to general use cases, we investigated relation visualization techniques such as Euler diagrams. Here, we propose an extension to previous techniques that guarantees semantic relationships during layout construction. For more highly connected datasets, we propose spEuler, a method that automatically creates set curves using an iterative greedy approach. The created diagrams conform to previously established guidelines and produce compact and mostly simple visualizations. To also automatically place data elements inside the diagram and adapt the shape of the curves accordingly, we propose RectEuler. It uses rectangles as primitives and utilizes a linear programming approach. By applying a similarity-based splitting strategy, even complex datasets can be visualized. Finally, we propose new research directions for the visualization of Euler diagrams and other areas where integrating semantics into existing visualization techniques can offer improvements. An underexplored use case is hierarchical visualizations such as Voronoi treemaps, which traditionally only represent the hierarchical relations of the data and do not integrate semantic relations of the textual representations.</dcterms:abstract>
    <dc:language>eng</dc:language>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
  </rdf:Description>
</rdf:RDF>