The Categorical Data Map : A Multidimensional Scaling-Based Approach

dc.contributor.author: Dennig, Frederik L.
dc.contributor.author: Joos, Lucas
dc.contributor.author: Paetzold, Patrick
dc.contributor.author: Blumberg, Daniela
dc.contributor.author: Deussen, Oliver
dc.contributor.author: Keim, Daniel A.
dc.contributor.author: Fischer, Maximilian T.
dc.date.accessioned: 2025-01-16T14:15:47Z
dc.date.available: 2025-01-16T14:15:47Z
dc.date.issued: 2024-10-14
dc.description.abstract: Categorical data does not have an intrinsic definition of distance or order, and therefore, established visualization techniques for categorical data only allow for a set-based or frequency-based analysis, e.g., through Euler diagrams or Parallel Sets, and do not support a similarity-based analysis. We present a novel dimensionality reduction-based visualization for categorical data, which is based on defining the distance of two data items as the number of varying attributes. Our technique enables users to pre-attentively detect groups of similar data items and observe the properties of the projection, such as attributes strongly influencing the embedding. Our prototype visually encodes data properties in an enhanced scatterplot-like visualization, visualizing attributes in the background to show the distribution of categories. In addition, we propose two graph-based measures to quantify the plot’s visual quality, which rank attributes according to their contribution to cluster cohesion. To demonstrate the capabilities of our similarity-based projection method, we compare it to Euler diagrams and Parallel Sets regarding visual scalability and evaluate it quantitatively on seven real-world datasets using a range of common quality measures. Further, we validate the benefits of our approach through an expert study with five data scientists analyzing the Titanic and Mushroom datasets with up to 23 attributes and 8124 category combinations. Our results indicate that our Categorical Data Map offers an effective analysis method for large datasets with a high number of category combinations.
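The distance named in the abstract, the number of attributes on which two data items differ, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `attribute_distance` and `distance_matrix` and the toy records are assumptions made for this example only.

```python
from itertools import combinations


def attribute_distance(a, b):
    """Count the attributes on which two categorical records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))


def distance_matrix(records):
    """Build the symmetric pairwise distance matrix for a list of records."""
    n = len(records)
    dist = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = attribute_distance(records[i], records[j])
        dist[i][j] = dist[j][i] = d
    return dist


# Hypothetical toy records (class, sex, survived); not taken from the paper.
records = [
    ("first", "female", "yes"),
    ("first", "male", "yes"),
    ("third", "male", "no"),
]
matrix = distance_matrix(records)
```

A matrix of this kind is the sort of precomputed dissimilarity input that a multidimensional scaling routine typically accepts in order to produce two-dimensional coordinates.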
dc.description.version: published
dc.identifier.doi: 10.1109/vds63897.2024.00008
dc.identifier.ppn: 1968521437
dc.identifier.uri: https://kops.uni-konstanz.de/handle/123456789/71937
dc.language.iso: eng
dc.rights: terms-of-use
dc.rights.uri: https://rightsstatements.org/page/InC/1.0/
dc.subject.ddc: 004
dc.title: The Categorical Data Map : A Multidimensional Scaling-Based Approach [eng]
dc.type: INPROCEEDINGS
dspace.entity.type: Publication
kops.citation.bibtex
@inproceedings{Dennig2024-10-14Categ-71937,
  title={The Categorical Data Map : A Multidimensional Scaling-Based Approach},
  year={2024},
  doi={10.1109/vds63897.2024.00008},
  isbn={979-8-3315-2843-0},
  address={Piscataway, NJ},
  publisher={IEEE},
  booktitle={2024 IEEE Visualization in Data Science, VDS 2024, Proceedings},
  pages={25--34},
  author={Dennig, Frederik L. and Joos, Lucas and Paetzold, Patrick and Blumberg, Daniela and Deussen, Oliver and Keim, Daniel A. and Fischer, Maximilian T.}
}
kops.citation.iso690: DENNIG, Frederik L., Lucas JOOS, Patrick PAETZOLD, Daniela BLUMBERG, Oliver DEUSSEN, Daniel A. KEIM, Maximilian T. FISCHER, 2024. The Categorical Data Map : A Multidimensional Scaling-Based Approach. VDS 2024: Visualization in Data Science. St. Pete Beach, FL, USA, Oct 14, 2024. In: 2024 IEEE Visualization in Data Science, VDS 2024, Proceedings. Piscataway, NJ: IEEE, 2024, pp. 25-34. ISBN 979-8-3315-2843-0. Available under: doi: 10.1109/vds63897.2024.00008 [eng]
kops.citation.rdf
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#">
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/71937">
    <dc:creator>Blumberg, Daniela</dc:creator>
    <dc:contributor>Deussen, Oliver</dc:contributor>
    <dc:creator>Joos, Lucas</dc:creator>
    <dc:contributor>Dennig, Frederik L.</dc:contributor>
    <dc:language>eng</dc:language>
    <dcterms:abstract>Categorical data does not have an intrinsic definition of distance or order, and therefore, established visualization techniques for categorical data only allow for a set-based or frequency-based analysis, e.g., through Euler diagrams or Parallel Sets, and do not support a similarity-based analysis. We present a novel dimensionality reduction-based visualization for categorical data, which is based on defining the distance of two data items as the number of varying attributes. Our technique enables users to pre-attentively detect groups of similar data items and observe the properties of the projection, such as attributes strongly influencing the embedding. Our prototype visually encodes data properties in an enhanced scatterplot-like visualization, visualizing attributes in the background to show the distribution of categories. In addition, we propose two graph-based measures to quantify the plot’s visual quality, which rank attributes according to their contribution to cluster cohesion. To demonstrate the capabilities of our similarity-based projection method, we compare it to Euler diagrams and Parallel Sets regarding visual scalability and evaluate it quantitatively on seven real-world datasets using a range of common quality measures. Further, we validate the benefits of our approach through an expert study with five data scientists analyzing the Titanic and Mushroom datasets with up to 23 attributes and 8124 category combinations. Our results indicate that our Categorical Data Map offers an effective analysis method for large datasets with a high number of category combinations.</dcterms:abstract>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/71937/1/Dennig_2-100fan7vzugf34.pdf"/>
    <dc:creator>Deussen, Oliver</dc:creator>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/71937"/>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/71937/1/Dennig_2-100fan7vzugf34.pdf"/>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime"
    >2025-01-16T14:15:47Z</dc:date>
    <dc:creator>Paetzold, Patrick</dc:creator>
    <dc:contributor>Fischer, Maximilian T.</dc:contributor>
    <dc:creator>Fischer, Maximilian T.</dc:creator>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:issued>2024-10-14</dcterms:issued>
    <dc:contributor>Paetzold, Patrick</dc:contributor>
    <dc:rights>terms-of-use</dc:rights>
    <dcterms:title>The Categorical Data Map : A Multidimensional Scaling-Based Approach</dcterms:title>
    <dc:contributor>Joos, Lucas</dc:contributor>
    <dc:creator>Dennig, Frederik L.</dc:creator>
    <dc:contributor>Keim, Daniel A.</dc:contributor>
    <dc:creator>Keim, Daniel A.</dc:creator>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime"
    >2025-01-16T14:15:47Z</dcterms:available>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dc:contributor>Blumberg, Daniela</dc:contributor>
  </rdf:Description>
</rdf:RDF>
kops.conferencefield: VDS 2024: Visualization in Data Science, Oct 14, 2024, St. Pete Beach, FL, USA
kops.date.conferenceStart: 2024-10-14
kops.description.openAccess: openaccessgreen
kops.flag.knbibliography: true
kops.identifier.nbn: urn:nbn:de:bsz:352-2-100fan7vzugf34
kops.location.conference: St. Pete Beach, FL, USA
kops.sourcefield.plain: 2024 IEEE Visualization in Data Science, VDS 2024, Proceedings. Piscataway, NJ: IEEE, 2024, pp. 25-34. ISBN 979-8-3315-2843-0. Available under: doi: 10.1109/vds63897.2024.00008 [eng]
kops.title.conference: VDS 2024: Visualization in Data Science
relation.isAuthorOfPublication: d20de83d-d64d-49a6-9fbc-26c27d4b6799
relation.isAuthorOfPublication: bfbe0c3f-960a-4409-a537-02b3a287d205
relation.isAuthorOfPublication: 447f40db-b045-458b-878d-73c4b13eba2c
relation.isAuthorOfPublication: 4202cfa7-dff9-4e87-88ea-b5eb7a8ce807
relation.isAuthorOfPublication: 4e85f041-bb89-4e27-b7d6-acd814feacb8
relation.isAuthorOfPublication: da7dafb0-6003-4fd4-803c-11e1e72d621a
relation.isAuthorOfPublication: b136ae03-c489-4019-9c45-dda441af1d49
relation.isAuthorOfPublication.latestForDiscovery: d20de83d-d64d-49a6-9fbc-26c27d4b6799
relation.isDatasetOfPublication: 2fbf9d27-9c7a-493f-a049-af94b5ce8390
relation.isDatasetOfPublication.latestForDiscovery: 2fbf9d27-9c7a-493f-a049-af94b5ce8390
source.bibliographicInfo.fromPage: 25
source.bibliographicInfo.toPage: 34
source.identifier.isbn: 979-8-3315-2843-0
source.publisher: IEEE
source.publisher.location: Piscataway, NJ
source.title: 2024 IEEE Visualization in Data Science, VDS 2024, Proceedings

Files

Original bundle

Name: Dennig_2-100fan7vzugf34.pdf
Size: 2.27 MB
Format: Adobe Portable Document Format
Downloads: 1