Publication:

Discovering OLAP Dimensions in Semi-Structured Data

Files

Mansmann_258286.pdf (Size: 1.5 MB, Downloads: 1578)

Date

2014

Open Access publication
Open Access Green

Publication type
Journal article
Publication status
Published

Published in

Information Systems. 2014, 44, pp. 120-133. ISSN 0306-4379. eISSN 0306-4379. Available under: doi: 10.1016/j.is.2013.09.002

Abstract

OLAP cubes enable aggregation-centric analysis of transactional data by shaping data records into measurable facts with dimensional characteristics. A multidimensional view is obtained from the available data fields and explicit relationships between them. This classical modeling approach is not feasible for scenarios dealing with semi-structured or poorly structured data. We propose to extend the data warehouse design methodology with a content-driven discovery of measures and dimensions in the original dataset. Our approach is based on introducing a data enrichment layer responsible for detecting new structural elements in the data using data mining and other techniques. Discovered elements can be of type measure, dimension, or hierarchy level and may represent static or even dynamic properties of the data. This paper focuses on the challenge of generating, maintaining, and querying discovered elements in OLAP cubes.



We demonstrate the power of our approach by providing OLAP to the public stream of user-generated content on the Twitter platform. We have been able to enrich the original set with dynamic characteristics, such as user activity, popularity, messaging behavior, as well as to classify messages by topic, impact, origin, method of generation, etc. Knowledge discovery techniques coupled with human expertise enable structural enrichment of the original data beyond the scope of the existing methods for obtaining multidimensional models from relational or semi-structured data.

Subject (DDC)
004 Computer science

Keywords

Data warehousing, OLAP, Multidimensional data model, Semi-structured data

Cite

ISO 690
MANSMANN, Svetlana, Nafees Ur REHMAN, Andreas WEILER, Marc H. SCHOLL, 2014. Discovering OLAP Dimensions in Semi-Structured Data. In: Information Systems. 2014, 44, pp. 120-133. ISSN 0306-4379. eISSN 0306-4379. Available under: doi: 10.1016/j.is.2013.09.002
BibTeX
@article{Mansmann2014Disco-25828,
  year={2014},
  doi={10.1016/j.is.2013.09.002},
  title={Discovering OLAP Dimensions in Semi-Structured Data},
  volume={44},
  issn={0306-4379},
  journal={Information Systems},
  pages={120--133},
  author={Mansmann, Svetlana and Rehman, Nafees Ur and Weiler, Andreas and Scholl, Marc H.}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/25828">
    <dcterms:title>Discovering OLAP Dimensions in Semi-Structured Data</dcterms:title>
    <bibo:uri rdf:resource="http://kops.uni-konstanz.de/handle/123456789/25828"/>
    <dc:contributor>Rehman, Nafees Ur</dc:contributor>
    <dc:creator>Scholl, Marc H.</dc:creator>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/25828/2/Mansmann_258286.pdf"/>
    <dc:creator>Mansmann, Svetlana</dc:creator>
    <dcterms:abstract xml:lang="eng">OLAP cubes enable aggregation-centric analysis of transactional data by shaping data records into measurable facts with dimensional characteristics. A multidimensional view is obtained from the available data fields and explicit relationships between them. This classical modeling approach is not feasible for scenarios dealing with semi-structured or poorly structured data. We propose to extend the data warehouse design methodology with a content-driven discovery of measures and dimensions in the original dataset. Our approach is based on introducing a data enrichment layer responsible for detecting new structural elements in the data using data mining and other techniques. Discovered elements can be of type measure, dimension, or hierarchy level and may represent static or even dynamic properties of the data. This paper focuses on the challenge of generating, maintaining, and querying discovered elements in OLAP cubes. We demonstrate the power of our approach by providing OLAP to the public stream of user-generated content on the Twitter platform. We have been able to enrich the original set with dynamic characteristics, such as user activity, popularity, messaging behavior, as well as to classify messages by topic, impact, origin, method of generation, etc. Knowledge discovery techniques coupled with human expertise enable structural enrichment of the original data beyond the scope of the existing methods for obtaining multidimensional models from relational or semi-structured data.</dcterms:abstract>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/25828/2/Mansmann_258286.pdf"/>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dc:creator>Weiler, Andreas</dc:creator>
    <dc:creator>Rehman, Nafees Ur</dc:creator>
    <dc:contributor>Scholl, Marc H.</dc:contributor>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dc:contributor>Mansmann, Svetlana</dc:contributor>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2014-01-13T13:48:42Z</dcterms:available>
    <dcterms:issued>2014</dcterms:issued>
    <dc:rights>terms-of-use</dc:rights>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2014-01-13T13:48:42Z</dc:date>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:language>eng</dc:language>
    <dc:contributor>Weiler, Andreas</dc:contributor>
    <dcterms:bibliographicCitation>Information Systems ; 44 (2014). - S. 120-133</dcterms:bibliographicCitation>
  </rdf:Description>
</rdf:RDF>

University bibliography
Yes