MARK-AGE data management : Cleaning, exploration and visualization of data

Files
Baur_0-295717.pdf (size: 1.9 MB, downloads: 471)
Date
2015
Open access publication
Open Access Hybrid
Publication type
Journal article
Publication status
Published
Published in
Mechanisms of Ageing and Development. 2015, 151, pp. 38-44. ISSN 0047-6374. eISSN 1872-6216. Available under: doi: 10.1016/j.mad.2015.05.007
Abstract

Databases are organized collections of data and are necessary for investigating a wide spectrum of research questions. When evaluating data, analysts should be aware of possible data quality problems that can compromise the validity of results. Data cleaning, which deals with the identification and correction of errors in order to improve data quality, is therefore an essential part of the data management process.
In our cross-sectional study, biomarkers of ageing as well as analytical, anthropometric and demographic data from about 3000 volunteers were collected in the MARK-AGE database. Although several preventive strategies were applied before data entry, errors such as miscoding, missing values and batch problems could not be avoided completely. Such errors can produce misleading information and affect the validity of the data analysis.
Here we present an overview of the methods we applied to deal with errors in the MARK-AGE database. In particular, we describe our strategies for detecting missing values, outliers and batch effects, and explain how they can be handled to improve data quality. Finally, we report on the tools used for data exploration and data sharing among MARK-AGE collaborators.
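
The abstract names three kinds of checks (missing values, outliers, batch effects) without stating which techniques were used. The following is a minimal Python sketch of what such checks can look like, not the authors' method: it assumes Tukey's interquartile-range rule for outliers and a simple comparison of batch medians for batch effects, and all function names and the threshold `k=1.5` are hypothetical choices made for illustration.

```python
import math
import statistics


def is_missing(value):
    """Treat None and NaN as missing (an assumed convention)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def missing_fraction(values):
    """Fraction of entries in a column that are missing."""
    return sum(1 for v in values if is_missing(v)) / len(values)


def iqr_outliers(values, k=1.5):
    """Values lying more than k * IQR outside the quartiles (Tukey's rule)."""
    clean = [v for v in values if not is_missing(v)]
    q1, _, q3 = statistics.quantiles(clean, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in clean if v < low or v > high]


def batch_median_shift(values_by_batch):
    """Offset of each batch median from the median of all batch medians.

    A large offset for one batch hints at a possible batch effect;
    it is a screening aid, not a statistical test.
    """
    medians = {b: statistics.median(v) for b, v in values_by_batch.items()}
    overall = statistics.median(medians.values())
    return {b: m - overall for b, m in medians.items()}
```

Applied to one measured variable, `missing_fraction` and `iqr_outliers` would flag columns and individual entries for manual review, while `batch_median_shift` would be run per measurement batch. Whether flagged values are corrected, removed or kept remains a decision for the data managers.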

Subject (DDC)
004 Computer science
Cite
ISO 690
BAUR, Jennifer, Maria MORENO-VILLANUEVA, Tobias KÖTTER, Thilo SINDLINGER, Alexander BÜRKLE, Michael R. BERTHOLD, Michael JUNK, 2015. MARK-AGE data management : Cleaning, exploration and visualization of data. In: Mechanisms of Ageing and Development. 2015, 151, pp. 38-44. ISSN 0047-6374. eISSN 1872-6216. Available under: doi: 10.1016/j.mad.2015.05.007
BibTeX
@article{Baur2015-11MARKA-31306,
  year={2015},
  doi={10.1016/j.mad.2015.05.007},
  title={MARK-AGE data management : Cleaning, exploration and visualization of data},
  volume={151},
  issn={0047-6374},
  journal={Mechanisms of Ageing and Development},
  pages={38--44},
  author={Baur, Jennifer and Moreno-Villanueva, Maria and Kötter, Tobias and Sindlinger, Thilo and Bürkle, Alexander and Berthold, Michael R. and Junk, Michael}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/31306">
    <dc:creator>Moreno-Villanueva, Maria</dc:creator>
    <dc:rights>Attribution-NonCommercial-NoDerivatives 4.0 International</dc:rights>
    <dc:creator>Junk, Michael</dc:creator>
    <dc:contributor>Bürkle, Alexander</dc:contributor>
    <dc:creator>Bürkle, Alexander</dc:creator>
    <dc:creator>Kötter, Tobias</dc:creator>
    <dcterms:abstract xml:lang="eng">Databases are an organized collection of data and necessary to investigate a wide spectrum of research questions. For data evaluation analyzers should be aware of possible data quality problems that can compromise results validity. Therefore data cleaning is an essential part of the data management process, which deals with the identification and correction of errors in order to improve data quality.&lt;br /&gt;In our cross-sectional study, biomarkers of ageing, analytical, anthropometric and demographic data from about 3000 volunteers have been collected in the MARK-AGE database. Although several preventive strategies were applied before data entry, errors like miscoding, missing values, batch problems etc., could not be avoided completely. Such errors can result in misleading information and affect the validity of the performed data analysis.&lt;br /&gt;Here we present an overview of the methods we applied for dealing with errors in the MARK-AGE database. We especially describe our strategies for the detection of missing values, outliers and batch effects and explain how they can be handled to improve data quality. Finally we report about the tools used for data exploration and data sharing between MARK-AGE collaborators.</dcterms:abstract>
    <dc:creator>Berthold, Michael R.</dc:creator>
    <dc:creator>Baur, Jennifer</dc:creator>
    <dc:contributor>Sindlinger, Thilo</dc:contributor>
    <dc:contributor>Kötter, Tobias</dc:contributor>
    <dc:creator>Sindlinger, Thilo</dc:creator>
    <dcterms:title>MARK-AGE data management : Cleaning, exploration and visualization of data</dcterms:title>
    <dcterms:issued>2015-11</dcterms:issued>
    <dc:contributor>Berthold, Michael R.</dc:contributor>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2015-06-29T09:22:56Z</dc:date>
    <dc:contributor>Junk, Michael</dc:contributor>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/28"/>
    <bibo:uri rdf:resource="http://kops.uni-konstanz.de/handle/123456789/31306"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/28"/>
    <dcterms:rights rdf:resource="http://creativecommons.org/licenses/by-nc-nd/4.0/"/>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/31306/1/Baur_0-295717.pdf"/>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:contributor>Moreno-Villanueva, Maria</dc:contributor>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2015-06-29T09:22:56Z</dcterms:available>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dc:contributor>Baur, Jennifer</dc:contributor>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/31306/1/Baur_0-295717.pdf"/>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dc:language>eng</dc:language>
  </rdf:Description>
</rdf:RDF>
University bibliography
Yes