Publikation:

Data Analysis in the Life Sciences : Sparking Ideas

Lade...
Vorschaubild

Dateien

Bert05_sparkingideas_ecmlpkdd.pdf
Bert05_sparkingideas_ecmlpkdd.pdfGröße: 88.63 KBDownloads: 296

Datum

2005

Herausgeber:innen

Kontakt

ISSN der Zeitschrift

Electronic ISSN

ISBN

Bibliografische Daten

Verlag

Schriftenreihe

Auflagebezeichnung

DOI (zitierfähiger Link)
ArXiv-ID

Internationale Patentnummer

Angaben zur Forschungsförderung

Projekt

Open Access-Veröffentlichung
Open Access Green
Core Facility der Universität Konstanz

Gesperrt bis

Titel in einer weiteren Sprache

Publikationstyp
Beitrag zu einem Konferenzband
Publikationsstatus
Published

Erschienen in

GAMA, João, ed., Rui CAMACHO, ed., Pavel B. BRAZDIL, ed., Alípio Mário JORGE, ed., Luís TORGO, ed.. Machine Learning: ECML 2005. Berlin, Heidelberg: Springer Berlin Heidelberg, 2005, pp. 1-1. Lecture Notes in Computer Science. 3720. ISBN 978-3-540-29243-2. Available under: doi: 10.1007/11564096_1

Zusammenfassung

Data from various areas of Life Sciences have increasingly caught the attention of data mining and machine learning researchers. Not only is the amount of data available mind-boggling but the diverse and heterogenous nature of the information is far beyond any other data analysis problem so far. In sharp contrast to classical data analysis scenarios, the life science area poses challenges of a rather different nature for mainly two reasons. Firstly, the available data stems from heterogenous information sources of varying degrees of reliability and quality and is, without the interactive, constant interpretation of a domain expert, not useful. Furthermore, predictive models are of only marginal interest to those users instead they hope for new insights into a complex, biological system that is only partially represented within that data anyway. In this scenario, the data serves mainly to create new insights and generate new ideas that can be tested. Secondly, the notion of feature space and the accompanying measures of similarity cannot be taken for granted. Similarity measures become context dependent and it is often the case that within one analysis task several different ways of describing the objects of interest or measuring similarity between them matter. Some more recently published work in the data analysis area has started to address some of these issues. For example, data analysis in parallel universes [1], that is, the detection of patterns of interest in various different descriptor spaces at the same time, and mining of frequent, discriminative fragments in large, molecular data bases [2]. In both cases, sheer numerical performance is not the focus; it is rather the discovery of interpretable pieces of evidence that lights up new ideas in the users mind. Future work in data analysis in the life sciences needs to keep this in mind: the goal is to trigger new ideas and stimulate interesting associations.

Zusammenfassung in einer weiteren Sprache

Fachgebiet (DDC)
004 Informatik

Schlagwörter

Konferenz

Rezension
undefined / . - undefined, undefined

Forschungsvorhaben

Organisationseinheiten

Zeitschriftenheft

Zugehörige Datensätze in KOPS

Zitieren

ISO 690BERTHOLD, Michael R., 2005. Data Analysis in the Life Sciences : Sparking Ideas. In: GAMA, João, ed., Rui CAMACHO, ed., Pavel B. BRAZDIL, ed., Alípio Mário JORGE, ed., Luís TORGO, ed.. Machine Learning: ECML 2005. Berlin, Heidelberg: Springer Berlin Heidelberg, 2005, pp. 1-1. Lecture Notes in Computer Science. 3720. ISBN 978-3-540-29243-2. Available under: doi: 10.1007/11564096_1
BibTex
@inproceedings{Berthold2005Analy-5636,
  year={2005},
  doi={10.1007/11564096_1},
  title={Data Analysis in the Life Sciences : Sparking Ideas},
  number={3720},
  isbn={978-3-540-29243-2},
  publisher={Springer Berlin Heidelberg},
  address={Berlin, Heidelberg},
  series={Lecture Notes in Computer Science},
  booktitle={Machine Learning: ECML 2005},
  pages={1--1},
  editor={Gama, João and Camacho, Rui and Brazdil, Pavel B. and Jorge, Alípio Mário and Torgo, Luís},
  author={Berthold, Michael R.}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/5636">
    <dcterms:title>Data Analysis in the Life Sciences : Sparking Ideas</dcterms:title>
    <dc:creator>Berthold, Michael R.</dc:creator>
    <dc:format>application/pdf</dc:format>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2011-03-24T15:57:21Z</dcterms:available>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:abstract xml:lang="eng">Data from various areas of Life Sciences have increasingly caught the attention of data mining and machine learning researchers. Not only is the amount of data available mind-boggling but the diverse and heterogenous nature of the information is far beyond any other data analysis problem so far. In sharp contrast to classical data analysis scenarios, the life science area poses challenges of a rather different nature for mainly two reasons. Firstly, the available data stems from heterogenous information sources of varying degrees of reliability and quality and is, without the interactive, constant interpretation of a domain expert, not useful. Furthermore, predictive models are of only marginal interest to those users   instead they hope for new insights into a complex, biological system that is only partially represented within that data anyway. In this scenario, the data serves mainly to create new insights and generate new ideas that can be tested. Secondly, the notion of feature space and the accompanying measures of similarity cannot be taken for granted. Similarity measures become context dependent and it is often the case that within one analysis task several different ways of describing the objects of interest or measuring similarity between them matter. Some more recently published work in the data analysis area has started to address some of these issues. For example, data analysis in parallel universes [1], that is, the detection of patterns of interest in various different descriptor spaces at the same time, and mining of frequent, discriminative fragments in large, molecular data bases [2]. In both cases, sheer numerical performance is not the focus; it is rather the discovery of interpretable pieces of evidence that lights up new ideas in the users mind. Future work in data analysis in the life sciences needs to keep this in mind: the goal is to trigger new ideas and stimulate interesting associations.</dcterms:abstract>
    <dc:contributor>Berthold, Michael R.</dc:contributor>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dcterms:issued>2005</dcterms:issued>
    <dcterms:bibliographicCitation>First publ. in: Machine Learning. 16th European Conference on Machine Learning, Porto, Portugal, October 3-7, 2005. Proceedings. Berlin: Springer, 2005, p.1</dcterms:bibliographicCitation>
    <dcterms:rights rdf:resource="http://creativecommons.org/licenses/by-nc-nd/2.0/"/>
    <bibo:uri rdf:resource="http://kops.uni-konstanz.de/handle/123456789/5636"/>
    <dc:language>eng</dc:language>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/5636/1/Bert05_sparkingideas_ecmlpkdd.pdf"/>
    <dc:rights>Attribution-NonCommercial-NoDerivs 2.0 Generic</dc:rights>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/5636/1/Bert05_sparkingideas_ecmlpkdd.pdf"/>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2011-03-24T15:57:21Z</dc:date>
  </rdf:Description>
</rdf:RDF>

Interner Vermerk

xmlui.Submission.submit.DescribeStep.inputForms.label.kops_note_fromSubmitter

Kontakt
URL der Originalveröffentl.

Prüfdatum der URL

Prüfungsdatum der Dissertation

Finanzierungsart

Kommentar zur Publikation

Allianzlizenz
Corresponding Authors der Uni Konstanz vorhanden
Internationale Co-Autor:innen
Universitätsbibliographie
Ja
Begutachtet
Diese Publikation teilen