Publication:

Identifying domains of applicability of machine learning models for materials science

Files

Sutton_2-1auzvjbeeo0k13.pdf (size: 5.29 MB, downloads: 351)

Date

2020

Authors

Sutton, Christopher
Boley, Mario
Ghiringhelli, Luca M.
Rupp, Matthias
Vreeken, Jilles
Scheffler, Matthias

Open Access publication
Open Access Gold

Publication type
Journal article
Publication status
Published

Published in

Nature communications. Nature Publishing Group. 2020, 11(1), 4428. eISSN 2041-1723. Available under: doi: 10.1038/s41467-020-17112-9

Abstract

Although machine learning (ML) models promise to substantially accelerate the discovery of novel materials, their performance is often still insufficient to draw reliable conclusions. Improved ML models are therefore actively researched, but their design is currently guided mainly by monitoring the average model test error. This can render different models indistinguishable although their performance differs substantially across materials, or it can make a model appear generally insufficient while it actually works well in specific sub-domains. Here, we present a method, based on subgroup discovery, for detecting domains of applicability (DA) of models within a materials class. The utility of this approach is demonstrated by analyzing three state-of-the-art ML models for predicting the formation energy of transparent conducting oxides. We find that, despite having a mutually indistinguishable and unsatisfactory average error, the models have DAs with distinctive features and notably improved performance.

Subject (DDC)
004 Computer science

Cite

ISO 690
SUTTON, Christopher, Mario BOLEY, Luca M. GHIRINGHELLI, Matthias RUPP, Jilles VREEKEN, Matthias SCHEFFLER, 2020. Identifying domains of applicability of machine learning models for materials science. In: Nature communications. Nature Publishing Group. 2020, 11(1), 4428. eISSN 2041-1723. Available under: doi: 10.1038/s41467-020-17112-9
BibTeX
@article{Sutton2020Ident-51258,
  year={2020},
  doi={10.1038/s41467-020-17112-9},
  title={Identifying domains of applicability of machine learning models for materials science},
  number={1},
  volume={11},
  journal={Nature communications},
  author={Sutton, Christopher and Boley, Mario and Ghiringhelli, Luca M. and Rupp, Matthias and Vreeken, Jilles and Scheffler, Matthias},
  note={Article Number: 4428}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/51258">
    <dc:contributor>Sutton, Christopher</dc:contributor>
    <dcterms:rights rdf:resource="http://creativecommons.org/licenses/by/4.0/"/>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/51258/1/Sutton_2-1auzvjbeeo0k13.pdf"/>
    <dc:creator>Ghiringhelli, Luca M.</dc:creator>
    <dc:creator>Vreeken, Jilles</dc:creator>
    <dc:contributor>Rupp, Matthias</dc:contributor>
    <dc:language>eng</dc:language>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-10-07T14:05:45Z</dcterms:available>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-10-07T14:05:45Z</dc:date>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:rights>Attribution 4.0 International</dc:rights>
    <dc:contributor>Ghiringhelli, Luca M.</dc:contributor>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dc:creator>Rupp, Matthias</dc:creator>
    <dc:contributor>Boley, Mario</dc:contributor>
    <dc:contributor>Scheffler, Matthias</dc:contributor>
    <dcterms:abstract xml:lang="eng">Although machine learning (ML) models promise to substantially accelerate the discovery of novel materials, their performance is often still insufficient to draw reliable conclusions. Improved ML models are therefore actively researched, but their design is currently guided mainly by monitoring the average model test error. This can render different models indistinguishable although their performance differs substantially across materials, or it can make a model appear generally insufficient while it actually works well in specific sub-domains. Here, we present a method, based on subgroup discovery, for detecting domains of applicability (DA) of models within a materials class. The utility of this approach is demonstrated by analyzing three state-of-the-art ML models for predicting the formation energy of transparent conducting oxides. We find that, despite having a mutually indistinguishable and unsatisfactory average error, the models have DAs with distinctive features and notably improved performance.</dcterms:abstract>
    <dc:creator>Scheffler, Matthias</dc:creator>
    <dcterms:issued>2020</dcterms:issued>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/51258"/>
    <dc:contributor>Vreeken, Jilles</dc:contributor>
    <dc:creator>Sutton, Christopher</dc:creator>
    <dc:creator>Boley, Mario</dc:creator>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/51258/1/Sutton_2-1auzvjbeeo0k13.pdf"/>
    <dcterms:title>Identifying domains of applicability of machine learning models for materials science</dcterms:title>
  </rdf:Description>
</rdf:RDF>

University bibliography
Yes
Peer-reviewed
Yes