Type of Publication: | Journal article |
Publication status: | Published |
URI (citable link): | http://nbn-resolving.de/urn:nbn:de:bsz:352-2-1auzvjbeeo0k13 |
Author: | Sutton, Christopher; Boley, Mario; Ghiringhelli, Luca M.; Rupp, Matthias; Vreeken, Jilles; Scheffler, Matthias |
Year of publication: | 2020 |
Published in: | Nature communications ; 11 (2020), 1. - 4428. - eISSN 2041-1723 |
Pubmed ID: | 32887879 |
DOI (citable link): | https://dx.doi.org/10.1038/s41467-020-17112-9 |
Summary: | Although machine learning (ML) models promise to substantially accelerate the discovery of novel materials, their performance is often still insufficient to draw reliable conclusions. Improved ML models are therefore actively researched, but their design is currently guided mainly by monitoring the average model test error. This can render different models indistinguishable although their performance differs substantially across materials, or it can make a model appear generally insufficient while it actually works well in specific sub-domains. Here, we present a method, based on subgroup discovery, for detecting domains of applicability (DA) of models within a materials class. The utility of this approach is demonstrated by analyzing three state-of-the-art ML models for predicting the formation energy of transparent conducting oxides. We find that, despite having a mutually indistinguishable and unsatisfactory average error, the models have DAs with distinctive features and notably improved performance. |
Subject (DDC): | 004 Computer Science |
Link to License: | Attribution 4.0 International |
Bibliography of Konstanz: | Yes |
Refereed: | Yes |
SUTTON, Christopher, Mario BOLEY, Luca M. GHIRINGHELLI, Matthias RUPP, Jilles VREEKEN, Matthias SCHEFFLER, 2020. Identifying domains of applicability of machine learning models for materials science. In: Nature communications. 11(1), 4428. eISSN 2041-1723. Available under: doi: 10.1038/s41467-020-17112-9
@article{Sutton2020Ident-51258,
  title={Identifying domains of applicability of machine learning models for materials science},
  year={2020},
  doi={10.1038/s41467-020-17112-9},
  number={1},
  volume={11},
  journal={Nature communications},
  author={Sutton, Christopher and Boley, Mario and Ghiringhelli, Luca M. and Rupp, Matthias and Vreeken, Jilles and Scheffler, Matthias},
  note={Article Number: 4428}
}
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" >
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/rdf/resource/123456789/51258">
    <dc:rights>Attribution 4.0 International</dc:rights>
    <dc:contributor>Ghiringhelli, Luca M.</dc:contributor>
    <dc:contributor>Sutton, Christopher</dc:contributor>
    <dc:creator>Boley, Mario</dc:creator>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/51258/1/Sutton_2-1auzvjbeeo0k13.pdf"/>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:contributor>Rupp, Matthias</dc:contributor>
    <dc:contributor>Scheffler, Matthias</dc:contributor>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-10-07T14:05:45Z</dcterms:available>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-10-07T14:05:45Z</dc:date>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/51258"/>
    <dcterms:title>Identifying domains of applicability of machine learning models for materials science</dcterms:title>
    <dcterms:issued>2020</dcterms:issued>
    <dc:contributor>Boley, Mario</dc:contributor>
    <dc:creator>Ghiringhelli, Luca M.</dc:creator>
    <dc:contributor>Vreeken, Jilles</dc:contributor>
    <dc:creator>Vreeken, Jilles</dc:creator>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/rdf/resource/123456789/36"/>
    <foaf:homepage rdf:resource="http://localhost:8080/jspui"/>
    <dcterms:rights rdf:resource="http://creativecommons.org/licenses/by/4.0/"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/rdf/resource/123456789/36"/>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/51258/1/Sutton_2-1auzvjbeeo0k13.pdf"/>
    <dc:creator>Scheffler, Matthias</dc:creator>
    <dc:creator>Rupp, Matthias</dc:creator>
    <dc:creator>Sutton, Christopher</dc:creator>
    <dcterms:abstract xml:lang="eng">Although machine learning (ML) models promise to substantially accelerate the discovery of novel materials, their performance is often still insufficient to draw reliable conclusions. Improved ML models are therefore actively researched, but their design is currently guided mainly by monitoring the average model test error. This can render different models indistinguishable although their performance differs substantially across materials, or it can make a model appear generally insufficient while it actually works well in specific sub-domains. Here, we present a method, based on subgroup discovery, for detecting domains of applicability (DA) of models within a materials class. The utility of this approach is demonstrated by analyzing three state-of-the-art ML models for predicting the formation energy of transparent conducting oxides. We find that, despite having a mutually indistinguishable and unsatisfactory average error, the models have DAs with distinctive features and notably improved performance.</dcterms:abstract>
    <dc:language>eng</dc:language>
  </rdf:Description>
</rdf:RDF>