Publication: Data-driven approximate dynamic programming : A linear programming approach
Abstract
This article presents an approximation scheme for the infinite-dimensional linear programming formulation of discrete-time Markov control processes via a finite-dimensional convex program, when the dynamics are unknown and learned from data. We derive a probabilistic explicit error bound between the data-driven finite convex program and the original infinite linear program. We further discuss the sample complexity of the error bound which translates to the number of samples required for an a priori approximation accuracy. Our analysis sheds light on the impact of the choice of basis functions for approximating the true value function. Finally, the relevance of the method is illustrated on a truncated LQG problem.
Cite
ISO 690
SUTTER, Tobias, Angeliki KAMOUTSI, Peyman Mohajerin ESFAHANI, John LYGEROS, 2017. Data-driven approximate dynamic programming : A linear programming approach. IEEE 56th Annual Conference on Decision and Control (CDC). Melbourne, Australia, 12 Dec. 2017 - 15 Dec. 2017. In: 2017 IEEE 56th Annual Conference on Decision and Control (CDC). Piscataway, NJ: IEEE, 2017, pp. 5174-5179. ISBN 978-1-5090-2873-3. Available under: doi: 10.1109/CDC.2017.8264426
BibTeX
@inproceedings{Sutter2017Datad-55738,
  year={2017},
  doi={10.1109/CDC.2017.8264426},
  title={Data-driven approximate dynamic programming : A linear programming approach},
  isbn={978-1-5090-2873-3},
  publisher={IEEE},
  address={Piscataway, NJ},
  booktitle={2017 IEEE 56th Annual Conference on Decision and Control (CDC)},
  pages={5174--5179},
  author={Sutter, Tobias and Kamoutsi, Angeliki and Esfahani, Peyman Mohajerin and Lygeros, John}
}
RDF
<rdf:RDF xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:bibo="http://purl.org/ontology/bibo/" xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:void="http://rdfs.org/ns/void#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/55738"> <dc:contributor>Sutter, Tobias</dc:contributor> <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/> <dc:contributor>Esfahani, Peyman Mohajerin</dc:contributor> <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2021-12-02T12:46:29Z</dcterms:available> <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/> <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/> <dc:creator>Lygeros, John</dc:creator> <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/55738"/> <dc:creator>Kamoutsi, Angeliki</dc:creator> <dc:creator>Esfahani, Peyman Mohajerin</dc:creator> <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2021-12-02T12:46:29Z</dc:date> <dc:language>eng</dc:language> <dcterms:issued>2017</dcterms:issued> <foaf:homepage rdf:resource="http://localhost:8080/"/> <dc:creator>Sutter, Tobias</dc:creator> <dc:rights>terms-of-use</dc:rights> <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/> <dcterms:abstract xml:lang="eng">This article presents an approximation scheme for the infinite-dimensional linear programming formulation of discrete-time Markov control processes via a finite-dimensional convex program, when the dynamics are unknown and learned from data. We derive a probabilistic explicit error bound between the data-driven finite convex program and the original infinite linear program. 
We further discuss the sample complexity of the error bound which translates to the number of samples required for an a priori approximation accuracy. Our analysis sheds light on the impact of the choice of basis functions for approximating the true value function. Finally, the relevance of the method is illustrated on a truncated LQG problem.</dcterms:abstract> <dc:contributor>Lygeros, John</dc:contributor> <dc:contributor>Kamoutsi, Angeliki</dc:contributor> <dcterms:title>Data-driven approximate dynamic programming : A linear programming approach</dcterms:title> </rdf:Description> </rdf:RDF>