A multi modal approach to gesture recognition from audio and video data
Date
2013
Publication type
Contribution to a conference collection
Published in
Proceedings of the 15th ACM International Conference on Multimodal Interaction (ICMI '13). - New York, NY, USA : ACM Press, 2013. - pp. 461-466. - ISBN 978-1-4503-2129-7
Abstract
In this paper we describe our approach to the Multi-modal Gesture Recognition Challenge organized by ChaLearn in conjunction with the ICMI 2013 conference. The competition's task was to learn a vocabulary of 20 types of Italian gestures performed by different persons and to detect them in sequences. We develop an algorithm that finds the gesture intervals in the audio data, extracts audio features from those intervals, and trains two different models. We engineer features from the skeleton data and use the gesture intervals in the training data to train a model that we afterwards apply to the test sequences using a sliding window. We combine the models through weighted averaging, and we find that combining information from two different sources in this way boosts the models' performance significantly.
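The late-fusion step mentioned in the abstract (weighted averaging of the two models' outputs) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the three-class example, and the 0.6 audio weight are assumptions made for the sketch.

```python
import numpy as np

def fuse_predictions(p_audio, p_skeleton, w_audio=0.5):
    """Weighted average of two models' class-probability vectors.

    p_audio, p_skeleton: per-class scores from the audio and
    skeleton models for the same window; w_audio in [0, 1].
    """
    p_audio = np.asarray(p_audio, dtype=float)
    p_skeleton = np.asarray(p_skeleton, dtype=float)
    fused = w_audio * p_audio + (1.0 - w_audio) * p_skeleton
    return fused / fused.sum()  # renormalize to a distribution

# Hypothetical 3-class scores for one sliding-window position:
p_a = [0.7, 0.2, 0.1]   # audio model
p_s = [0.4, 0.5, 0.1]   # skeleton model
fused = fuse_predictions(p_a, p_s, w_audio=0.6)
print(fused.argmax())    # index of the class favored after fusion
```

Tuning the weight (e.g. on a validation split) is what lets the stronger modality dominate without discarding the weaker one.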
Subject (DDC)
004 Computer Science
Conference
ICMI '13: 15th ACM International Conference on Multimodal Interaction, Dec 9-13, 2013, Sydney, Australia
Cite This
ISO 690
BAYER, Immanuel, Thierry SILBERMANN, 2013. A multi modal approach to gesture recognition from audio and video data. ICMI '13: 15th ACM International Conference on Multimodal Interaction. Sydney, Australia, Dec 9-13, 2013. In: Proceedings of the 15th ACM International Conference on Multimodal Interaction (ICMI '13). New York, NY, USA: ACM Press, pp. 461-466. ISBN 978-1-4503-2129-7. Available under: doi: 10.1145/2522848.2532592
BibTeX
@inproceedings{Bayer2013multi-26475,
  year      = {2013},
  doi       = {10.1145/2522848.2532592},
  title     = {A multi modal approach to gesture recognition from audio and video data},
  isbn      = {978-1-4503-2129-7},
  publisher = {ACM Press},
  address   = {New York, New York, USA},
  booktitle = {Proceedings of the 15th ACM on International conference on multimodal interaction - ICMI '13},
  pages     = {461--466},
  author    = {Bayer, Immanuel and Silbermann, Thierry}
}
RDF
<rdf:RDF xmlns:dcterms="http://purl.org/dc/terms/"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:bibo="http://purl.org/ontology/bibo/"
         xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:void="http://rdfs.org/ns/void#"
         xmlns:xsd="http://www.w3.org/2001/XMLSchema#">
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/26475">
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <bibo:uri rdf:resource="http://kops.uni-konstanz.de/handle/123456789/26475"/>
    <dc:language>eng</dc:language>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dc:rights>terms-of-use</dc:rights>
    <dc:creator>Silbermann, Thierry</dc:creator>
    <dcterms:title>A multi modal approach to gesture recognition from audio and video data</dcterms:title>
    <dcterms:issued>2013</dcterms:issued>
    <dcterms:bibliographicCitation>Proceedings of the 15th ACM International conference on multimodal interaction : Sydney, NSW, Australia ; December 09 - 13, 2013 / Julien Epps ... (eds.). - New York : ACM, 2013. - S. 461-466. - ISBN 978-1-4503-2129-7</dcterms:bibliographicCitation>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2014-02-25T09:52:37Z</dcterms:available>
    <dc:contributor>Silbermann, Thierry</dc:contributor>
    <dc:contributor>Bayer, Immanuel</dc:contributor>
    <dc:creator>Bayer, Immanuel</dc:creator>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2014-02-25T09:52:37Z</dc:date>
    <dcterms:abstract xml:lang="eng">We describe in this paper our approach for the Multi-modal gesture recognition challenge organized by ChaLearn in conjunction with the ICMI 2013 conference. The competition's task was to learn a vocabulary of 20 types of Italian gestures performed from different persons and to detect them in sequences. We develop an algorithm to find the gesture intervals in the audio data, extract audio features from those intervals and train two different models. We engineer features from the skeleton data and use the gesture intervals in the training data to train a model that we afterwards apply to the test sequences using a sliding window. We combine the models through weighted averaging. We find that this way to combine information from two different sources boosts the models performance significantly.</dcterms:abstract>
  </rdf:Description>
</rdf:RDF>
Corresponding authors from the University of Konstanz
Bibliography of Konstanz
Yes