DCASE 2021 Task 5: Few-shot Bioacoustic Event Detection Development Set
| dc.contributor.author | Morfi, Veronica | |
| dc.contributor.author | Stowell, Dan | |
| dc.contributor.author | Lostanlen, Vincent | |
| dc.contributor.author | Strandburg-Peshkin, Ariana | |
| dc.contributor.author | Gill, Lisa | |
| dc.contributor.author | Pamula, Hanna | |
| dc.contributor.author | Benvent, David | |
| dc.contributor.author | Nolasco, Ines | |
| dc.contributor.author | Singh, Shubhr | |
| dc.contributor.author | Sridhar, Sripathi | |
| dc.contributor.author | Duteil, Mathieu | |
| dc.contributor.author | Farnsworth, Andrew | |
| dc.date.accessioned | 2025-07-03T10:51:56Z | |
| dc.date.available | 2025-07-03T10:51:56Z | |
| dc.date.created | 2021-02-19T10:33:15Z | |
| dc.date.issued | 2021 | |
| dc.description.abstract | General Description The development set for task 5 of DCASE 2021 "Few-shot Bioacoustic Event Detection" consists of 19 audio files acquired from different bioacoustic sources. The dataset is split into training and validation sets. Multi-class annotations are provided for the training set with positive (POS), negative (NEG) and unknown (UNK) values for each class. UNK indicates uncertainty about a class. Single-class (class of interest) annotations are provided for the validation set, with events of the class of interest marked as positive (POS) or unknown (UNK). Folder Structure Development_Set.zip |_Development_Set/ |__Training_Set/ |___BV/ |____*.wav |____*.csv |___HT/ |____*.wav |____*.csv |___JD/ |____*.wav |____*.csv |___MT/ |____*.wav |____*.csv |__Validation_Set/ |___HV/ |____*.wav |____*.csv |___PB/ |____*.wav |____*.csv Development_Set_Audio.zip has the same structure but contains only the *.wav files. Development_Set_Annotations.zip has the same structure but contains only the *.csv files. Dataset statistics Some statistics on this dataset are as follows, split between training and validation set and their sub-folders: ----------------------------------------------------- TRAINING SET ----------------------------------------------------- Number of audio recordings | 11 Total duration | 14 hours and 20 mins Total classes (excl. UNK) | 19 Total events (excl. UNK) | 4,686 ----------------------------------------------------- TRAINING SET/BV ----------------------------------------------------- Number of audio recordings | 5 Total duration | 10 hours Total classes (excl. UNK) | 11 Total events (excl. UNK) | 2,662 Sampling rate | 24,000 Hz ----------------------------------------------------- TRAINING SET/HT ----------------------------------------------------- Number of audio recordings | 3 Total duration | 3 hours Total classes (excl. UNK) | 3 Total events (excl. UNK) | 435 Sampling rate | 6,000 Hz ----------------------------------------------------- TRAINING SET/JD ----------------------------------------------------- Number of audio recordings | 1 Total duration | 10 mins Total classes (excl. UNK) | 1 Total events (excl. UNK) | 355 Sampling rate | 22,050 Hz ----------------------------------------------------- TRAINING SET/MT ----------------------------------------------------- Number of audio recordings | 2 Total duration | 1 hour and 10 mins Total classes (excl. UNK) | 4 Total events (excl. UNK) | 1,234 Sampling rate | 8,000 Hz ----------------------------------------------------- ----------------------------------------------------- VALIDATION SET ----------------------------------------------------- Number of audio recordings | 8 Total duration | 5 hours Total classes (excl. UNK) | 4 Total events (excl. UNK) | 310 ----------------------------------------------------- VALIDATION SET/HV ----------------------------------------------------- Number of audio recordings | 2 Total duration | 2 hours Total classes (excl. UNK) | 2 Total events (excl. UNK) | 50 Sampling rate | 6,000 Hz ----------------------------------------------------- VALIDATION SET/PB ----------------------------------------------------- Number of audio recordings | 6 Total duration | 3 hours Total classes (excl. UNK) | 2 Total events (excl. UNK) | 260 Sampling rate | 44,100 Hz ----------------------------------------------------- Annotation structure Each line of the annotation csv represents an event in the audio file. The column descriptions are as follows: TRAINING SET --------------------- Audiofilename, Starttime, Endtime, CLASS_1, CLASS_2, ...CLASS_N VALIDATION SET --------------------- Audiofilename, Starttime, Endtime, Q Classes DCASE2021_task5_training_set_classes.csv and DCASE2021_task5_validation_set_classes.csv provide a table mapping class codes to class names for all classes in the Development set. DCASE2021_task5_training_set_classes.csv --------------------- dataset, class_code, class_name DCASE2021_task5_validation_set_classes.csv --------------------- dataset, recording, class_code, class_name Evaluation Set The Evaluation set for the same task can be found at: https://doi.org/10.5281/zenodo.5413149 Open Access This dataset is available under a Creative Commons Attribution 4.0 International (CC BY 4.0) license. Contact info Please send any feedback or questions to: Veronica Morfi: g.v.morfi@qmul.ac.uk | |
| dc.description.version | published | deu |
| dc.identifier.doi | 10.5281/zenodo.4543503 | |
| dc.identifier.uri | https://kops.uni-konstanz.de/handle/123456789/73794 | |
| dc.language.iso | eng | |
| dc.rights | Creative Commons Attribution 4.0 International | |
| dc.rights.uri | https://creativecommons.org/licenses/by/4.0/legalcode | |
| dc.subject | bioacoustics | |
| dc.subject | few-shot learning | |
| dc.subject | dcase2021 | |
| dc.subject | audio event detection | |
| dc.subject.ddc | 570 | |
| dc.title | DCASE 2021 Task 5: Few-shot Bioacoustic Event Detection Development Set | eng |
| dspace.entity.type | Dataset | |
| kops.citation.bibtex | ||
| kops.citation.iso690 | MORFI, Veronica, Dan STOWELL, Vincent LOSTANLEN, Ariana STRANDBURG-PESHKIN, Lisa GILL, Hanna PAMULA, David BENVENT, Ines NOLASCO, Shubhr SINGH, Sripathi SRIDHAR, Mathieu DUTEIL, Andrew FARNSWORTH, 2021. DCASE 2021 Task 5: Few-shot Bioacoustic Event Detection Development Set | deu |
| kops.citation.iso690 | MORFI, Veronica, Dan STOWELL, Vincent LOSTANLEN, Ariana STRANDBURG-PESHKIN, Lisa GILL, Hanna PAMULA, David BENVENT, Ines NOLASCO, Shubhr SINGH, Sripathi SRIDHAR, Mathieu DUTEIL, Andrew FARNSWORTH, 2021. DCASE 2021 Task 5: Few-shot Bioacoustic Event Detection Development Set | eng |
| kops.citation.rdf | <rdf:RDF
xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:bibo="http://purl.org/ontology/bibo/"
xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:void="http://rdfs.org/ns/void#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#" >
<rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/73794">
<dc:creator>Nolasco, Ines</dc:creator>
<dc:contributor>Gill, Lisa</dc:contributor>
<dc:creator>Pamula, Hanna</dc:creator>
<dc:creator>Benvent, David</dc:creator>
<dc:creator>Sridhar, Sripathi</dc:creator>
<dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/71914"/>
<dc:creator>Singh, Shubhr</dc:creator>
<dc:contributor>Farnsworth, Andrew</dc:contributor>
<dc:creator>Strandburg-Peshkin, Ariana</dc:creator>
<dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/71914"/>
<dc:contributor>Benvent, David</dc:contributor>
<dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2025-07-03T10:51:56Z</dc:date>
<dc:creator>Gill, Lisa</dc:creator>
<dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2025-07-03T10:51:56Z</dcterms:available>
<dcterms:title>DCASE 2021 Task 5: Few-shot Bioacoustic Event Detection Development Set</dcterms:title>
<bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/73794"/>
<dc:contributor>Strandburg-Peshkin, Ariana</dc:contributor>
<dcterms:abstract><strong>General Description</strong> The development set for task 5 of DCASE 2021 "Few-shot Bioacoustic Event Detection" consists of 19 audio files acquired from different bioacoustic sources. The dataset is split into training and validation sets. Multi-class annotations are provided for the training set with positive (POS), negative (NEG) and unknown (UNK) values for each class. UNK indicates uncertainty about a class. Single-class (class of interest) annotations are provided for the validation set, with events of the class of interest marked as positive (POS) or unknown (UNK). <strong>Folder Structure</strong> <em>Development_Set.zip</em> |_Development_Set/ |__Training_Set/ |___BV/ |____*.wav |____*.csv |___HT/ |____*.wav |____*.csv |___JD/ |____*.wav |____*.csv |___MT/ |____*.wav |____*.csv |__Validation_Set/ |___HV/ |____*.wav |____*.csv |___PB/ |____*.wav |____*.csv <em>Development_Set_Audio.zip</em> has the same structure but contains only the *.wav files. <em>Development_Set_Annotations.zip</em> has the same structure but contains only the *.csv files. <strong>Dataset statistics</strong> Some statistics on this dataset are as follows, split between training and validation set and their sub-folders: -----------------------------------------------------<br> TRAINING SET<br> -----------------------------------------------------<br> Number of audio recordings | 11<br> Total duration | 14 hours and 20 mins<br> Total classes (excl. UNK) | 19<br> Total events (excl. UNK) | 4,686<br> -----------------------------------------------------<br> TRAINING SET/BV<br> -----------------------------------------------------<br> Number of audio recordings | 5<br> Total duration | 10 hours<br> Total classes (excl. UNK) | 11<br> Total events (excl. UNK) | 2,662<br> Sampling rate | 24,000 Hz<br> -----------------------------------------------------<br> TRAINING SET/HT<br> -----------------------------------------------------<br> Number of audio recordings | 3<br> Total duration | 3 hours<br> Total classes (excl. UNK) | 3<br> Total events (excl. UNK) | 435<br> Sampling rate | 6,000 Hz<br> -----------------------------------------------------<br> TRAINING SET/JD<br> -----------------------------------------------------<br> Number of audio recordings | 1<br> Total duration | 10 mins<br> Total classes (excl. UNK) | 1<br> Total events (excl. UNK) | 355<br> Sampling rate | 22,050 Hz<br> -----------------------------------------------------<br> TRAINING SET/MT<br> -----------------------------------------------------<br> Number of audio recordings | 2<br> Total duration | 1 hour and 10 mins<br> Total classes (excl. UNK) | 4<br> Total events (excl. UNK) | 1,234<br> Sampling rate | 8,000 Hz<br> ----------------------------------------------------- <br> -----------------------------------------------------<br> VALIDATION SET<br> -----------------------------------------------------<br> Number of audio recordings | 8<br> Total duration | 5 hours<br> Total classes (excl. UNK) | 4<br> Total events (excl. UNK) | 310<br> -----------------------------------------------------<br> VALIDATION SET/HV<br> -----------------------------------------------------<br> Number of audio recordings | 2<br> Total duration | 2 hours<br> Total classes (excl. UNK) | 2<br> Total events (excl. UNK) | 50<br> Sampling rate | 6,000 Hz<br> -----------------------------------------------------<br> VALIDATION SET/PB<br> -----------------------------------------------------<br> Number of audio recordings | 6<br> Total duration | 3 hours<br> Total classes (excl. UNK) | 2<br> Total events (excl. UNK) | 260<br> Sampling rate | 44,100 Hz<br> ----------------------------------------------------- <strong>Annotation structure</strong> Each line of the annotation csv represents an event in the audio file. The column descriptions are as follows: TRAINING SET<br> ---------------------<br> Audiofilename, Starttime, Endtime, CLASS_1, CLASS_2, ...CLASS_N VALIDATION SET<br> ---------------------<br> Audiofilename, Starttime, Endtime, Q <strong>Classes</strong> DCASE2021_task5_training_set_classes.csv and DCASE2021_task5_validation_set_classes.csv provide a table mapping class codes to class names for all classes in the Development set. DCASE2021_task5_training_set_classes.csv<br> ---------------------<br> dataset, class_code, class_name DCASE2021_task5_validation_set_classes.csv<br> ---------------------<br> dataset, recording, class_code, class_name <strong>Evaluation Set</strong> The Evaluation set for the same task can be found at: https://doi.org/10.5281/zenodo.5413149 <strong>Open Access</strong> This dataset is available under a Creative Commons Attribution 4.0 International (CC BY 4.0) license. <br> <strong>Contact info</strong> Please send any feedback or questions to:<br> Veronica Morfi: g.v.morfi@qmul.ac.uk<br></dcterms:abstract>
<dc:contributor>Pamula, Hanna</dc:contributor>
<dc:contributor>Morfi, Veronica</dc:contributor>
<dc:creator>Morfi, Veronica</dc:creator>
<dc:contributor>Nolasco, Ines</dc:contributor>
<dc:creator>Duteil, Mathieu</dc:creator>
<dcterms:rights rdf:resource="https://creativecommons.org/licenses/by/4.0/legalcode"/>
<dc:rights>Creative Commons Attribution 4.0 International</dc:rights>
<foaf:homepage rdf:resource="http://localhost:8080/"/>
<dc:creator>Stowell, Dan</dc:creator>
<dc:contributor>Sridhar, Sripathi</dc:contributor>
<dc:creator>Lostanlen, Vincent</dc:creator>
<dc:language>eng</dc:language>
<void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
<dc:contributor>Lostanlen, Vincent</dc:contributor>
<dc:contributor>Stowell, Dan</dc:contributor>
<dc:contributor>Duteil, Mathieu</dc:contributor>
<dc:contributor>Singh, Shubhr</dc:contributor>
<dc:creator>Farnsworth, Andrew</dc:creator>
<dcterms:issued>2021</dcterms:issued>
<dcterms:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2021-02-19T10:33:15Z</dcterms:created>
</rdf:Description>
</rdf:RDF> | |
| kops.datacite.repository | Zenodo | |
| kops.flag.knbibliography | true | |
| relation.isAuthorOfDataset | cdaf3e23-9cf7-44e0-829b-7012dfae32e4 | |
| relation.isAuthorOfDataset | af067ed4-c145-4338-9c0b-6f4f4033004b | |
| relation.isAuthorOfDataset.latestForDiscovery | cdaf3e23-9cf7-44e0-829b-7012dfae32e4 |
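
The annotation structure described in the abstract (one event per csv row, with POS/NEG/UNK values in the class columns) can be read with a few lines of Python. The following is only a minimal sketch under stated assumptions: it assumes the csv header uses exactly the column names listed in the abstract (`Audiofilename`, `Starttime`, `Endtime`, and a class column such as `Q`), and that times are plain decimal numbers; the record does not state the time unit. The function name `positive_events` and the sample rows are hypothetical and are not taken from the dataset.

```python
import csv
import io


def positive_events(csv_text, class_column):
    """Return (start, end) pairs for rows marked POS in class_column.

    Assumes a header row with Starttime and Endtime columns, as listed
    in the dataset description; rows marked NEG or UNK are skipped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    events = []
    for row in reader:
        if row[class_column].strip() == "POS":
            events.append((float(row["Starttime"]), float(row["Endtime"])))
    return events


# Hypothetical validation-style annotation file (not real dataset content).
sample = """Audiofilename,Starttime,Endtime,Q
example.wav,1.20,1.85,POS
example.wav,4.10,4.60,UNK
example.wav,9.00,9.75,POS
"""

events = positive_events(sample, "Q")
print(events)
```

For a training-set file, the same function could be called once per class column (for example `CLASS_1`), since each class has its own POS/NEG/UNK value in every row.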