F0 estimation for bioacoustics: A benchmark/training dataset of non-human vocalisations with annotated frequency contours

dc.contributor.authorBest, Paul
dc.contributor.authorAraya-Salas, Marcelo
dc.contributor.authorEkström, Axel G.
dc.contributor.authorFreitas, Bárbara
dc.contributor.authorJensen, Frants H.
dc.contributor.authorKershenbaum, Arik
dc.contributor.authorLameira, Adriano R.
dc.contributor.authorLehmann, Kenna D. S.
dc.contributor.authorLinhart, Pavel
dc.contributor.authorLiu, Robert C.
dc.contributor.authorMadhavan, Malavika
dc.contributor.authorMarkham, Andrew
dc.contributor.authorRoch, Marie A.
dc.contributor.authorRoot-Gutteridge, Holly
dc.contributor.authorŠálek, Martin
dc.contributor.authorSmith-Vidaurre, Grace
dc.contributor.authorStrandburg-Peshkin, Ariana
dc.contributor.authorWarren, Megan R.
dc.contributor.authorWijers, Matthew
dc.contributor.authorMarxer, Ricard
dc.date.accessioned2025-07-03T11:05:17Z
dc.date.available2025-07-03T11:05:17Z
dc.date.created2025-05-08T08:19:53Z
dc.date.issued2025
dc.description.abstractThe fundamental frequency (F0) is a key parameter for characterising structures in vertebrate vocalisations, for instance when defining vocal repertoires and their variations at different biological scales (e.g., population dialects, individual signatures). However, annotating F0 contours is too laborious to perform manually, and automating it is complex. Despite significant advances in automatic F0 estimation for speech and music, progress in bioacoustics has been limited. To address this gap, we compile and publish a benchmark dataset of over 250,000 calls from 13 taxa, each paired with ground truth F0 values (each call is associated with a series of time × frequency points delineating its frequency contour). These vocalisations range from high to low SNR, from infrasound to ultrasound, from high to low harmonicity, and some include non-linear phenomena. This dataset can be used to train supervised and/or self-supervised models to estimate F0 values (similarly to CREPE or PESTO, for instance). The provided ground truth also makes it possible to evaluate and compare different algorithms on these signals (see the associated manuscript for a first benchmark and baseline). Pretrained models and scripts to train or evaluate models on this dataset are available in a separate GitHub repository.
dc.description.versionpublisheddeu
dc.identifier.doi10.5061/dryad.prr4xgxw8
dc.identifier.urihttps://kops.uni-konstanz.de/handle/123456789/73795
dc.language.isoeng
dc.relation.isreferencedby10.1080/09524622.2025.2500380
dc.rightsCreative Commons Zero v1.0 Universal
dc.rights.urihttps://creativecommons.org/publicdomain/zero/1.0/legalcode
dc.subjectFOS: Biological sciences
dc.subjectBioacoustics
dc.subjectfundamental frequency
dc.subjectnon-human vocalisations
dc.subjectcross-species
dc.subject.ddc570
dc.titleF0 estimation for bioacoustics: A benchmark/training dataset of non-human vocalisations with annotated frequency contourseng
dspace.entity.typeDataset
kops.citation.bibtex
kops.citation.iso690BEST, Paul, Marcelo ARAYA-SALAS, Axel G. EKSTRÖM, Bárbara FREITAS, Frants H. JENSEN, Arik KERSHENBAUM, Adriano R. LAMEIRA, Kenna D. S. LEHMANN, Pavel LINHART, Robert C. LIU, Malavika MADHAVAN, Andrew MARKHAM, Marie A. ROCH, Holly ROOT-GUTTERIDGE, Martin ŠÁLEK, Grace SMITH-VIDAURRE, Ariana STRANDBURG-PESHKIN, Megan R. WARREN, Matthew WIJERS, Ricard MARXER, 2025. F0 estimation for bioacoustics: A benchmark/training dataset of non-human vocalisations with annotated frequency contourseng
kops.citation.rdf
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/73795">
    <dc:contributor>Šálek, Martin</dc:contributor>
    <dc:creator>Root-Gutteridge, Holly</dc:creator>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:contributor>Freitas, Bárbara</dc:contributor>
    <dc:creator>Smith-Vidaurre, Grace</dc:creator>
    <dc:contributor>Ekström, Axel G.</dc:contributor>
    <dc:contributor>Marxer, Ricard</dc:contributor>
    <dcterms:abstract>The fundamental frequency (F0) is a key parameter for characterising structures in vertebrate vocalisations, for instance when defining vocal repertoires and their variations at different biological scales (e.g., population dialects, individual signatures). However, annotating F0 contours is too laborious to perform manually, and automating it is complex. Despite significant advances in automatic F0 estimation for speech and music, progress in bioacoustics has been limited. To address this gap, we compile and publish a benchmark dataset of over 250,000 calls from 13 taxa, each paired with ground truth F0 values (each call is associated with a series of time × frequency points delineating its frequency contour). These vocalisations range from high to low SNR, from infrasound to ultrasound, from high to low harmonicity, and some include non-linear phenomena. This dataset can be used to train supervised and/or self-supervised models to estimate F0 values (similarly to CREPE or PESTO, for instance). The provided ground truth also makes it possible to evaluate and compare different algorithms on these signals (see the associated manuscript for a first benchmark and baseline). Pretrained models and scripts to train or evaluate models on this dataset are available in a separate GitHub repository.</dcterms:abstract>
    <dc:contributor>Markham, Andrew</dc:contributor>
    <dc:creator>Kershenbaum, Arik</dc:creator>
    <dc:creator>Markham, Andrew</dc:creator>
    <dc:contributor>Linhart, Pavel</dc:contributor>
    <dc:creator>Marxer, Ricard</dc:creator>
    <dc:contributor>Root-Gutteridge, Holly</dc:contributor>
    <dc:creator>Wijers, Matthew</dc:creator>
    <dc:contributor>Best, Paul</dc:contributor>
    <dc:creator>Araya-Salas, Marcelo</dc:creator>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2025-07-03T11:05:17Z</dcterms:available>
    <dc:creator>Best, Paul</dc:creator>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2025-07-03T11:05:17Z</dc:date>
    <dc:creator>Freitas, Bárbara</dc:creator>
    <dc:contributor>Roch, Marie A.</dc:contributor>
    <dc:creator>Madhavan, Malavika</dc:creator>
    <dc:creator>Linhart, Pavel</dc:creator>
    <dc:contributor>Warren, Megan R.</dc:contributor>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/71914"/>
    <dcterms:title>F0 estimation for bioacoustics: A benchmark/training dataset of non-human vocalisations with annotated frequency contours</dcterms:title>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/71914"/>
    <dc:contributor>Strandburg-Peshkin, Ariana</dc:contributor>
    <dc:contributor>Liu, Robert C.</dc:contributor>
    <dc:contributor>Madhavan, Malavika</dc:contributor>
    <dc:creator>Warren, Megan R.</dc:creator>
    <dc:creator>Lehmann, Kenna D. S.</dc:creator>
    <dc:creator>Strandburg-Peshkin, Ariana</dc:creator>
    <dc:creator>Roch, Marie A.</dc:creator>
    <dc:contributor>Wijers, Matthew</dc:contributor>
    <dcterms:issued>2025</dcterms:issued>
    <dcterms:isReferencedBy>10.1080/09524622.2025.2500380</dcterms:isReferencedBy>
    <dc:creator>Ekström, Axel G.</dc:creator>
    <dc:contributor>Kershenbaum, Arik</dc:contributor>
    <dc:contributor>Jensen, Frants H.</dc:contributor>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/73795"/>
    <dc:contributor>Smith-Vidaurre, Grace</dc:contributor>
    <dcterms:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2025-05-08T08:19:53Z</dcterms:created>
    <dc:creator>Liu, Robert C.</dc:creator>
    <dc:contributor>Araya-Salas, Marcelo</dc:contributor>
    <dc:language>eng</dc:language>
    <dcterms:rights rdf:resource="https://creativecommons.org/publicdomain/zero/1.0/legalcode"/>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dc:creator>Šálek, Martin</dc:creator>
    <dc:contributor>Lameira, Adriano R.</dc:contributor>
    <dc:contributor>Lehmann, Kenna D. S.</dc:contributor>
    <dc:rights>Creative Commons Zero v1.0 Universal</dc:rights>
    <dc:creator>Jensen, Frants H.</dc:creator>
    <dc:creator>Lameira, Adriano R.</dc:creator>
  </rdf:Description>
</rdf:RDF>
kops.datacite.repositoryDRYAD
kops.description.funding{"second":"OISE1853934","first":"nsf"}
kops.description.funding{"second":"IOS1755089","first":"nsf"}
kops.description.funding{"second":"PCEFP1_186841","first":"snsf"}
kops.flag.knbibliographytrue
relation.isAuthorOfDatasetcdaf3e23-9cf7-44e0-829b-7012dfae32e4
relation.isAuthorOfDataset.latestForDiscoverycdaf3e23-9cf7-44e0-829b-7012dfae32e4
relation.isPublicationOfDataset11aa66f9-dca1-4faa-a338-64b4d0db8b74
relation.isPublicationOfDatasetBioacoustic fundamental frequency estimation: a cross-species dataset and deep learning baseline
relation.isPublicationOfDataset.latestForDiscovery11aa66f9-dca1-4faa-a338-64b4d0db8b74
