F0 estimation for bioacoustics: A benchmark/training dataset of non-human vocalisations with annotated frequency contours

Best, Paul; Araya-Salas, Marcelo; Ekström, Axel G.; Freitas, Bárbara; Jensen, Frants H.; Kershenbaum, Arik; Lameira, Adriano R.; Lehmann, Kenna D. S.; Linhart, Pavel; Liu, Robert C.; Madhavan, Malavika; Markham, Andrew; Roch, Marie A.; Root-Gutteridge, Holly; Šálek, Martin; Smith-Vidaurre, Grace; Strandburg-Peshkin, Ariana; Warren, Megan R.; Wijers, Matthew; Marxer, Ricard

doi:10.5061/dryad.prr4xgxw8

Dataset:
F0 estimation for bioacoustics: A benchmark/training dataset of non-human vocalisations with annotated frequency contours

Date of first publication

2025

Authors

Best, Paul

Araya-Salas, Marcelo

Ekström, Axel G.

Freitas, Bárbara

Jensen, Frants H.

Kershenbaum, Arik

Lameira, Adriano R.

Lehmann, Kenna D. S.

Linhart, Pavel

Liu, Robert C.

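Madhavan, Malavika

Markham, Andrew

Roch, Marie A.

Root-Gutteridge, Holly

Šálek, Martin

Smith-Vidaurre, Grace

Strandburg-Peshkin, Ariana

Warren, Megan R.

Wijers, Matthew

Marxer, Ricard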

Repository of first publication

DRYAD

DOI (link to the data)

https://doi.org/10.5061/dryad.prr4xgxw8

License link

Creative Commons Zero v1.0 Universal

Funding information

U.S. National Science Foundation (NSF): OISE1853934
U.S. National Science Foundation (NSF): IOS1755089
Swiss National Science Foundation: PCEFP1_186841

Collections

Biology: Datasets

Publication status

Published

Abstract

The fundamental frequency (F0) is a key parameter for characterising structures in vertebrate vocalisations, for instance when defining vocal repertoires and their variations at different biological scales (e.g., population dialects, individual signatures). However, measuring F0 manually is too laborious, and automating the task is complex. Despite significant advances in automatic F0 estimation for speech and music, similar progress in bioacoustics has been limited. To address this gap, we compile and publish a benchmark dataset of over 250,000 calls from 13 taxa, each paired with ground-truth F0 values (each call is associated with a series of time-frequency points delimiting its frequency contour). These vocalisations range from high to low SNR, from infrasound to ultrasound, and from high to low harmonicity, and some include non-linear phenomena. This dataset makes it possible to train supervised and/or self-supervised models to estimate F0 values (similarly to CREPE or PESTO, for instance). The provided ground truth also makes it possible to evaluate and compare different algorithms on these signals (see the associated manuscript for a first benchmark and baseline). Pretrained models and scripts to train or evaluate models on this dataset are available in a separate GitHub repository.

Subject (DDC)

570 Life sciences, biology

Keywords

FOS: Biological sciences, Bioacoustics, fundamental frequency, non-human vocalisations, cross-species

Related publications in KOPS

Publication

Journal article

Bioacoustic fundamental frequency estimation: a cross-species dataset and deep learning baseline

(2025) Best, Paul; Araya-Salas, Marcelo; Ekström, Axel G.; Freitas, Bárbara; Jensen, Frants H.; Kershenbaum, Arik; Lameira, Adriano R.; Strandburg-Peshkin, Ariana; Marxer, Ricard et al.

DOI: 10.1080/09524622.2025.2500380

Published in: Bioacoustics. Taylor & Francis. 2025, 34(4), pp. 419-446. ISSN 0952-4622. eISSN 2165-0586. Available at: doi: 10.1080/09524622.2025.2500380

Link to related publication

https://doi.org/10.1080/09524622.2025.2500380

Cite

ISO 690

BEST, Paul, Marcelo ARAYA-SALAS, Axel G. EKSTRÖM, Bárbara FREITAS, Frants H. JENSEN, Arik KERSHENBAUM, Adriano R. LAMEIRA, Kenna D. S. LEHMANN, Pavel LINHART, Robert C. LIU, Malavika MADHAVAN, Andrew MARKHAM, Marie A. ROCH, Holly ROOT-GUTTERIDGE, Martin ŠÁLEK, Grace SMITH-VIDAURRE, Ariana STRANDBURG-PESHKIN, Megan R. WARREN, Matthew WIJERS, Ricard MARXER, 2025. F0 estimation for bioacoustics: A benchmark/training dataset of non-human vocalisations with annotated frequency contours

RDF

<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/73795">
    <dc:contributor>Šálek, Martin</dc:contributor>
    <dc:creator>Root-Gutteridge, Holly</dc:creator>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:contributor>Freitas, Bárbara</dc:contributor>
    <dc:creator>Smith-Vidaurre, Grace</dc:creator>
    <dc:contributor>Ekström, Axel G.</dc:contributor>
    <dc:contributor>Marxer, Ricard</dc:contributor>
    <dcterms:abstract>The fundamental frequency (F0) is a key parameter for characterising structures in vertebrate vocalisations, for instance when defining vocal repertoires and their variations at different biological scales (e.g., population dialects, individual signatures). However, measuring F0 manually is too laborious, and automating the task is complex. Despite significant advances in automatic F0 estimation for speech and music, similar progress in bioacoustics has been limited. To address this gap, we compile and publish a benchmark dataset of over 250,000 calls from 13 taxa, each paired with ground-truth F0 values (each call is associated with a series of time-frequency points delimiting its frequency contour). These vocalisations range from high to low SNR, from infrasound to ultrasound, and from high to low harmonicity, and some include non-linear phenomena. This dataset makes it possible to train supervised and/or self-supervised models to estimate F0 values (similarly to CREPE or PESTO, for instance). The provided ground truth also makes it possible to evaluate and compare different algorithms on these signals (see the associated manuscript for a first benchmark and baseline). Pretrained models and scripts to train or evaluate models on this dataset are available in a separate GitHub repository.</dcterms:abstract>
    <dc:contributor>Markham, Andrew</dc:contributor>
    <dc:creator>Kershenbaum, Arik</dc:creator>
    <dc:creator>Markham, Andrew</dc:creator>
    <dc:contributor>Linhart, Pavel</dc:contributor>
    <dc:creator>Marxer, Ricard</dc:creator>
    <dc:contributor>Root-Gutteridge, Holly</dc:contributor>
    <dc:creator>Wijers, Matthew</dc:creator>
    <dc:contributor>Best, Paul</dc:contributor>
    <dc:creator>Araya-Salas, Marcelo</dc:creator>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2025-07-03T11:05:17Z</dcterms:available>
    <dc:creator>Best, Paul</dc:creator>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2025-07-03T11:05:17Z</dc:date>
    <dc:creator>Freitas, Bárbara</dc:creator>
    <dc:contributor>Roch, Marie A.</dc:contributor>
    <dc:creator>Madhavan, Malavika</dc:creator>
    <dc:creator>Linhart, Pavel</dc:creator>
    <dc:contributor>Warren, Megan R.</dc:contributor>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/71914"/>
    <dcterms:title>F0 estimation for bioacoustics: A benchmark/training dataset of non-human vocalisations with annotated frequency contours</dcterms:title>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/71914"/>
    <dc:contributor>Strandburg-Peshkin, Ariana</dc:contributor>
    <dc:contributor>Liu, Robert C.</dc:contributor>
    <dc:contributor>Madhavan, Malavika</dc:contributor>
    <dc:creator>Warren, Megan R.</dc:creator>
    <dc:creator>Lehmann, Kenna D. S.</dc:creator>
    <dc:creator>Strandburg-Peshkin, Ariana</dc:creator>
    <dc:creator>Roch, Marie A.</dc:creator>
    <dc:contributor>Wijers, Matthew</dc:contributor>
    <dcterms:issued>2025</dcterms:issued>
    <dcterms:isReferencedBy>10.1080/09524622.2025.2500380</dcterms:isReferencedBy>
    <dc:creator>Ekström, Axel G.</dc:creator>
    <dc:contributor>Kershenbaum, Arik</dc:contributor>
    <dc:contributor>Jensen, Frants H.</dc:contributor>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/73795"/>
    <dc:contributor>Smith-Vidaurre, Grace</dc:contributor>
    <dcterms:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2025-05-08T08:19:53Z</dcterms:created>
    <dc:creator>Liu, Robert C.</dc:creator>
    <dc:contributor>Araya-Salas, Marcelo</dc:contributor>
    <dc:language>eng</dc:language>
    <dcterms:rights rdf:resource="https://creativecommons.org/publicdomain/zero/1.0/legalcode"/>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dc:creator>Šálek, Martin</dc:creator>
    <dc:contributor>Lameira, Adriano R.</dc:contributor>
    <dc:contributor>Lehmann, Kenna D. S.</dc:contributor>
    <dc:rights>Creative Commons Zero v1.0 Universal</dc:rights>
    <dc:creator>Jensen, Frants H.</dc:creator>
    <dc:creator>Lameira, Adriano R.</dc:creator>
  </rdf:Description>
</rdf:RDF>

University bibliography

Yes
