Bioacoustic fundamental frequency estimation: a cross-species dataset and deep learning baseline

Best, Paul; Araya-Salas, Marcelo; Ekström, Axel G.; Freitas, Bárbara; Jensen, Frants H.; Kershenbaum, Arik; Lameira, Adriano R.; Strandburg-Peshkin, Ariana; Marxer, Ricard

doi:10.1080/09524622.2025.2500380

dc.contributor.author	Best, Paul
dc.contributor.author	Araya-Salas, Marcelo
dc.contributor.author	Ekström, Axel G.
dc.contributor.author	Freitas, Bárbara
dc.contributor.author	Jensen, Frants H.
dc.contributor.author	Kershenbaum, Arik
dc.contributor.author	Lameira, Adriano R.
dc.contributor.author	Strandburg-Peshkin, Ariana
dc.contributor.author	Marxer, Ricard
dc.date.accessioned	2025-06-10T07:25:09Z
dc.date.available	2025-06-10T07:25:09Z
dc.date.issued	2025-07-04
dc.description.abstract	The fundamental frequency (F0) is a key parameter for characterising structures in vertebrate vocalisations, for instance defining vocal repertoires and their variations at different biological scales (e.g. population dialects, individual signatures). However, the task is too laborious to perform manually, and its automation is complex. Despite significant advancements in the fields of speech and music for automatic F0 estimation, similar progress in bioacoustics has been limited. To address this gap, we compile and publish a benchmark dataset of over 250,000 calls from 14 taxa, each paired with ground truth F0 values. These vocalisations range from infra-sounds to ultra-sounds, from high to low harmonicity, and some include non-linear phenomena. Testing different algorithms on these signals, we demonstrate the potential of neural networks for F0 estimation, even for taxa not seen in training, or when trained without labels. Also, to inform on the applicability of algorithms to analyse signals, we propose spectral measurements of F0 quality which correlate well with performance. While current performance results are not satisfying for all studied taxa, they suggest that deep learning could bring a more generic and reliable bioacoustic F0 tracker, helping the community to analyse vocalisations via their F0 contours.
dc.description.version	published	deu
dc.identifier.doi	10.1080/09524622.2025.2500380
dc.identifier.ppn	1930424647
dc.identifier.uri	https://kops.uni-konstanz.de/handle/123456789/73543
dc.language.iso	eng
dc.rights	terms-of-use
dc.rights.uri	https://rightsstatements.org/page/InC/1.0/
dc.subject.ddc	570
dc.title	Bioacoustic fundamental frequency estimation: a cross-species dataset and deep learning baseline	eng
dc.type	JOURNAL_ARTICLE
dspace.entity.type	Publication
kops.citation.bibtex	@article{Best2025-07-04Bioac-73543, title={Bioacoustic fundamental frequency estimation: a cross-species dataset and deep learning baseline}, year={2025}, doi={10.1080/09524622.2025.2500380}, number={4}, volume={34}, issn={0952-4622}, journal={Bioacoustics}, pages={419--446}, author={Best, Paul and Araya-Salas, Marcelo and Ekström, Axel G. and Freitas, Bárbara and Jensen, Frants H. and Kershenbaum, Arik and Lameira, Adriano R. and Strandburg-Peshkin, Ariana and Marxer, Ricard} }
kops.citation.iso690	BEST, Paul, Marcelo ARAYA-SALAS, Axel G. EKSTRÖM, Bárbara FREITAS, Frants H. JENSEN, Arik KERSHENBAUM, Adriano R. LAMEIRA, Ariana STRANDBURG-PESHKIN, Ricard MARXER, 2025. Bioacoustic fundamental frequency estimation: a cross-species dataset and deep learning baseline. In: Bioacoustics. Taylor & Francis. 2025, 34(4), S. 419-446. ISSN 0952-4622. eISSN 2165-0586. Verfügbar unter: doi: 10.1080/09524622.2025.2500380	deu
kops.citation.iso690	BEST, Paul, Marcelo ARAYA-SALAS, Axel G. EKSTRÖM, Bárbara FREITAS, Frants H. JENSEN, Arik KERSHENBAUM, Adriano R. LAMEIRA, Ariana STRANDBURG-PESHKIN, Ricard MARXER, 2025. Bioacoustic fundamental frequency estimation: a cross-species dataset and deep learning baseline. In: Bioacoustics. Taylor & Francis. 2025, 34(4), pp. 419-446. ISSN 0952-4622. eISSN 2165-0586. Available under: doi: 10.1080/09524622.2025.2500380	eng
kops.citation.rdf	<rdf:RDF xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:bibo="http://purl.org/ontology/bibo/" xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:void="http://rdfs.org/ns/void#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/73543"> <dc:contributor>Ekström, Axel G.</dc:contributor> <dc:creator>Lameira, Adriano R.</dc:creator> <dc:creator>Jensen, Frants H.</dc:creator> <dc:contributor>Jensen, Frants H.</dc:contributor> <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/73543"/> <foaf:homepage rdf:resource="http://localhost:8080/"/> <dc:contributor>Best, Paul</dc:contributor> <dc:contributor>Kershenbaum, Arik</dc:contributor> <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/> <dcterms:abstract>The fundamental frequency (F0) is a key parameter for characterising structures in vertebrate vocalisations, for instance defining vocal repertoires and their variations at different biological scales (e.g. population dialects, individual signatures). However, the task is too laborious to perform manually, and its automation is complex. Despite significant advancements in the fields of speech and music for automatic F0 estimation, similar progress in bioacoustics has been limited. To address this gap, we compile and publish a benchmark dataset of over 250,000 calls from 14 taxa, each paired with ground truth F0 values. These vocalisations range from infra-sounds to ultra-sounds, from high to low harmonicity, and some include non-linear phenomena. Testing different algorithms on these signals, we demonstrate the potential of neural networks for F0 estimation, even for taxa not seen in training, or when trained without labels. 
Also, to inform on the applicability of algorithms to analyse signals, we propose spectral measurements of F0 quality which correlate well with performance. While current performance results are not satisfying for all studied taxa, they suggest that deep learning could bring a more generic and reliable bioacoustic F0 tracker, helping the community to analyse vocalisations via their F0 contours.</dcterms:abstract> <dc:creator>Strandburg-Peshkin, Ariana</dc:creator> <dc:rights>terms-of-use</dc:rights> <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2025-06-10T07:25:09Z</dcterms:available> <dcterms:issued>2025-07-04</dcterms:issued> <dc:creator>Araya-Salas, Marcelo</dc:creator> <dc:contributor>Araya-Salas, Marcelo</dc:contributor> <dc:contributor>Lameira, Adriano R.</dc:contributor> <dc:creator>Best, Paul</dc:creator> <dc:contributor>Marxer, Ricard</dc:contributor> <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/> <dc:creator>Ekström, Axel G.</dc:creator> <dc:creator>Marxer, Ricard</dc:creator> <dc:language>eng</dc:language> <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/73543/1/Best_2-9o5wnofv2qwa8.pdf"/> <dcterms:title>Bioacoustic fundamental frequency estimation: a cross-species dataset and deep learning baseline</dcterms:title> <dc:contributor>Freitas, Bárbara</dc:contributor> <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/73543/1/Best_2-9o5wnofv2qwa8.pdf"/> <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2025-06-10T07:25:09Z</dc:date> <dc:creator>Freitas, Bárbara</dc:creator> <dc:contributor>Strandburg-Peshkin, Ariana</dc:contributor> <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/28"/> <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/28"/> <dc:creator>Kershenbaum, Arik</dc:creator> </rdf:Description> </rdf:RDF>
kops.description.funding	{"second":"OISE1853934","first":"nsf"}
kops.description.openAccess	openaccessgreen
kops.flag.etalAuthor	true
kops.flag.isPeerReviewed	true
kops.flag.knbibliography	true
kops.identifier.nbn	urn:nbn:de:bsz:352-2-9o5wnofv2qwa8
kops.sourcefield	Bioacoustics. Taylor & Francis. 2025, <b>34</b>(4), S. 419-446. ISSN 0952-4622. eISSN 2165-0586. Verfügbar unter: doi: 10.1080/09524622.2025.2500380	deu
kops.sourcefield.plain	Bioacoustics. Taylor & Francis. 2025, 34(4), S. 419-446. ISSN 0952-4622. eISSN 2165-0586. Verfügbar unter: doi: 10.1080/09524622.2025.2500380	deu
kops.sourcefield.plain	Bioacoustics. Taylor & Francis. 2025, 34(4), pp. 419-446. ISSN 0952-4622. eISSN 2165-0586. Available under: doi: 10.1080/09524622.2025.2500380	eng
relation.isAuthorOfPublication	cdaf3e23-9cf7-44e0-829b-7012dfae32e4
relation.isAuthorOfPublication.latestForDiscovery	cdaf3e23-9cf7-44e0-829b-7012dfae32e4
relation.isDatasetOfPublication	5a954d50-3133-4b0a-9a3d-e1ac2564b619
relation.isDatasetOfPublication.latestForDiscovery	5a954d50-3133-4b0a-9a3d-e1ac2564b619
source.bibliographicInfo.fromPage	419
source.bibliographicInfo.issue	4
source.bibliographicInfo.toPage	446
source.bibliographicInfo.volume	34
source.identifier.eissn	2165-0586
source.identifier.issn	0952-4622
source.periodicalTitle	Bioacoustics
source.publisher	Taylor & Francis
temp.date.embargoEnd	2026-06-15
temp.description.funding	{"first":"Foundation for Science and Technology","second":"2020.04569.BD"}

Files

Original bundle

Name: Best_2-9o5wnofv2qwa8.pdf
Size: 2.73 MB
Format: Adobe Portable Document Format

Collections

Biology: Publications