Publication: UniCor and UniCorP : A Novel Metric and Hierarchical Feature Selection Algorithm for Microbial Community Analysis
Abstract
The rapid advancement of technologies and methods in the life sciences has significantly increased the availability of big data, presenting new challenges for its analysis. Microbiome datasets, in particular, are characterized by extensive feature sets with defined but complex hierarchical structures that are often overlooked or underutilized. Here we introduce a novel metric, UniCor, to identify UNIquely CORrelated eNtities (UNICORNs) in quantitative datasets associated with continuous target variables. These datasets may include microbiome community structures in relation to environmental factors (e.g., temperature, pH, salinity) or biotic variables (e.g., thermal tolerance, oxidative stress). The UniCor metric combines the uniqueness of a given feature within a dataset with its correlation to a target variable of interest. To further enhance its utility, we developed a propagation algorithm (UniCorP), which exploits inherent dataset hierarchies, such as taxonomic levels in microbiome datasets, by selecting and propagating features based on their UniCor metric. Using bacterial community datasets with hierarchical taxonomic annotations and various continuous environmental variables, we demonstrate the ability of the novel metric to reduce features and increase predictive performance in cross-validated Random Forest Regressions (RFR). After propagating features with UniCorP and enriching the hierarchical levels with UNICORNs, the predictive performance consistently outperformed control trials for taxonomic aggregation, even at the least granular hierarchical level, allowing a substantial reduction of the feature space. We also compared the metric to existing methods for feature aggregation, showing that it offers stable, competitive predictive performance and feature reduction, within a simple and adaptable framework.
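The abstract describes UniCor only in words: a score that combines a feature's uniqueness within the dataset with its correlation to a continuous target. The following Python sketch illustrates that general idea and nothing more. It is not the published UniCor definition. The function name `toy_unicor_scores`, the choice of "one minus mean absolute Pearson correlation with the other features" as uniqueness, the multiplication of the two terms, and all data values are assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def toy_unicor_scores(features, target):
    """Hypothetical score per feature: uniqueness * |correlation with target|.

    ASSUMPTION: uniqueness = 1 - mean |Pearson r| against all other features.
    This is an illustrative stand-in, not the formula from the paper.
    """
    scores = {}
    for name, values in features.items():
        others = [v for k, v in features.items() if k != name]
        if others:
            redundancy = sum(abs(pearson(values, o)) for o in others) / len(others)
        else:
            redundancy = 0.0
        scores[name] = (1.0 - redundancy) * abs(pearson(values, target))
    return scores


# Made-up abundance vectors for three hypothetical taxa across five samples.
features = {
    "taxon_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "taxon_b": [2.0, 4.0, 6.0, 8.0, 10.0],  # perfectly redundant with taxon_a
    "taxon_c": [5.0, 1.0, 4.0, 2.0, 3.0],
}
temperature = [10.0, 12.0, 14.0, 16.0, 18.0]  # made-up continuous target

scores = toy_unicor_scores(features, temperature)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

In this toy setup, `taxon_a` and `taxon_b` track the target perfectly but are redundant with each other, so their scores are discounted. A hierarchical variant in the spirit of UniCorP would additionally decide, per taxonomic level, which features to aggregate and which to keep separate; that step is not sketched here because the abstract does not specify it.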
Cite
ISO 690
STAAB, Sebastian, Kim-Isabelle MAYER, Anny CÁRDENAS, Raquel PEIXOTO, Falk SCHREIBER, Christian R. VOOLSTRA, 2025. UniCor and UniCorP : A Novel Metric and Hierarchical Feature Selection Algorithm for Microbial Community Analysis. In: ISME Communications. Oxford University Press (OUP). 2025, 5(1), ycaf174. eISSN 2730-6151. Available at: doi: 10.1093/ismeco/ycaf174
BibTeX
@article{Staab2025-01-17UniCo-74737,
title={UniCor and UniCorP : A Novel Metric and Hierarchical Feature Selection Algorithm for Microbial Community Analysis},
year={2025},
doi={10.1093/ismeco/ycaf174},
number={1},
volume={5},
journal={ISME Communications},
author={Staab, Sebastian and Mayer, Kim-Isabelle and Cárdenas, Anny and Peixoto, Raquel and Schreiber, Falk and Voolstra, Christian R.},
note={Article Number: ycaf174}
}
RDF
<rdf:RDF
xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:bibo="http://purl.org/ontology/bibo/"
xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:void="http://rdfs.org/ns/void#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#" >
<rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/74737">
<foaf:homepage rdf:resource="http://localhost:8080/"/>
<dcterms:title>UniCor and UniCorP : A Novel Metric and Hierarchical Feature Selection Algorithm for Microbial Community Analysis</dcterms:title>
<dc:creator>Voolstra, Christian R.</dc:creator>
<dc:creator>Staab, Sebastian</dc:creator>
<dc:creator>Schreiber, Falk</dc:creator>
<dc:contributor>Cárdenas, Anny</dc:contributor>
<dcterms:abstract>The rapid advancement of technologies and methods in the life sciences has significantly increased the availability of big data, presenting new challenges for its analysis. Microbiome datasets, in particular, are characterized by extensive feature sets with defined but complex hierarchical structures that are often overlooked or underutilized. Here we introduce a novel metric, UniCor, to identify UNIquely CORrelated eNtities (UNICORNs) in quantitative datasets associated with continuous target variables. These datasets may include microbiome community structures in relation to environmental factors (e.g., temperature, pH, salinity) or biotic variables (e.g., thermal tolerance, oxidative stress). The UniCor metric combines the uniqueness of a given feature within a dataset with its correlation to a target variable of interest. To further enhance its utility, we developed a propagation algorithm (UniCorP), which exploits inherent dataset hierarchies, such as taxonomic levels in microbiome datasets, by selecting and propagating features based on their UniCor metric. Using bacterial community datasets with hierarchical taxonomic annotations and various continuous environmental variables, we demonstrate the ability of the novel metric to reduce features and increase predictive performance in cross-validated Random Forest Regressions (RFR). After propagating features with UniCorP and enriching the hierarchical levels with UNICORNs, the predictive performance consistently outperformed control trials for taxonomic aggregation, even at the least granular hierarchical level, allowing a substantial reduction of the feature space. We also compared the metric to existing methods for feature aggregation, showing that it offers stable, competitive predictive performance and feature reduction, within a simple and adaptable framework.</dcterms:abstract>
<dc:contributor>Mayer, Kim-Isabelle</dc:contributor>
<dc:creator>Cárdenas, Anny</dc:creator>
<dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/28"/>
<dcterms:issued>2025-01-17</dcterms:issued>
<dc:contributor>Schreiber, Falk</dc:contributor>
<void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
<dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
<dc:contributor>Peixoto, Raquel</dc:contributor>
<dc:contributor>Staab, Sebastian</dc:contributor>
<dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2025-10-08T11:34:10Z</dc:date>
<dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/74737/1/Staab_2-53t7iq6v4lk11.pdf"/>
<dc:language>eng</dc:language>
<dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2025-10-08T11:34:10Z</dcterms:available>
<dc:contributor>Voolstra, Christian R.</dc:contributor>
<dc:rights>Attribution 4.0 International</dc:rights>
<dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
<dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/74737/1/Staab_2-53t7iq6v4lk11.pdf"/>
<dc:creator>Peixoto, Raquel</dc:creator>
<dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/28"/>
<dc:creator>Mayer, Kim-Isabelle</dc:creator>
<dcterms:rights rdf:resource="http://creativecommons.org/licenses/by/4.0/"/>
<bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/74737"/>
</rdf:Description>
</rdf:RDF>
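The RDF record above can be read with any namespace-aware XML parser. As a minimal sketch, the following Python snippet uses the standard library's `xml.etree.ElementTree` on an abridged copy of the record (only a few elements are kept to keep the example short). Because the record lists each person under both `dc:creator` and `dc:contributor`, the sketch merges the two and removes duplicates; treating them as one author list is an assumption of this example, not something the record states.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes as declared in the record.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Abridged copy of the record above; the full record has more elements.
RDF_SNIPPET = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/74737">
    <dcterms:title>UniCor and UniCorP : A Novel Metric and Hierarchical Feature Selection Algorithm for Microbial Community Analysis</dcterms:title>
    <dc:creator>Voolstra, Christian R.</dc:creator>
    <dc:creator>Staab, Sebastian</dc:creator>
    <dc:contributor>Staab, Sebastian</dc:contributor>
    <dcterms:issued>2025-01-17</dcterms:issued>
  </rdf:Description>
</rdf:RDF>"""

root = ET.fromstring(RDF_SNIPPET)
desc = root.find("rdf:Description", NS)

title = desc.findtext("dcterms:title", namespaces=NS)
issued = desc.findtext("dcterms:issued", namespaces=NS)

# Collect creators and contributors, dropping duplicates but keeping order.
names = [
    el.text
    for tag in ("dc:creator", "dc:contributor")
    for el in desc.findall(tag, NS)
]
authors = list(dict.fromkeys(names))

print(title)
print(issued)
print(authors)
```

Note that the element order in the RDF does not match the author order of the citation, so the ISO 690 or BibTeX entry remains the better source for author sequence.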