Publication:

UniCor and UniCorP : A Novel Metric and Hierarchical Feature Selection Algorithm for Microbial Community Analysis

Files

Staab_2-53t7iq6v4lk11.pdf (Size: 653.74 KB, Downloads: 1)

Date

2025

Funding information

Deutsche Forschungsgemeinschaft (DFG): 458901010

Open Access publication
Open Access Gold
Core Facility of the University of Konstanz

Publication type
Journal article
Publication status
Published

Published in

ISME Communications. Oxford University Press (OUP). 2025, 5(1), ycaf174. eISSN 2730-6151. Available at: doi: 10.1093/ismeco/ycaf174

Abstract

The rapid advancement of technologies and methods in the life sciences has significantly increased the availability of big data, presenting new challenges for its analysis. Microbiome datasets, in particular, are characterized by extensive feature sets with defined but complex hierarchical structures that are often overlooked or underutilized. Here we introduce a novel metric, UniCor, to identify UNIquely CORrelated eNtities (UNICORNs) in quantitative datasets associated with continuous target variables. These datasets may include microbiome community structures in relation to environmental factors (e.g., temperature, pH, salinity) or biotic variables (e.g., thermal tolerance, oxidative stress). The UniCor metric combines the uniqueness of a given feature within a dataset with its correlation to a target variable of interest. To further enhance its utility, we developed a propagation algorithm (UniCorP), which exploits inherent dataset hierarchies, such as taxonomic levels in microbiome datasets, by selecting and propagating features based on their UniCor metric. Using bacterial community datasets with hierarchical taxonomic annotations and various continuous environmental variables, we demonstrate the ability of the novel metric to reduce features and increase predictive performance in cross-validated Random Forest Regressions (RFR). After propagating features with UniCorP and enriching the hierarchical levels with UNICORNs, the predictive performance consistently outperformed control trials for taxonomic aggregation, even at the least granular hierarchical level, allowing a substantial reduction of the feature space. We also compared the metric to existing methods for feature aggregation, showing that it offers stable, competitive predictive performance and feature reduction, within a simple and adaptable framework.
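
The abstract describes UniCor as combining a feature's uniqueness within the dataset with its correlation to a continuous target, but it does not give the formula. The following Python sketch is therefore only an illustration of that general idea, not the authors' definition. The function names (`uniqueness`, `target_correlation`, `unicor_like_score`), the choice of uniqueness as one minus the mean absolute Pearson correlation with all other features, and the use of a simple product to combine the two parts are all assumptions made for this example.

```python
import numpy as np


def uniqueness(X):
    """Hypothetical uniqueness: 1 - mean |Pearson r| to every other feature.

    X has shape (n_samples, n_features); assumes at least two features
    and that no feature is constant.
    """
    abs_corr = np.abs(np.corrcoef(X, rowvar=False))  # feature-by-feature matrix
    np.fill_diagonal(abs_corr, 0.0)                  # ignore self-correlation
    n_features = abs_corr.shape[0]
    return 1.0 - abs_corr.sum(axis=1) / (n_features - 1)


def target_correlation(X, y):
    """Absolute Pearson correlation of each feature with the target y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    numerator = Xc.T @ yc
    denominator = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(numerator / denominator)


def unicor_like_score(X, y):
    """Illustrative score (an assumption): uniqueness times target correlation."""
    return uniqueness(X) * target_correlation(X, y)


# Small synthetic demo: feature 0 tracks the target, features 1 and 2 are
# near-duplicates of each other and unrelated to the target.
rng = np.random.default_rng(0)
y_demo = rng.normal(size=200)
f0 = y_demo + 0.1 * rng.normal(size=200)
f1 = rng.normal(size=200)
f2 = f1 + 0.1 * rng.normal(size=200)
X_demo = np.column_stack([f0, f1, f2])

scores_demo = unicor_like_score(X_demo, y_demo)
print(scores_demo)
```

In this toy setup the feature that is both non-redundant and strongly related to the target receives the highest score, while the two mutually redundant, uninformative features score low. How the published metric weights or normalizes these two components may differ.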

Subject area (DDC)
570 Life sciences, biology

Keywords

Hierarchy, Feature Selection, Feature Aggregation, Machine Learning, Artificial Intelligence, Propagation, Microbiome, Taxonomy, Correlation, Metric

Related datasets in KOPS

Dataset
Tara Pacific 16S rRNA data analysis release
(2022) Ruscheweyh, Hans-Joachim; Salazar, Guillem; Poulain, Julie; Belser, Caroline; Clayssen, Quentin; Hume, Benjamin C. C.; Boissin, Emilie; Galand, Pierre E.; Pesant, Stéphane; Lombard, Fabien; Armstrong, Eric; Lang Yona, Naama; Klinges, Grace; McMinds, Ryan; Henry, Nicolas; Vega Thurber, Rebecca; Moulin, Clémentine; Agostini, Sylvain; Banaigs, Bernard; Boss, Emmanuel; Bowler, Chris; de Vargas, Colomban; Douville, Eric; Flores, J. Michel; Forcioli, Didier; Furla, Paola; Gilson, Eric; Reynaud, Stéphanie; Sullivan, Matthew B.; Thomas, Olivier; Troublé, Romain; Zoccola, Didier; Planes, Serge; Allemand, Denis; Voolstra, Christian R.; Wincker, Patrick; Sunagawa, Shinichi

Cite

ISO 690
STAAB, Sebastian, Kim-Isabelle MAYER, Anny CÁRDENAS, Raquel PEIXOTO, Falk SCHREIBER, Christian R. VOOLSTRA, 2025. UniCor and UniCorP : A Novel Metric and Hierarchical Feature Selection Algorithm for Microbial Community Analysis. In: ISME Communications. Oxford University Press (OUP). 2025, 5(1), ycaf174. eISSN 2730-6151. Available at: doi: 10.1093/ismeco/ycaf174
BibTeX
@article{Staab2025-01-17UniCo-74737,
  title={UniCor and UniCorP : A Novel Metric and Hierarchical Feature Selection Algorithm for Microbial Community Analysis},
  year={2025},
  doi={10.1093/ismeco/ycaf174},
  number={1},
  volume={5},
  journal={ISME Communications},
  author={Staab, Sebastian and Mayer, Kim-Isabelle and Cárdenas, Anny and Peixoto, Raquel and Schreiber, Falk and Voolstra, Christian R.},
  note={Article Number: ycaf174}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/74737">
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dcterms:title>UniCor and UniCorP : A Novel Metric and Hierarchical Feature Selection Algorithm for Microbial Community Analysis</dcterms:title>
    <dc:creator>Voolstra, Christian R.</dc:creator>
    <dc:creator>Staab, Sebastian</dc:creator>
    <dc:creator>Schreiber, Falk</dc:creator>
    <dc:contributor>Cárdenas, Anny</dc:contributor>
    <dcterms:abstract>The rapid advancement of technologies and methods in the life sciences has significantly increased the availability of big data, presenting new challenges for its analysis. Microbiome datasets, in particular, are characterized by extensive feature sets with defined but complex hierarchical structures that are often overlooked or underutilized. Here we introduce a novel metric, UniCor, to identify UNIquely CORrelated eNtities (UNICORNs) in quantitative datasets associated with continuous target variables. These datasets may include microbiome community structures in relation to environmental factors (e.g., temperature, pH, salinity) or biotic variables (e.g., thermal tolerance, oxidative stress). The UniCor metric combines the uniqueness of a given feature within a dataset with its correlation to a target variable of interest. To further enhance its utility, we developed a propagation algorithm (UniCorP), which exploits inherent dataset hierarchies, such as taxonomic levels in microbiome datasets, by selecting and propagating features based on their UniCor metric. Using bacterial community datasets with hierarchical taxonomic annotations and various continuous environmental variables, we demonstrate the ability of the novel metric to reduce features and increase predictive performance in cross-validated Random Forest Regressions (RFR). After propagating features with UniCorP and enriching the hierarchical levels with UNICORNs, the predictive performance consistently outperformed control trials for taxonomic aggregation, even at the least granular hierarchical level, allowing a substantial reduction of the feature space. We also compared the metric to existing methods for feature aggregation, showing that it offers stable, competitive predictive performance and feature reduction, within a simple and adaptable framework.</dcterms:abstract>
    <dc:contributor>Mayer, Kim-Isabelle</dc:contributor>
    <dc:creator>Cárdenas, Anny</dc:creator>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/28"/>
    <dcterms:issued>2025-01-17</dcterms:issued>
    <dc:contributor>Schreiber, Falk</dc:contributor>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dc:contributor>Peixoto, Raquel</dc:contributor>
    <dc:contributor>Staab, Sebastian</dc:contributor>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2025-10-08T11:34:10Z</dc:date>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/74737/1/Staab_2-53t7iq6v4lk11.pdf"/>
    <dc:language>eng</dc:language>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2025-10-08T11:34:10Z</dcterms:available>
    <dc:contributor>Voolstra, Christian R.</dc:contributor>
    <dc:rights>Attribution 4.0 International</dc:rights>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/74737/1/Staab_2-53t7iq6v4lk11.pdf"/>
    <dc:creator>Peixoto, Raquel</dc:creator>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/28"/>
    <dc:creator>Mayer, Kim-Isabelle</dc:creator>
    <dcterms:rights rdf:resource="http://creativecommons.org/licenses/by/4.0/"/>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/74737"/>
  </rdf:Description>
</rdf:RDF>

Corresponding authors from the University of Konstanz present
International co-authors
University bibliography
Yes
Peer-reviewed
No