Publication:

Coracle : a Machine Learning Framework to Identify Bacteria Associated with Continuous Variables

Files

staab_2-1qjmi1jk397fx9.PDF (Size: 621.39 KB, Downloads: 6)

Date

2024

Authors

Staab, Sebastian; Cárdenas, Anny; Peixoto, Raquel S.; Schreiber, Falk; Voolstra, Christian R.

Open Access publication
Open Access Gold

Publication type
Journal article
Publication status
Published

Published in

Bioinformatics. Oxford University Press (OUP). 2024, 40(1), btad749. ISSN 1367-4803. eISSN 1367-4811. Available under: doi: 10.1093/bioinformatics/btad749

Abstract

We present Coracle, an Artificial Intelligence (AI) framework that can identify associations between bacterial communities and continuous variables. Coracle uses an ensemble approach of prominent feature selection methods and machine learning (ML) models to identify features, i.e., bacteria, associated with a continuous variable, e.g. host thermal tolerance. The results are aggregated into a score that incorporates the performances of the different ML models and the respective feature importance, while also considering the robustness of feature selection. Additionally, regression coefficients provide first insights into the direction of the association. We show the utility of Coracle by analyzing associations between bacterial composition data (i.e., 16S rRNA Amplicon Sequence Variants, ASVs) and coral thermal tolerance (i.e., standardized short-term heat stress-derived diagnostics). This analysis identified high-scoring bacterial taxa that were previously found associated with coral thermal tolerance. Coracle scales with feature number and performs well with hundreds to thousands of features, corresponding to the typical size of current datasets. Coracle performs best if run at a higher taxonomic level first (e.g., order or family) to identify groups of interest that can subsequently be run at the ASV level.
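The abstract describes a per-feature score that combines model performance, feature importance, and the robustness of feature selection, but it does not give the formula. The following is a minimal sketch of one way such an aggregation could look, not Coracle's actual implementation. The function name `aggregate_scores`, the input layout, and the multiplicative weighting are all assumptions made for illustration, and the numbers are made-up toy values.

```python
from statistics import mean


def aggregate_scores(model_results, n_features):
    """Combine per-model results into one score per feature (hypothetical scheme).

    Each entry of model_results is a dict with:
      "performance":   float in [0, 1], e.g. a clipped R^2 of the model
      "importance":    non-negative importance per feature
      "selected_runs": one set of selected feature indices per resampling run
    """
    scores = [0.0] * n_features
    for res in model_results:
        # Normalise importances so models on different scales are comparable.
        total = sum(res["importance"]) or 1.0
        runs = res["selected_runs"]
        for j in range(n_features):
            rel_importance = res["importance"][j] / total
            # Robustness: fraction of runs in which feature j was selected.
            robustness = mean(1.0 if j in run else 0.0 for run in runs)
            scores[j] += res["performance"] * rel_importance * robustness
    # Average over models so the score stays in [0, 1].
    return [s / len(model_results) for s in scores]


# Toy input: two models, three features (e.g. three bacterial taxa).
results = [
    {"performance": 0.8, "importance": [3.0, 1.0, 0.0],
     "selected_runs": [{0, 1}, {0}, {0, 1}, {0}]},
    {"performance": 0.5, "importance": [1.0, 1.0, 2.0],
     "selected_runs": [{0, 2}, {2}, {0, 2}, {1, 2}]},
]
scores = aggregate_scores(results, n_features=3)
ranking = sorted(range(3), key=lambda j: scores[j], reverse=True)
print(scores, ranking)
```

In this toy setup a feature ranks highly only if it is important in well-performing models and is selected consistently across runs. The direction of an association, which the abstract attributes to regression coefficients, is not modelled here.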

Subject (DDC)
570 Life sciences, biology

Cite

ISO 690
STAAB, Sebastian, Anny CÁRDENAS, Raquel S. PEIXOTO, Falk SCHREIBER, Christian R. VOOLSTRA, 2024. Coracle : a Machine Learning Framework to Identify Bacteria Associated with Continuous Variables. In: Bioinformatics. Oxford University Press (OUP). 2024, 40(1), btad749. ISSN 1367-4803. eISSN 1367-4811. Available under: doi: 10.1093/bioinformatics/btad749
BibTeX
@article{Staab2024Corac-68931,
  year={2024},
  doi={10.1093/bioinformatics/btad749},
  title={Coracle : a Machine Learning Framework to Identify Bacteria Associated with Continuous Variables},
  number={1},
  volume={40},
  issn={1367-4803},
  journal={Bioinformatics},
  author={Staab, Sebastian and Cárdenas, Anny and Peixoto, Raquel S. and Schreiber, Falk and Voolstra, Christian R.},
  note={Article Number: btad749}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/68931">
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:abstract>We present Coracle, an Artificial Intelligence (AI) framework that can identify associations between bacterial communities and continuous variables. Coracle uses an ensemble approach of prominent feature selection methods and machine learning (ML) models to identify features, i.e., bacteria, associated with a continuous variable, e.g. host thermal tolerance. The results are aggregated into a score that incorporates the performances of the different ML models and the respective feature importance, while also considering the robustness of feature selection. Additionally, regression coefficients provide first insights into the direction of the association. We show the utility of Coracle by analyzing associations between bacterial composition data (i.e., 16S rRNA Amplicon Sequence Variants, ASVs) and coral thermal tolerance (i.e., standardized short-term heat stress-derived diagnostics). This analysis identified high-scoring bacterial taxa that were previously found associated with coral thermal tolerance. Coracle scales with feature number and performs well with hundreds to thousands of features, corresponding to the typical size of current datasets. Coracle performs best if run at a higher taxonomic level first (e.g., order or family) to identify groups of interest that can subsequently be run at the ASV level.</dcterms:abstract>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/68931"/>
    <dcterms:issued>2024</dcterms:issued>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:contributor>Voolstra, Christian R.</dc:contributor>
    <dcterms:title>Coracle : a Machine Learning Framework to Identify Bacteria Associated with Continuous Variables</dcterms:title>
    <dc:creator>Staab, Sebastian</dc:creator>
    <dc:creator>Peixoto, Raquel S.</dc:creator>
    <dc:contributor>Staab, Sebastian</dc:contributor>
    <dc:rights>Attribution 4.0 International</dc:rights>
    <dc:contributor>Peixoto, Raquel S.</dc:contributor>
    <dcterms:rights rdf:resource="http://creativecommons.org/licenses/by/4.0/"/>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/28"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2024-01-04T13:17:46Z</dcterms:available>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2024-01-04T13:17:46Z</dc:date>
    <dc:contributor>Schreiber, Falk</dc:contributor>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dc:creator>Cárdenas, Anny</dc:creator>
    <dc:language>eng</dc:language>
    <dc:creator>Voolstra, Christian R.</dc:creator>
    <dc:creator>Schreiber, Falk</dc:creator>
    <dc:contributor>Cárdenas, Anny</dc:contributor>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/28"/>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/68931/1/staab_2-1qjmi1jk397fx9.PDF"/>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/68931/1/staab_2-1qjmi1jk397fx9.PDF"/>
  </rdf:Description>
</rdf:RDF>

University bibliography
Yes
Peer-reviewed
Yes