Publication:

Generation of SNP datasets for orangutan population genomics using improved reduced-representation sequencing and direct comparisons of SNP calling algorithms

Files

Greminger_2-qgytkvtpbw1y2.pdf (Size: 1.16 MB, Downloads: 173)

Date

2014

Authors

Greminger, Maja P.
Stölting, Kai N.
Goossens, Benoit
Arora, Natasha
Bruggmann, Rémy
Patrignani, Andrea
Nussberger, Beatrice
Sharma, Reeta
et al.

Open Access publication
Open Access Gold

Publication type
Journal article
Publication status
Published

Published in

BMC genomics. BioMed Central. 2014, 15, 16. eISSN 1471-2164. Available under: doi: 10.1186/1471-2164-15-16

Abstract

Background: High-throughput sequencing has opened up exciting possibilities in population and conservation genetics by enabling the assessment of genetic variation at genome-wide scales. One approach to reduce genome complexity, i.e. investigating only parts of the genome, is reduced-representation library (RRL) sequencing. Like similar approaches, RRL sequencing reduces ascertainment bias due to simultaneous discovery and genotyping of single-nucleotide polymorphisms (SNPs) and does not require reference genomes. Yet, generating such datasets remains challenging due to laboratory and bioinformatical issues. In the laboratory, current protocols require improvements with regards to sequencing homologous fragments to reduce the number of missing genotypes. From the bioinformatical perspective, the reliance of most studies on a single SNP caller disregards the possibility that different algorithms may produce disparate SNP datasets.

Results: We present an improved RRL (iRRL) protocol that maximizes the generation of homologous DNA sequences, thus achieving improved genotyping-by-sequencing efficiency. Our modifications facilitate generation of single-sample libraries, enabling individual genotype assignments instead of pooled-sample analysis. We sequenced ~1% of the orangutan genome with 41-fold median coverage in 31 wild-born individuals from two populations. SNPs and genotypes were called using three different algorithms. We obtained substantially different SNP datasets depending on the SNP caller. Genotype validations revealed that the Unified Genotyper of the Genome Analysis Toolkit and SAMtools performed significantly better than a caller from CLC Genomics Workbench (CLC). Of all conflicting genotype calls, CLC was only correct in 17% of the cases. Furthermore, conflicting genotypes between two algorithms showed a systematic bias in that one caller almost exclusively assigned heterozygotes, while the other one almost exclusively assigned homozygotes.

Conclusions: Our enhanced iRRL approach greatly facilitates genotyping-by-sequencing and thus direct estimates of allele frequencies. Our direct comparison of three commonly used SNP callers emphasizes the need to question the accuracy of SNP and genotype calling, as we obtained considerably different SNP datasets depending on caller algorithms, sequencing depths and filtering criteria. These differences affected scans for signatures of natural selection, but will also exert undue influences on demographic inferences. This study presents the first effort to generate a population genomic dataset for wild-born orangutans with known population provenance.
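
As a rough illustration of the caller comparison summarized in the abstract, the sketch below tallies concordant and conflicting genotype calls between two SNP callers, and labels each conflict by zygosity (for example, one caller assigning a homozygote where the other assigns a heterozygote). This is not the authors' pipeline: the function names (`parse_alleles`, `zygosity`, `compare_calls`), the dict input keyed by (sample, site), and the VCF-style genotype strings are all assumptions made for this example.

```python
from collections import Counter

# Genotype strings treated as "no call" (assumed VCF-style notation).
MISSING = {"", ".", "./.", ".|."}


def parse_alleles(gt):
    """Return a sorted allele tuple for a genotype such as '0/1', or None if missing."""
    if gt is None or gt in MISSING:
        return None
    parts = gt.replace("|", "/").split("/")
    if "." in parts:
        return None
    # Sorting makes '0/1' and '1/0' compare equal (phase is ignored).
    return tuple(sorted(parts))


def zygosity(alleles):
    """Classify a parsed genotype as homozygous ('hom') or heterozygous ('het')."""
    return "hom" if len(set(alleles)) == 1 else "het"


def compare_calls(calls_a, calls_b):
    """Tally agreement between two callers.

    Both arguments map a hypothetical (sample, site) key to a genotype string.
    Only keys present in both dicts are compared.
    """
    summary = Counter()
    for key in calls_a.keys() & calls_b.keys():
        a = parse_alleles(calls_a[key])
        b = parse_alleles(calls_b[key])
        if a is None or b is None:
            summary["missing"] += 1
        elif a == b:
            summary["concordant"] += 1
        else:
            # Label the conflict as '<caller A zygosity>_vs_<caller B zygosity>'.
            summary[f"{zygosity(a)}_vs_{zygosity(b)}"] += 1
    return summary


if __name__ == "__main__":
    # Made-up toy data, not taken from the publication.
    caller_a = {("s1", "chr1:100"): "0/1", ("s1", "chr1:200"): "0/0"}
    caller_b = {("s1", "chr1:100"): "1/0", ("s1", "chr1:200"): "0/1"}
    print(compare_calls(caller_a, caller_b))
```

A strong skew between the `hom_vs_het` and `het_vs_hom` counts would correspond to the kind of systematic bias the abstract reports; deciding which caller is actually correct still requires independent genotype validation.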

Subject (DDC)
570 Life sciences, biology

Keywords

Next-generation sequencing, Single-nucleotide polymorphisms, Reduced-representation libraries, Bioinformatics, GATK, SAMtools, CLC genomics workbench, Great apes

Cite

ISO 690
GREMINGER, Maja P., Kai N. STÖLTING, Alexander NATER, Benoit GOOSSENS, Natasha ARORA, Rémy BRUGGMANN, Andrea PATRIGNANI, Beatrice NUSSBERGER, Reeta SHARMA, Robert KRAUS, 2014. Generation of SNP datasets for orangutan population genomics using improved reduced-representation sequencing and direct comparisons of SNP calling algorithms. In: BMC genomics. BioMed Central. 2014, 15, 16. eISSN 1471-2164. Available under: doi: 10.1186/1471-2164-15-16
BibTeX
@article{Greminger2014-01-10Gener-50839,
  year={2014},
  doi={10.1186/1471-2164-15-16},
  title={Generation of SNP datasets for orangutan population genomics using improved reduced-representation sequencing and direct comparisons of SNP calling algorithms},
  volume={15},
  journal={BMC genomics},
  author={Greminger, Maja P. and Stölting, Kai N. and Nater, Alexander and Goossens, Benoit and Arora, Natasha and Bruggmann, Rémy and Patrignani, Andrea and Nussberger, Beatrice and Sharma, Reeta and Kraus, Robert},
  note={Article Number: 16}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/50839">
    <dc:contributor>Bruggmann, Rémy</dc:contributor>
    <dc:contributor>Goossens, Benoit</dc:contributor>
    <dc:creator>Greminger, Maja P.</dc:creator>
    <dc:contributor>Greminger, Maja P.</dc:contributor>
    <dcterms:abstract xml:lang="eng">Background: High-throughput sequencing has opened up exciting possibilities in population and conservation genetics by enabling the assessment of genetic variation at genome-wide scales. One approach to reduce genome complexity, i.e. investigating only parts of the genome, is reduced-representation library (RRL) sequencing. Like similar approaches, RRL sequencing reduces ascertainment bias due to simultaneous discovery and genotyping of single-nucleotide polymorphisms (SNPs) and does not require reference genomes. Yet, generating such datasets remains challenging due to laboratory and bioinformatical issues. In the laboratory, current protocols require improvements with regards to sequencing homologous fragments to reduce the number of missing genotypes. From the bioinformatical perspective, the reliance of most studies on a single SNP caller disregards the possibility that different algorithms may produce disparate SNP datasets.&lt;br /&gt;&lt;br /&gt;Results: We present an improved RRL (iRRL) protocol that maximizes the generation of homologous DNA sequences, thus achieving improved genotyping-by-sequencing efficiency. Our modifications facilitate generation of single-sample libraries, enabling individual genotype assignments instead of pooled-sample analysis. We sequenced ~1% of the orangutan genome with 41-fold median coverage in 31 wild-born individuals from two populations. SNPs and genotypes were called using three different algorithms. We obtained substantially different SNP datasets depending on the SNP caller. Genotype validations revealed that the Unified Genotyper of the Genome Analysis Toolkit and SAMtools performed significantly better than a caller from CLC Genomics Workbench (CLC). Of all conflicting genotype calls, CLC was only correct in 17% of the cases. 
Furthermore, conflicting genotypes between two algorithms showed a systematic bias in that one caller almost exclusively assigned heterozygotes, while the other one almost exclusively assigned homozygotes.&lt;br /&gt;&lt;br /&gt;Conclusions: Our enhanced iRRL approach greatly facilitates genotyping-by-sequencing and thus direct estimates of allele frequencies. Our direct comparison of three commonly used SNP callers emphasizes the need to question the accuracy of SNP and genotype calling, as we obtained considerably different SNP datasets depending on caller algorithms, sequencing depths and filtering criteria. These differences affected scans for signatures of natural selection, but will also exert undue influences on demographic inferences. This study presents the first effort to generate a population genomic dataset for wild-born orangutans with known population provenance.</dcterms:abstract>
    <dc:contributor>Arora, Natasha</dc:contributor>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/50839/3/Greminger_2-qgytkvtpbw1y2.pdf"/>
    <dc:creator>Sharma, Reeta</dc:creator>
    <dc:creator>Bruggmann, Rémy</dc:creator>
    <dc:creator>Patrignani, Andrea</dc:creator>
    <dcterms:rights rdf:resource="http://creativecommons.org/licenses/by/2.0/"/>
    <dc:language>eng</dc:language>
    <dc:rights>Attribution 2.0 Generic</dc:rights>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/50839"/>
    <dc:creator>Nater, Alexander</dc:creator>
    <dc:creator>Goossens, Benoit</dc:creator>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-09-15T09:30:53Z</dcterms:available>
    <dc:contributor>Patrignani, Andrea</dc:contributor>
    <dcterms:issued>2014-01-10</dcterms:issued>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/50839/3/Greminger_2-qgytkvtpbw1y2.pdf"/>
    <dc:contributor>Stölting, Kai N.</dc:contributor>
    <dc:creator>Nussberger, Beatrice</dc:creator>
    <dcterms:title>Generation of SNP datasets for orangutan population genomics using improved reduced-representation sequencing and direct comparisons of SNP calling algorithms</dcterms:title>
    <dc:creator>Arora, Natasha</dc:creator>
    <dc:contributor>Nater, Alexander</dc:contributor>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dc:contributor>Kraus, Robert</dc:contributor>
    <dc:contributor>Nussberger, Beatrice</dc:contributor>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/28"/>
    <dc:creator>Stölting, Kai N.</dc:creator>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/28"/>
    <dc:contributor>Sharma, Reeta</dc:contributor>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:creator>Kraus, Robert</dc:creator>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-09-15T09:30:53Z</dc:date>
  </rdf:Description>
</rdf:RDF>

Peer reviewed
Yes