Publication:

Memory Footprint Reduction of Cloud Databases with Automated Physical Database Design

Files

Brendle_2-274v8wucp6uu2.pdf (Size: 4.78 MB, Downloads: 201)

Date

2022

Open access publication
Open Access Green

Publication type
Dissertation
Publication status
Published

Abstract

Enterprises increasingly move their application data into the cloud by employing data management offerings of database-as-a-service providers, who specialize in hosting and managing database instances. While bound by the performance commitments stipulated in service-level agreements (SLAs) with their customers, database-as-a-service providers are incentivized to minimize internal costs to enhance profitability. Since database workloads are often skewed and DRAM constitutes the primary driver of hardware costs, internal costs can be reduced by evicting rarely accessed (cold) data to cheaper storage layers. Hence, only frequently accessed (hot) data should remain in DRAM. As most database management systems employ a buffer manager to load and evict data at page granularity from secondary storage to DRAM and vice versa, keeping only disk pages with hot data in DRAM can lead to substantial cost savings. The physical database schema, however, is usually not defined according to the data access pattern. Therefore, cold data may be stored on the same disk page as hot data. As we will elaborate, polluting the buffer pool with cold data wastes DRAM capacity, and avoiding this waste offers a largely untapped cost-reduction potential to database-as-a-service providers.

In this dissertation, we aim to utilize the buffer manager more efficiently by recommending a physical database schema in which hot disk pages contain mainly hot data. Our approach is based on the observation that, in most cases, rows of a table are accessed either frequently or rarely according to a value range of a specific column of that table. To identify hot and cold data, we introduce a statistics collector that gathers accurate workload statistics with low memory and runtime overhead. By exploiting the collected statistics, our table partitioning advisor proposes a range partitioning layout by grouping rows into partitions belonging to hot- or cold-classified value ranges. As a result, hot range partitions with a high density of hot data stay in DRAM, whereas cold range partitions are evicted to cheaper storage layers. This prevents the pollution of the buffer pool with cold data.

In addition, we periodically optimize the physical schema in light of workload changes over time. We propose a forward-looking approach by developing a workload predictor, which forecasts the approximate future workload. The predicted workload is then fed into our table partitioning advisor. Finally, we implement our ideas in a prototype of a commercial database and demonstrate its applicability on a real-world database workload. Our experimental evaluation demonstrates a buffer pool size reduction of up to 3.2x compared to related approaches while still adhering to SLAs.
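
The idea of grouping rows into hot- or cold-classified value ranges can be illustrated with a minimal Python sketch. It is not the dissertation's algorithm: the function name `propose_range_partitions`, the fixed-width access histogram, and the single `hot_threshold` cutoff are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    low: int    # inclusive lower bound of the column value range
    high: int   # exclusive upper bound of the column value range
    hot: bool   # True if the range is classified as frequently accessed


def propose_range_partitions(access_counts, bucket_width, hot_threshold):
    """Merge adjacent, equally classified buckets into range partitions.

    access_counts maps the start value of each fixed-width bucket of the
    partitioning column to its observed number of accesses (hypothetical
    input format, standing in for collected workload statistics).
    """
    partitions = []
    for start in sorted(access_counts):
        hot = access_counts[start] >= hot_threshold
        end = start + bucket_width
        last = partitions[-1] if partitions else None
        # Extend the previous partition only if it has the same
        # classification and ends exactly where this bucket begins.
        if last is not None and last.hot == hot and last.high == start:
            last.high = end
        else:
            partitions.append(Partition(start, end, hot))
    return partitions


# Illustrative (made-up) access counts per bucket of width 100.
example_counts = {0: 2, 100: 3, 200: 950, 300: 870, 400: 1}
example_layout = propose_range_partitions(example_counts, 100, 100)
```

With these made-up counts, the sketch yields three partitions: a cold range [0, 200), a hot range [200, 400), and a cold range [400, 500). Only the pages of the hot partition would then need to stay in the buffer pool.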

Subject area (DDC)
004 Computer science

Keywords

automated physical database design, cloud databases, memory footprint reduction, statistics collector, table partitioning advisor, workload predictor

Cite

ISO 690
BRENDLE, Michael, 2022. Memory Footprint Reduction of Cloud Databases with Automated Physical Database Design [Dissertation]. Konstanz: University of Konstanz
BibTeX
@phdthesis{Brendle2022Memor-66195,
  year={2022},
  title={Memory Footprint Reduction of Cloud Databases with Automated Physical Database Design},
  author={Brendle, Michael},
  address={Konstanz},
  school={Universität Konstanz}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/66195">
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2023-02-23T08:02:04Z</dc:date>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/66195/3/Brendle_2-274v8wucp6uu2.pdf"/>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/66195/3/Brendle_2-274v8wucp6uu2.pdf"/>
    <dc:creator>Brendle, Michael</dc:creator>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dc:contributor>Brendle, Michael</dc:contributor>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dcterms:abstract xml:lang="eng">Enterprises increasingly move their application data into the cloud by employing data management offerings of database-as-a-service providers, who specialize in hosting and managing database instances. While being obligated to certain performance commitments stipulated in service-level agreements (SLAs) with the customer, database-as-a-service providers are incentivized to minimize internal costs to enhance profitability. Since database workloads are often skewed and DRAM constitutes the primary driver of hardware costs, internal costs can be reduced by evicting rarely accessed (cold) data to cheaper storage layers. Hence, only frequently accessed (hot) data should remain in DRAM. As most database management systems employ a buffer manager to load and evict data at page granularity from secondary storage to DRAM and vice versa, keeping only disk pages with hot data in DRAM can lead to substantial cost savings. The physical database schema, however, is usually not defined according to the data access pattern. Therefore, cold data may be stored on the same disk page as hot data. As we will elaborate, polluting the buffer pool with cold data wastes DRAM capacities, which offers a largely untapped cost reduction potential to database-as-a-service providers. In this dissertation, we aim at utilizing the buffer manager more efficiently by recommending a physical database schema in which hot disk pages contain mainly hot data. Our approach is based on the observation that, in most cases, rows of a table are accessed either frequently or rarely according to a value range of a specific column of that table. In order to identify hot and cold data, we introduce a statistics collector that gathers accurate workload statistics with low memory and runtime overhead. 
By exploiting the collected statistics, our table partitioning advisor proposes a range partitioning layout by grouping rows into partitions belonging to hot- or cold-classified value ranges. As a result, hot range partitions with a high density of hot data stay in DRAM, whereas cold range partitions are evicted to cheaper storage layers. This prevents the pollution of the buffer pool with cold data. In addition, we periodically optimize the physical schema in light of workload changes over time. We propose a forward-looking approach by developing a workload predictor, which forecasts the approximate future workload. The predicted workload is then fed into our table partitioning advisor. Finally, we implement our ideas into a prototype of a commercial database and showcase its applicability by incorporating a real-world database workload. Our experimental evaluation demonstrates a buffer pool size reduction of up to 3.2x compared to related approaches while still adhering to SLAs.</dcterms:abstract>
    <dc:language>eng</dc:language>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2023-02-23T08:02:04Z</dcterms:available>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:title>Memory Footprint Reduction of Cloud Databases with Automated Physical Database Design</dcterms:title>
    <dcterms:issued>2022</dcterms:issued>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/66195"/>
    <dc:rights>terms-of-use</dc:rights>
  </rdf:Description>
</rdf:RDF>

Date of dissertation examination

July 29, 2022
Thesis note
Konstanz, Univ., Diss., 2022
University bibliography
Yes