Publication: Memory Footprint Reduction of Cloud Databases with Automated Physical Database Design
Abstract
Enterprises increasingly move their application data into the cloud by using the data management offerings of database-as-a-service providers, which specialize in hosting and managing database instances. While bound by the performance commitments stipulated in service-level agreements (SLAs) with their customers, database-as-a-service providers have an incentive to minimize internal costs to enhance profitability. Since database workloads are often skewed and DRAM is the primary driver of hardware costs, internal costs can be reduced by evicting rarely accessed (cold) data to cheaper storage layers. Hence, only frequently accessed (hot) data should remain in DRAM.

Most database management systems employ a buffer manager that moves data between secondary storage and DRAM at page granularity, so keeping only disk pages with hot data in DRAM can lead to substantial cost savings. The physical database schema, however, is usually not defined according to the data access pattern. Therefore, cold data may be stored on the same disk page as hot data. As we will elaborate, polluting the buffer pool with cold data wastes DRAM capacity, and avoiding this waste offers a largely untapped cost-reduction potential to database-as-a-service providers.

In this dissertation, we aim to utilize the buffer manager more efficiently by recommending a physical database schema in which hot disk pages contain mainly hot data. Our approach is based on the observation that, in most cases, rows of a table are accessed either frequently or rarely according to a value range of a specific column of that table. To identify hot and cold data, we introduce a statistics collector that gathers accurate workload statistics with low memory and runtime overhead. By exploiting the collected statistics, our table partitioning advisor proposes a range partitioning layout by grouping rows into partitions belonging to hot- or cold-classified value ranges. As a result, hot range partitions with a high density of hot data stay in DRAM, whereas cold range partitions are evicted to cheaper storage layers. This prevents the pollution of the buffer pool with cold data.

In addition, we periodically optimize the physical schema in light of workload changes over time. We propose a forward-looking approach by developing a workload predictor, which forecasts the approximate future workload. The predicted workload is then fed into our table partitioning advisor. Finally, we implement our ideas in a prototype of a commercial database and demonstrate their applicability using a real-world database workload. Our experimental evaluation demonstrates a buffer pool size reduction of up to 3.2x compared to related approaches while still adhering to SLAs.
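The effect described in the abstract can be illustrated with a toy sketch. This is not the dissertation's statistics collector or partitioning advisor: the fixed-width value ranges, the access threshold, the page capacity `ROWS_PER_PAGE`, and all function names are assumptions made only for illustration. The sketch counts how many pages hold at least one hot row before and after rows are grouped by hot- and cold-classified value ranges.

```python
ROWS_PER_PAGE = 4  # hypothetical page capacity, chosen only for illustration


def classify_ranges(access_counts, bucket_width, hot_threshold):
    """Sum accesses per fixed-width value range and label each range hot or cold."""
    totals = {}
    for key, count in access_counts.items():
        bucket = key // bucket_width
        totals[bucket] = totals.get(bucket, 0) + count
    return {
        (b * bucket_width, (b + 1) * bucket_width):
            "hot" if total >= hot_threshold else "cold"
        for b, total in sorted(totals.items())
    }


def label_of(key, ranges):
    """Return the label of the half-open range [lo, hi) containing key."""
    for (lo, hi), tag in ranges.items():
        if lo <= key < hi:
            return tag
    return "cold"  # keys outside every observed range are treated as cold


def range_partition(keys, ranges):
    """Lay out rows so that rows in hot ranges precede rows in cold ranges."""
    hot = sorted(k for k in keys if label_of(k, ranges) == "hot")
    cold = sorted(k for k in keys if label_of(k, ranges) != "hot")
    return hot + cold


def hot_pages(layout, hot_rows, rows_per_page=ROWS_PER_PAGE):
    """Count pages that contain at least one hot row."""
    return len({pos // rows_per_page
                for pos, key in enumerate(layout) if key in hot_rows})


# Made-up workload: 16 rows keyed 0..15; only keys 12..15 are accessed often.
access_counts = {k: (100 if k >= 12 else 1) for k in range(16)}
ranges = classify_ranges(access_counts, bucket_width=4, hot_threshold=50)
hot_rows = {k for k in access_counts if label_of(k, ranges) == "hot"}

# Physical order that ignores the access pattern: one hot row on every page.
original = [12, 0, 1, 2, 13, 3, 4, 5, 14, 6, 7, 8, 15, 9, 10, 11]
partitioned = range_partition(original, ranges)

print(hot_pages(original, hot_rows))     # 4 pages must stay in the buffer pool
print(hot_pages(partitioned, hot_rows))  # 1 page suffices after partitioning
```

In this made-up layout, every page of the original table carries one hot row and three cold rows, so the whole table would have to stay in DRAM. After grouping by value range, the hot rows fill a single page and the remaining pages can be evicted.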
Cite
ISO 690
BRENDLE, Michael, 2022. Memory Footprint Reduction of Cloud Databases with Automated Physical Database Design [Dissertation]. Konstanz: University of Konstanz
BibTeX
@phdthesis{Brendle2022Memor-66195, year={2022}, title={Memory Footprint Reduction of Cloud Databases with Automated Physical Database Design}, author={Brendle, Michael}, address={Konstanz}, school={Universität Konstanz} }
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" >
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/66195">
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2023-02-23T08:02:04Z</dc:date>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/66195/3/Brendle_2-274v8wucp6uu2.pdf"/>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/66195/3/Brendle_2-274v8wucp6uu2.pdf"/>
    <dc:creator>Brendle, Michael</dc:creator>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dc:contributor>Brendle, Michael</dc:contributor>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dcterms:abstract xml:lang="eng">Enterprises increasingly move their application data into the cloud by employing data management offerings of database-as-a-service providers, who specialize in hosting and managing database instances. While being obligated to certain performance commitments stipulated in service-level agreements (SLAs) with the customer, database-as-a-service providers are incentivized to minimize internal costs to enhance profitability. Since database workloads are often skewed and DRAM constitutes the primary driver of hardware costs, internal costs can be reduced by evicting rarely accessed (cold) data to cheaper storage layers. Hence, only frequently accessed (hot) data should remain in DRAM. As most database management systems employ a buffer manager to load and evict data at page granularity from secondary storage to DRAM and vice versa, keeping only disk pages with hot data in DRAM can lead to substantial cost savings. The physical database schema, however, is usually not defined according to the data access pattern. Therefore, cold data may be stored on the same disk page as hot data. As we will elaborate, polluting the buffer pool with cold data wastes DRAM capacities, which offers a largely untapped cost reduction potential to database-as-a-service providers. In this dissertation, we aim at utilizing the buffer manager more efficiently by recommending a physical database schema in which hot disk pages contain mainly hot data. Our approach is based on the observation that, in most cases, rows of a table are accessed either frequently or rarely according to a value range of a specific column of that table. In order to identify hot and cold data, we introduce a statistics collector that gathers accurate workload statistics with low memory and runtime overhead. By exploiting the collected statistics, our table partitioning advisor proposes a range partitioning layout by grouping rows into partitions belonging to hot- or cold-classified value ranges. As a result, hot range partitions with a high density of hot data stay in DRAM, whereas cold range partitions are evicted to cheaper storage layers. This prevents the pollution of the buffer pool with cold data. In addition, we periodically optimize the physical schema in light of workload changes over time. We propose a forward-looking approach by developing a workload predictor, which forecasts the approximate future workload. The predicted workload is then fed into our table partitioning advisor. Finally, we implement our ideas into a prototype of a commercial database and showcase its applicability by incorporating a real-world database workload. Our experimental evaluation demonstrates a buffer pool size reduction of up to 3.2x compared to related approaches while still adhering to SLAs.</dcterms:abstract>
    <dc:language>eng</dc:language>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2023-02-23T08:02:04Z</dcterms:available>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:title>Memory Footprint Reduction of Cloud Databases with Automated Physical Database Design</dcterms:title>
    <dcterms:issued>2022</dcterms:issued>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/66195"/>
    <dc:rights>terms-of-use</dc:rights>
  </rdf:Description>
</rdf:RDF>