Memory Footprint Reduction of Cloud Databases with Automated Physical Database Design

Brendle, Michael

dc.contributor.author	Brendle, Michael
dc.date.accessioned	2023-02-23T08:02:04Z
dc.date.available	2023-02-23T08:02:04Z
dc.date.issued	2022	eng
dc.description.abstract	Enterprises increasingly move their application data into the cloud by employing data management offerings of database-as-a-service providers, who specialize in hosting and managing database instances. While being obligated to certain performance commitments stipulated in service-level agreements (SLAs) with the customer, database-as-a-service providers are incentivized to minimize internal costs to enhance profitability. Since database workloads are often skewed and DRAM constitutes the primary driver of hardware costs, internal costs can be reduced by evicting rarely accessed (cold) data to cheaper storage layers. Hence, only frequently accessed (hot) data should remain in DRAM. As most database management systems employ a buffer manager to load and evict data at page granularity from secondary storage to DRAM and vice versa, keeping only disk pages with hot data in DRAM can lead to substantial cost savings. The physical database schema, however, is usually not defined according to the data access pattern. Therefore, cold data may be stored on the same disk page as hot data. As we will elaborate, polluting the buffer pool with cold data wastes DRAM capacities, which offers a largely untapped cost reduction potential to database-as-a-service providers. In this dissertation, we aim at utilizing the buffer manager more efficiently by recommending a physical database schema in which hot disk pages contain mainly hot data. Our approach is based on the observation that, in most cases, rows of a table are accessed either frequently or rarely according to a value range of a specific column of that table. In order to identify hot and cold data, we introduce a statistics collector that gathers accurate workload statistics with low memory and runtime overhead. By exploiting the collected statistics, our table partitioning advisor proposes a range partitioning layout by grouping rows into partitions belonging to hot- or cold-classified value ranges. As a result, hot range partitions with a high density of hot data stay in DRAM, whereas cold range partitions are evicted to cheaper storage layers. This prevents the pollution of the buffer pool with cold data. In addition, we periodically optimize the physical schema in light of workload changes over time. We propose a forward-looking approach by developing a workload predictor, which forecasts the approximate future workload. The predicted workload is then fed into our table partitioning advisor. Finally, we implement our ideas into a prototype of a commercial database and showcase its applicability by incorporating a real-world database workload. Our experimental evaluation demonstrates a buffer pool size reduction of up to 3.2x compared to related approaches while still adhering to SLAs.	eng
dc.description.version	published	de
dc.identifier.ppn	1837351279
dc.identifier.uri	https://kops.uni-konstanz.de/handle/123456789/66195
dc.language.iso	eng	eng
dc.rights	terms-of-use
dc.rights.uri	https://rightsstatements.org/page/InC/1.0/
dc.subject	automated physical database design, cloud databases, memory footprint reduction, statistics collector, table partitioning advisor, workload predictor	eng
dc.subject.ddc	004
dc.title	Memory Footprint Reduction of Cloud Databases with Automated Physical Database Design	eng
dc.type	DOCTORAL_THESIS	de
dspace.entity.type	Publication
kops.citation.bibtex	@phdthesis{Brendle2022Memor-66195, year={2022}, title={Memory Footprint Reduction of Cloud Databases with Automated Physical Database Design}, author={Brendle, Michael}, address={Konstanz}, school={Universität Konstanz} }
kops.citation.iso690	BRENDLE, Michael, 2022. Memory Footprint Reduction of Cloud Databases with Automated Physical Database Design [Dissertation]. Konstanz: University of Konstanz	deu
kops.citation.iso690	BRENDLE, Michael, 2022. Memory Footprint Reduction of Cloud Databases with Automated Physical Database Design [Dissertation]. Konstanz: University of Konstanz	eng
kops.citation.rdf	<rdf:RDF xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:bibo="http://purl.org/ontology/bibo/" xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:void="http://rdfs.org/ns/void#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/66195"> <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2023-02-23T08:02:04Z</dc:date> <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/66195/3/Brendle_2-274v8wucp6uu2.pdf"/> <foaf:homepage rdf:resource="http://localhost:8080/"/> <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/66195/3/Brendle_2-274v8wucp6uu2.pdf"/> <dc:creator>Brendle, Michael</dc:creator> <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/> <dc:contributor>Brendle, Michael</dc:contributor> <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/> <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/> <dcterms:abstract xml:lang="eng">Enterprises increasingly move their application data into the cloud by employing data management offerings of database-as-a-service providers, who specialize in hosting and managing database instances. While being obligated to certain performance commitments stipulated in service-level agreements (SLAs) with the customer, database-as-a-service providers are incentivized to minimize internal costs to enhance profitability. Since database workloads are often skewed and DRAM constitutes the primary driver of hardware costs, internal costs can be reduced by evicting rarely accessed (cold) data to cheaper storage layers. Hence, only frequently accessed (hot) data should remain in DRAM. 
As most database management systems employ a buffer manager to load and evict data at page granularity from secondary storage to DRAM and vice versa, keeping only disk pages with hot data in DRAM can lead to substantial cost savings. The physical database schema, however, is usually not defined according to the data access pattern. Therefore, cold data may be stored on the same disk page as hot data. As we will elaborate, polluting the buffer pool with cold data wastes DRAM capacities, which offers a largely untapped cost reduction potential to database-as-a-service providers. In this dissertation, we aim at utilizing the buffer manager more efficiently by recommending a physical database schema in which hot disk pages contain mainly hot data. Our approach is based on the observation that, in most cases, rows of a table are accessed either frequently or rarely according to a value range of a specific column of that table. In order to identify hot and cold data, we introduce a statistics collector that gathers accurate workload statistics with low memory and runtime overhead. By exploiting the collected statistics, our table partitioning advisor proposes a range partitioning layout by grouping rows into partitions belonging to hot- or cold-classified value ranges. As a result, hot range partitions with a high density of hot data stay in DRAM, whereas cold range partitions are evicted to cheaper storage layers. This prevents the pollution of the buffer pool with cold data. In addition, we periodically optimize the physical schema in light of workload changes over time. We propose a forward-looking approach by developing a workload predictor, which forecasts the approximate future workload. The predicted workload is then fed into our table partitioning advisor. Finally, we implement our ideas into a prototype of a commercial database and showcase its applicability by incorporating a real-world database workload. 
Our experimental evaluation demonstrates a buffer pool size reduction of up to 3.2x compared to related approaches while still adhering to SLAs.</dcterms:abstract> <dc:language>eng</dc:language> <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2023-02-23T08:02:04Z</dcterms:available> <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/> <dcterms:title>Memory Footprint Reduction of Cloud Databases with Automated Physical Database Design</dcterms:title> <dcterms:issued>2022</dcterms:issued> <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/66195"/> <dc:rights>terms-of-use</dc:rights> </rdf:Description> </rdf:RDF>
kops.date.examination	2022-07-29	eng
kops.date.yearDegreeGranted	2022
kops.description.openAccess	openaccessgreen
kops.flag.knbibliography	true
kops.identifier.nbn	urn:nbn:de:bsz:352-2-274v8wucp6uu2
relation.isAuthorOfPublication	28fbf58e-0d8b-47be-8c52-376b2d4d918c
relation.isAuthorOfPublication.latestForDiscovery	28fbf58e-0d8b-47be-8c52-376b2d4d918c

Files

Original bundle

Showing 1 - 1 of 1

Name: Brendle_2-274v8wucp6uu2.pdf
Size: 4.78 MB
Format: Adobe Portable Document Format
Downloads: 382

License bundle

Showing 1 - 1 of 1

Name: license.txt
Size: 3.96 KB
Format: Item-specific license agreed upon to submission
Downloads: 0

Collections

Informatik und Informationswissenschaft: Publikationen