Publication: LPLM : A Neural Language Model for Cardinality Estimation of LIKE-Queries
Funding information
Deutsche Forschungsgemeinschaft (DFG): GR 4497/5
Abstract
Cardinality estimation is an important step in cost-based database query optimization. The accuracy of the estimates directly affects the ability of an optimizer to identify the most efficient query execution plan correctly. In this paper, we study cardinality estimation of LIKE-queries, i.e., queries that use the LIKE-operator to match a pattern with wildcards against string-valued attributes. While both traditional and machine-learning-based approaches have been proposed to tackle this problem, we argue that they all suffer from drawbacks. Most importantly, many state-of-the-art approaches are not designed for patterns that contain wildcards in-between characters. Based on past research on neural language models, we introduce the LIKE-Pattern Language Model (LPLM) that uses a new language and a novel probability distribution function to capture the semantics of general LIKE-patterns. We also propose a method to generate training data for our model. We demonstrate that our method outperforms state-of-the-art approaches in terms of precision (Q-error), while offering comparable runtime performance and memory requirements.
Cite
ISO 690
AYTIMUR, Mehmet, Silvan REINER, Leonard WÖRTELER, Theodoros CHONDROGIANNIS, Michael GROSSNIKLAUS, 2024. LPLM : A Neural Language Model for Cardinality Estimation of LIKE-Queries. In: Proceedings of the ACM on Management of Data. Association for Computing Machinery (ACM). 2024, 2(1), 54. eISSN 2836-6573. Available at: doi: 10.1145/3639309
BibTeX
@article{Aytimur2024Neura-69726,
  year={2024},
  doi={10.1145/3639309},
  title={LPLM : A Neural Language Model for Cardinality Estimation of LIKE-Queries},
  number={1},
  volume={2},
  journal={Proceedings of the ACM on Management of Data},
  author={Aytimur, Mehmet and Reiner, Silvan and Wörteler, Leonard and Chondrogiannis, Theodoros and Grossniklaus, Michael},
  note={Article Number: 54}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#">
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/69726">
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dc:contributor>Wörteler, Leonard</dc:contributor>
    <dc:language>eng</dc:language>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2024-04-02T08:49:13Z</dcterms:available>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dc:creator>Grossniklaus, Michael</dc:creator>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dc:creator>Reiner, Silvan</dc:creator>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/69726/1/Aytimur_2-1vneqrtgjl07c9.pdf"/>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/69726"/>
    <dcterms:issued>2024</dcterms:issued>
    <dcterms:abstract>Cardinality estimation is an important step in cost-based database query optimization. The accuracy of the estimates directly affects the ability of an optimizer to identify the most efficient query execution plan correctly. In this paper, we study cardinality estimation of LIKE-queries, i.e., queries that use the LIKE-operator to match a pattern with wildcards against string-valued attributes. While both traditional and machine-learning-based approaches have been proposed to tackle this problem, we argue that they all suffer from drawbacks. Most importantly, many state-of-the-art approaches are not designed for patterns that contain wildcards in-between characters. Based on past research on neural language models, we introduce the LIKE-Pattern Language Model (LPLM) that uses a new language and a novel probability distribution function to capture the semantics of general LIKE-patterns. We also propose a method to generate training data for our model. We demonstrate that our method outperforms state-of-the-art approaches in terms of precision (Q-error), while offering comparable runtime performance and memory requirements.</dcterms:abstract>
    <dc:contributor>Chondrogiannis, Theodoros</dc:contributor>
    <dc:rights>terms-of-use</dc:rights>
    <dc:creator>Aytimur, Mehmet</dc:creator>
    <dc:creator>Chondrogiannis, Theodoros</dc:creator>
    <dc:contributor>Grossniklaus, Michael</dc:contributor>
    <dc:creator>Wörteler, Leonard</dc:creator>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2024-04-02T08:49:13Z</dc:date>
    <dc:contributor>Aytimur, Mehmet</dc:contributor>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:contributor>Reiner, Silvan</dc:contributor>
    <dcterms:title>LPLM : A Neural Language Model for Cardinality Estimation of LIKE-Queries</dcterms:title>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/69726/1/Aytimur_2-1vneqrtgjl07c9.pdf"/>
  </rdf:Description>
</rdf:RDF>