LPLM : A Neural Language Model for Cardinality Estimation of LIKE-Queries

Files
Aytimur_2-1vneqrtgjl07c9.pdf (Size: 1.77 MB, Downloads: 9)
Date
2024
Funding information
Deutsche Forschungsgemeinschaft (DFG): CH 2464/1-1
Deutsche Forschungsgemeinschaft (DFG): GR 4497/5
Open Access publication
Open Access Hybrid
Publication type
Journal article
Publication status
Published
Published in
Proceedings of the ACM on Management of Data. Association for Computing Machinery (ACM). 2024, 2(1), 54. eISSN 2836-6573. Available under: doi: 10.1145/3639309
Abstract

Cardinality estimation is an important step in cost-based database query optimization. The accuracy of the estimates directly affects the ability of an optimizer to identify the most efficient query execution plan correctly. In this paper, we study cardinality estimation of LIKE-queries, i.e., queries that use the LIKE-operator to match a pattern with wildcards against string-valued attributes. While both traditional and machine-learning-based approaches have been proposed to tackle this problem, we argue that they all suffer from drawbacks. Most importantly, many state-of-the-art approaches are not designed for patterns that contain wildcards in-between characters. Based on past research on neural language models, we introduce the LIKE-Pattern Language Model (LPLM) that uses a new language and a novel probability distribution function to capture the semantics of general LIKE-patterns. We also propose a method to generate training data for our model. We demonstrate that our method outperforms state-of-the-art approaches in terms of precision (Q-error), while offering comparable runtime performance and memory requirements.
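To make the two central notions of the abstract concrete, the following is a minimal sketch of LIKE-pattern matching and of the Q-error metric. It is not the LPLM model or the authors' code. It assumes the common SQL convention that `%` matches any string and `_` matches exactly one character, case-sensitive matching, and no escape character. The helper names (`like_to_regex`, `true_cardinality`, `q_error`) and the sample column are hypothetical, and clamping both values to 1 in the Q-error is a frequently used convention assumed here, not something the abstract states.

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a LIKE pattern into a regex (assumption: '%' = any string, '_' = one char)."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))  # treat every other character literally
    return re.compile("".join(parts), re.DOTALL)


def true_cardinality(pattern: str, values: list[str]) -> int:
    """Count the values that match the whole LIKE pattern (the exact cardinality)."""
    rx = like_to_regex(pattern)
    return sum(1 for v in values if rx.fullmatch(v))


def q_error(estimate: float, truth: float) -> float:
    """Q-error: the factor by which an estimate is off, in either direction.

    Both values are clamped to 1 (an assumed convention) to avoid division by zero.
    """
    est = max(float(estimate), 1.0)
    tru = max(float(truth), 1.0)
    return max(est / tru, tru / est)


# Hypothetical string column; the pattern has a wildcard between characters.
names = ["database", "dataset", "datum", "delta", "metadata"]
card = true_cardinality("d%ta%", names)
print(card, q_error(6, card), q_error(1.5, card))
```

A Q-error of 1.0 means the estimate is exact. Overestimating and underestimating by the same factor yield the same Q-error, which is why the metric is symmetric in the two ratios.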

Subject (DDC)
004 Computer science
Cite
ISO 690
AYTIMUR, Mehmet, Silvan REINER, Leonard WÖRTELER, Theodoros CHONDROGIANNIS, Michael GROSSNIKLAUS, 2024. LPLM : A Neural Language Model for Cardinality Estimation of LIKE-Queries. In: Proceedings of the ACM on Management of Data. Association for Computing Machinery (ACM). 2024, 2(1), 54. eISSN 2836-6573. Available under: doi: 10.1145/3639309
BibTeX
@article{Aytimur2024Neura-69726,
  year={2024},
  doi={10.1145/3639309},
  title={LPLM : A Neural Language Model for Cardinality Estimation of LIKE-Queries},
  number={1},
  volume={2},
  journal={Proceedings of the ACM on Management of Data},
  author={Aytimur, Mehmet and Reiner, Silvan and Wörteler, Leonard and Chondrogiannis, Theodoros and Grossniklaus, Michael},
  note={Article Number: 54}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/69726">
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dc:contributor>Wörteler, Leonard</dc:contributor>
    <dc:language>eng</dc:language>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2024-04-02T08:49:13Z</dcterms:available>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dc:creator>Grossniklaus, Michael</dc:creator>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dc:creator>Reiner, Silvan</dc:creator>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/69726/1/Aytimur_2-1vneqrtgjl07c9.pdf"/>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/69726"/>
    <dcterms:issued>2024</dcterms:issued>
    <dcterms:abstract>Cardinality estimation is an important step in cost-based database query optimization. The accuracy of the estimates directly affects the ability of an optimizer to identify the most efficient query execution plan correctly. In this paper, we study cardinality estimation of LIKE-queries, i.e., queries that use the LIKE-operator to match a pattern with wildcards against string-valued attributes. While both traditional and machine-learning-based approaches have been proposed to tackle this problem, we argue that they all suffer from drawbacks. Most importantly, many state-of-the-art approaches are not designed for patterns that contain wildcards in-between characters. Based on past research on neural language models, we introduce the LIKE-Pattern Language Model (LPLM) that uses a new language and a novel probability distribution function to capture the semantics of general LIKE-patterns. We also propose a method to generate training data for our model. We demonstrate that our method outperforms state-of-the-art approaches in terms of precision (Q-error), while offering comparable runtime performance and memory requirements.</dcterms:abstract>
    <dc:contributor>Chondrogiannis, Theodoros</dc:contributor>
    <dc:rights>terms-of-use</dc:rights>
    <dc:creator>Aytimur, Mehmet</dc:creator>
    <dc:creator>Chondrogiannis, Theodoros</dc:creator>
    <dc:contributor>Grossniklaus, Michael</dc:contributor>
    <dc:creator>Wörteler, Leonard</dc:creator>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2024-04-02T08:49:13Z</dc:date>
    <dc:contributor>Aytimur, Mehmet</dc:contributor>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:contributor>Reiner, Silvan</dc:contributor>
    <dcterms:title>LPLM : A Neural Language Model for Cardinality Estimation of LIKE-Queries</dcterms:title>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/69726/1/Aytimur_2-1vneqrtgjl07c9.pdf"/>
  </rdf:Description>
</rdf:RDF>
University bibliography
Yes
Peer-reviewed
Unknown