Publikation: Widened Machine Learning with Application to Bayesian Networks
Dateien
Datum
Autor:innen
Herausgeber:innen
ISSN der Zeitschrift
Electronic ISSN
ISBN
Bibliografische Daten
Verlag
Schriftenreihe
Auflagebezeichnung
URI (zitierfähiger Link)
Internationale Patentnummer
Link zur Lizenz
Angaben zur Forschungsförderung
Projekt
Open Access-Veröffentlichung
Core Facility der Universität Konstanz
Titel in einer weiteren Sprache
Publikationstyp
Publikationsstatus
Erschienen in
Zusammenfassung
Greedy algorithms (also called “Hill Climbing”) are algorithms that are iterative in nature and choose a best solution from a range of possible candidate solutions at each iteration until a stopping criterion is met (usually when there is no improvement seen). As a result, greedy algorithms get stuck at local optima and are unable to improve beyond these optima. Many different types of methods (metaheuristics) have been introduced to overcome these shortcomings, from the mundanely simple (Random Restart) to clever attempts at mimicking nature’s ability to solve problems (Ant Colony Optimization). Widening is a new metaheuristic that uses diversity among parallel workers to improve on greedy algorithms.
Bayesian Networks are probabilistic graphical models used in a variety of applications from medical diagnosis to document classification, among many others. When learning the structure of Bayesian networks from data, the hypothesis space grows superexponentially with the number of features in the dataset, making their hypothesis space an interesting use case for understanding the effects of Widening on greedy learning algorithms from those datasets.
This dissertation describes Widening as a metaheuristic and places it in context to other metaheuristics. Widening is inherently parallel and can take advantage of the rapid growth of parallel resources available today. The key element of Widening is its use of diversity to ensure that the parallel workers explore different regions of the hypothesis space. Widening can be realized by explicitly enforcing diversity among the parallel workers at each iterative step, called here Diverse Top-k Widening. It can also be realized by methods where the hypothesis space is partitioned and each parallel worker is restricted to a specific partition, thus enabling the parallel workers to proceed without exchanging information among themselves, called here Communication-free Widening.
This dissertation covers one method of Diverse Top-k Widening, where a diversity measure, p-min-sum, is used in conjunction with a distance metric, the Frobenius Norm of the difference of the graphs’ Laplacian. It also covers two different approaches to Communication-free Widening, where in one case two different methods of using an eigenvector from the matrix representations of Bayesian networks (the Fiedler vector from the graph’s Laplacian, and the eigenvector associated with the second largest eigenvalue of the skew-symmetric adjacency matrix) are discussed. A second method of Communication-free Widening is covered using diverse orders, where an order is a strict restriction on what nodes may be parents of other nodes in the Bayesian network.
In all three cases, Widening applied to learning the structure of Bayesian networks for use as classifiers was able to outperform the reference algorithms in a majority of the datasets covered in the experiments, indicating that Widening is a metaheuristic ready for broader application to other greedy algorithms.
Zusammenfassung in einer weiteren Sprache
Fachgebiet (DDC)
Schlagwörter
Konferenz
Rezension
Zitieren
ISO 690
SAMPSON, Oliver R., 2020. Widened Machine Learning with Application to Bayesian Networks [Dissertation]. Konstanz: University of KonstanzBibTex
@phdthesis{Sampson2020Widen-49257, year={2020}, title={Widened Machine Learning with Application to Bayesian Networks}, author={Sampson, Oliver R.}, address={Konstanz}, school={Universität Konstanz} }
RDF
<rdf:RDF xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:bibo="http://purl.org/ontology/bibo/" xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:void="http://rdfs.org/ns/void#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/49257"> <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/49257/3/Sampson_2-io1j65o755xm9.pdf"/> <dc:contributor>Sampson, Oliver R.</dc:contributor> <dcterms:rights rdf:resource="http://creativecommons.org/licenses/by-sa/4.0/"/> <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/> <dcterms:abstract xml:lang="eng">Greedy algorithms (also called “Hill Climbing”) are algorithms that are iterative in nature and choose a best solution from a range of possible candidate solutions at each iteration until a stopping criterion is met (usually when there is no improvement seen). As a result, greedy algorithms get stuck at local optima and are unable to improve beyond these optima. Many different types of methods (metaheuristics) have been introduced to overcome these shortcomings, from the mundanely simple (Random Restart) to clever attempts at mimicking nature’s ability to solve problems (Ant Colony Optimization). Widening is a new metaheuristic that uses diversity among parallel workers to improve on greedy algorithms.<br /><br />Bayesian Networks are probabilistic graphical models used in a variety of applications from medical diagnosis to document classification, among many others. When learning the structure of Bayesian networks from data, the hypothesis space grows superexponentially with the number of features in the dataset, making their hypothesis space an interesting use case for understanding the effects of Widening on greedy learning algorithms from those datasets.<br /><br />This dissertation describes Widening as a metaheuristic and places it in context to other metaheuristics. Widening is inherently parallel and can take advantage of the rapid growth of parallel resources available today. The key element of Widening is its use of diversity to ensure that the parallel workers explore different regions of the hypothesis space. Widening can be realized by explicitly enforcing diversity among the parallel workers at each iterative step, called here Diverse Top-k Widening. It can also be realized by methods where the hypothesis space is partitioned and each parallel worker is restricted to a specific partition, thus enabling the parallel workers to proceed without exchanging information among themselves, called here Communication-free Widening.<br /><br />This dissertation covers one method of Diverse Top-k Widening, where a diversity measure, p-min-sum, is used in conjunction with a distance metric, the Frobenius Norm of the difference of the graphs’ Laplacian. It also covers two different approaches to Communication-free Widening, where in one case two different methods of using an eigenvector from the matrix representations of Bayesian networks (the Fiedler vector from the graph’s Laplacian, and the eigenvector associated with the second largest eigenvalue of the skew-symmetric adjacency matrix) are discussed. A second method of Communication-free Widening is covered using diverse orders, where an order is a strict restriction on what nodes may be parents of other nodes in the Bayesian network.<br /><br />In all three cases, Widening applied to learning the structure of Bayesian networks for use as classifiers was able to outperform the reference algorithms in a majority of the datasets covered in the experiments, indicating that Widening is a metaheuristic ready for broader application to other greedy algorithms.</dcterms:abstract> <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/49257/3/Sampson_2-io1j65o755xm9.pdf"/> <foaf:homepage rdf:resource="http://localhost:8080/"/> <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/> <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/49257"/> <dc:creator>Sampson, Oliver R.</dc:creator> <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-04-23T06:47:06Z</dcterms:available> <dcterms:issued>2020</dcterms:issued> <dc:language>eng</dc:language> <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/> <dcterms:title>Widened Machine Learning with Application to Bayesian Networks</dcterms:title> <dc:rights>Attribution-ShareAlike 4.0 International</dc:rights> <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-04-23T06:47:06Z</dc:date> </rdf:Description> </rdf:RDF>