Widened Machine Learning with Application to Bayesian Networks

Cite This


SAMPSON, Oliver R., 2020. Widened Machine Learning with Application to Bayesian Networks [Dissertation]. Konstanz: University of Konstanz

@phdthesis{Sampson2020Widen-49257,
  title   = {Widened Machine Learning with Application to Bayesian Networks},
  year    = {2020},
  author  = {Sampson, Oliver R.},
  address = {Konstanz},
  school  = {Universität Konstanz}
}

Abstract

Greedy algorithms (also called "hill climbing") are iterative algorithms that choose the best solution from a set of candidate solutions at each iteration until a stopping criterion is met, usually when no further improvement is seen. As a result, greedy algorithms get stuck at local optima and cannot improve beyond them. Many metaheuristics have been introduced to overcome this shortcoming, from the mundanely simple (Random Restart) to clever attempts at mimicking nature's ability to solve problems (Ant Colony Optimization). Widening is a new metaheuristic that uses diversity among parallel workers to improve on greedy algorithms.

Bayesian networks are probabilistic graphical models used in a variety of applications, from medical diagnosis to document classification, among many others. When learning the structure of a Bayesian network from data, the hypothesis space grows superexponentially with the number of features in the dataset, making structure learning an interesting use case for understanding the effects of Widening on greedy learning algorithms.

This dissertation describes Widening as a metaheuristic and places it in context among other metaheuristics. Widening is inherently parallel and can take advantage of the rapid growth of parallel resources available today. Its key element is the use of diversity to ensure that the parallel workers explore different regions of the hypothesis space. Widening can be realized by explicitly enforcing diversity among the parallel workers at each iterative step, called here Diverse Top-k Widening. It can also be realized by partitioning the hypothesis space and restricting each parallel worker to a specific partition, which lets the workers proceed without exchanging information among themselves, called here Communication-free Widening.

This dissertation covers one method of Diverse Top-k Widening, in which a diversity measure, p-min-sum, is used in conjunction with a distance metric, the Frobenius norm of the difference of the graphs' Laplacians. It also covers two approaches to Communication-free Widening. The first partitions the hypothesis space using an eigenvector of a matrix representation of the Bayesian network: either the Fiedler vector of the graph's Laplacian or the eigenvector associated with the second-largest eigenvalue of the skew-symmetric adjacency matrix. The second uses diverse orders, where an order is a strict restriction on which nodes may be parents of other nodes in the Bayesian network.

In all three cases, Widening applied to learning the structure of Bayesian networks for use as classifiers outperformed the reference algorithms on a majority of the datasets covered in the experiments, indicating that Widening is a metaheuristic ready for broader application to other greedy algorithms.
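The greedy template the abstract describes can be sketched in a few lines. This is a generic illustration, not code from the dissertation; `neighbors` and `score` are hypothetical problem-specific callables.

```python
# Minimal greedy hill-climbing template: at each step, move to the best
# neighboring candidate; stop when no neighbor improves the score.
def hill_climb(start, neighbors, score):
    current = start
    current_score = score(current)
    while True:
        candidates = neighbors(current)
        if not candidates:
            return current
        best = max(candidates, key=score)
        if score(best) <= current_score:
            return current  # local optimum: greedy search is stuck here
        current, current_score = best, score(best)
```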
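The superexponential growth of the hypothesis space can be made concrete: the number of labeled DAGs on n nodes is given by Robinson's recurrence, sketched below.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def num_dags(n: int) -> int:
    """Number of labeled DAGs on n nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    return sum((-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * num_dags(n - k)
               for k in range(1, n + 1))

# Superexponential growth: 1, 3, 25, 543, 29281, 3781503, ...
print([num_dags(n) for n in range(1, 7)])
```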
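The structural distance used by Diverse Top-k Widening can be sketched directly from the abstract: the Frobenius norm of the difference of the graphs' Laplacians. The subset-selection rule below is an illustrative greedy stand-in for a p-min-sum-style diverse selection, not the author's exact procedure.

```python
import numpy as np

def laplacian(adj: np.ndarray) -> np.ndarray:
    """Graph Laplacian L = D - A of an adjacency matrix."""
    return np.diag(adj.sum(axis=1)) - adj

def structure_distance(adj_a: np.ndarray, adj_b: np.ndarray) -> float:
    """Frobenius norm of the difference of the two graphs' Laplacians."""
    return float(np.linalg.norm(laplacian(adj_a) - laplacian(adj_b), ord="fro"))

def diverse_top_k(candidates, scores, k):
    """Pick k candidates, trading off score against diversity.

    Greedy stand-in for a p-min-sum-style selection: seed with the
    best-scoring candidate, then repeatedly add the candidate whose
    minimum distance to the already-selected set is largest.
    """
    order = np.argsort(scores)[::-1]
    selected = [candidates[order[0]]]
    remaining = [candidates[i] for i in order[1:]]
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda c: min(structure_distance(c, s) for s in selected))
        selected.append(best)
        remaining = [c for c in remaining if c is not best]
    return selected
```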
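For the eigenvector-based variant of Communication-free Widening, one of the signals named in the abstract is the Fiedler vector: the eigenvector of the second-smallest eigenvalue of the graph's Laplacian. A minimal sketch of mapping a candidate structure to a worker by the Fiedler vector's sign pattern follows; the hashing into worker partitions is an illustrative assumption, not the dissertation's scheme.

```python
import numpy as np

def fiedler_vector(adj: np.ndarray) -> np.ndarray:
    """Eigenvector of the second-smallest Laplacian eigenvalue (undirected skeleton)."""
    sym = ((adj + adj.T) > 0).astype(float)  # treat the DAG's skeleton as undirected
    lap = np.diag(sym.sum(axis=1)) - sym
    eigvals, eigvecs = np.linalg.eigh(lap)   # eigh returns eigenvalues in ascending order
    v = eigvecs[:, 1]
    i = int(np.argmax(np.abs(v)))
    return v if v[i] >= 0 else -v            # fix the eigenvector's sign ambiguity

def partition_id(adj: np.ndarray, num_workers: int) -> int:
    """Illustrative: map a structure to a worker by the Fiedler vector's sign pattern."""
    signs = tuple(bool(x) for x in (fiedler_vector(adj) >= 0))
    return hash(signs) % num_workers
```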
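The order-based variant can also be sketched: an order over the variables restricts which nodes may serve as parents, so workers committed to different orders search different, order-consistent regions of the structure space without communicating. This is a minimal illustration of that restriction.

```python
def allowed_parents(order):
    """Given a node order, only earlier nodes may be parents of later ones."""
    position = {node: i for i, node in enumerate(order)}
    return {node: [p for p in order if position[p] < position[node]]
            for node in order}

# Example: under order ['A', 'B', 'C'], 'C' may have parents {'A', 'B'},
# 'B' may have parent {'A'}, and 'A' must be a root.
print(allowed_parents(['A', 'B', 'C']))
```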



Except where otherwise noted, this item's license is Attribution-ShareAlike 4.0 International.
