Publication:

Widened Machine Learning with Application to Bayesian Networks


Files

Sampson_2-io1j65o755xm9.pdf (Size: 4.9 MB, Downloads: 1677)

Date

2020


Open Access publication
Open Access Green

Publication type
Dissertation
Publication status
Published

Abstract

Greedy algorithms (also called "Hill Climbing") are iterative algorithms that choose the best solution from a range of candidate solutions at each iteration until a stopping criterion is met (usually when no further improvement is seen). As a result, greedy algorithms get stuck at local optima and are unable to improve beyond them. Many kinds of metaheuristics have been introduced to overcome this shortcoming, from the mundanely simple (Random Restart) to clever attempts at mimicking nature's problem-solving ability (Ant Colony Optimization). Widening is a new metaheuristic that uses diversity among parallel workers to improve on greedy algorithms.
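
A minimal, hypothetical sketch of greedy hill climbing in Python (not taken from the dissertation); the neighbors and score functions are illustrative placeholders:

def hill_climb(initial, neighbors, score):
    # Greedy search: take the best-scoring neighbor at each step and
    # stop as soon as no neighbor improves on the current solution.
    current = initial
    while True:
        best = max(neighbors(current), key=score, default=None)
        if best is None or score(best) <= score(current):
            return current  # local optimum: greedy search cannot escape
        current = best

# Toy usage: maximize f(x) = -(x - 7)^2 over the integers. This landscape
# has a single peak, so greedy search reaches the global optimum; on a
# multimodal landscape it would stop at the nearest local peak.
print(hill_climb(0, lambda x: [x - 1, x + 1], lambda x: -(x - 7) ** 2))  # 7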

Bayesian networks are probabilistic graphical models used in a wide variety of applications, from medical diagnosis to document classification. When learning the structure of a Bayesian network from data, the hypothesis space grows superexponentially with the number of features in the dataset, making structure learning an interesting use case for studying the effects of Widening on greedy learning algorithms.
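
To make the superexponential growth concrete: the number of labeled DAGs on n nodes follows Robinson's recurrence, a standard combinatorial result not specific to this dissertation. A short Python check:

from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def num_dags(n):
    # Robinson's recurrence: a(0) = 1,
    # a(n) = sum_{k=1}^{n} (-1)^(k+1) * C(n,k) * 2^(k(n-k)) * a(n-k)
    if n == 0:
        return 1
    return sum((-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * num_dags(n - k)
               for k in range(1, n + 1))

for n in range(1, 7):
    print(n, num_dags(n))
# Prints: 1 1, 2 3, 3 25, 4 543, 5 29281, 6 3781503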

This dissertation describes Widening as a metaheuristic and places it in the context of other metaheuristics. Widening is inherently parallel and can take advantage of the rapid growth in parallel computing resources available today. Its key element is the use of diversity to ensure that the parallel workers explore different regions of the hypothesis space. Widening can be realized by explicitly enforcing diversity among the parallel workers at each iterative step, called here Diverse Top-k Widening (sketched below). It can also be realized by partitioning the hypothesis space and restricting each parallel worker to a specific partition, which lets the workers proceed without exchanging information among themselves, called here Communication-free Widening.
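
As a rough illustration, the Diverse Top-k Widening loop might be schematized as follows in Python; refine, score, and select_diverse are hypothetical placeholders, not the dissertation's actual interfaces:

def widened_search(start, refine, score, select_diverse, k):
    # k models are refined in parallel (sequentially here, for clarity);
    # the pooled candidates are then reduced to a diverse top-k before
    # the next iteration, forcing exploration of different regions.
    frontier = [start] * k
    best = start
    while True:
        candidates = [c for model in frontier for c in refine(model)]
        if not candidates:
            return best
        frontier = select_diverse(candidates, k)  # diversity enforced here
        top = max(frontier, key=score)
        if score(top) <= score(best):
            return best  # no improvement: stop, as in plain greedy search
        best = top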

This dissertation covers one method of Diverse Top-k Widening, in which a diversity measure, p-min-sum, is used in conjunction with a distance metric, the Frobenius norm of the difference of the graphs' Laplacians. It also covers two approaches to Communication-free Widening. The first partitions the hypothesis space using an eigenvector of a matrix representation of the Bayesian network, in two variants: the Fiedler vector of the graph's Laplacian, and the eigenvector associated with the second-largest eigenvalue of the skew-symmetric adjacency matrix. The second uses diverse orders, where an order is a strict restriction on which nodes may be parents of other nodes in the Bayesian network.
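
The distance metric named above can be sketched directly. A minimal Python version, assuming the Laplacian L = D - A of each network's adjacency structure (the dissertation's exact construction for directed graphs may differ):

import numpy as np

def laplacian(adj):
    # Graph Laplacian L = D - A: degree matrix minus adjacency matrix.
    return np.diag(adj.sum(axis=1)) - adj

def frobenius_distance(adj1, adj2):
    # Distance between two graphs over the same node set: the Frobenius
    # norm of the difference of their Laplacians.
    return np.linalg.norm(laplacian(adj1) - laplacian(adj2), ord="fro")

# Two 3-node graphs differing by one (undirected) edge.
g1 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
g2 = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
print(frobenius_distance(g1, g2))  # 2.0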

In all three cases, Widening applied to learning the structure of Bayesian networks used as classifiers outperformed the reference algorithms on a majority of the datasets in the experiments, indicating that Widening is a metaheuristic ready for broader application to other greedy algorithms.

Subject (DDC)
004 Computer Science

Keywords

machine learning, bayesian networks, parallel computing

Cite

ISO 690
SAMPSON, Oliver R., 2020. Widened Machine Learning with Application to Bayesian Networks [Dissertation]. Konstanz: University of Konstanz
BibTex
@phdthesis{Sampson2020Widen-49257,
  year={2020},
  title={Widened Machine Learning with Application to Bayesian Networks},
  author={Sampson, Oliver R.},
  address={Konstanz},
  school={Universität Konstanz}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/49257">
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/49257/3/Sampson_2-io1j65o755xm9.pdf"/>
    <dc:contributor>Sampson, Oliver R.</dc:contributor>
    <dcterms:rights rdf:resource="http://creativecommons.org/licenses/by-sa/4.0/"/>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/49257/3/Sampson_2-io1j65o755xm9.pdf"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/49257"/>
    <dc:creator>Sampson, Oliver R.</dc:creator>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-04-23T06:47:06Z</dcterms:available>
    <dcterms:issued>2020</dcterms:issued>
    <dc:language>eng</dc:language>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:title>Widened Machine Learning with Application to Bayesian Networks</dcterms:title>
    <dc:rights>Attribution-ShareAlike 4.0 International</dc:rights>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-04-23T06:47:06Z</dc:date>
  </rdf:Description>
</rdf:RDF>

Date of doctoral examination

April 20, 2020
Thesis note
Konstanz, Univ., Diss., 2020
University bibliography
No