KOPS - The Institutional Repository of the University of Konstanz
# Widened Machine Learning with Application to Bayesian Networks

Type of Publication: | Dissertation |

Publication status: | Published |

URI (citable link): | http://nbn-resolving.de/urn:nbn:de:bsz:352-2-io1j65o755xm9 |

Author: | Sampson, Oliver R. |

Year of publication: | 2020 |

Summary: |
Greedy algorithms (also called “Hill Climbing”) are algorithms that are iterative in nature and choose a best solution from a range of possible candidate solutions at each iteration until a stopping criterion is met (usually when there is no improvement seen). As a result, greedy algorithms get stuck at local optima and are unable to improve beyond these optima. Many different types of methods (metaheuristics) have been introduced to overcome these shortcomings, from the mundanely simple (Random Restart) to clever attempts at mimicking nature’s ability to solve problems (Ant Colony Optimization). Widening is a new metaheuristic that uses diversity among parallel workers to improve on greedy algorithms.
Bayesian Networks are probabilistic graphical models used in a variety of applications from medical diagnosis to document classification, among many others. When learning the structure of Bayesian networks from data, the hypothesis space grows superexponentially with the number of features in the dataset, making their hypothesis space an interesting use case for understanding the effects of Widening on greedy learning algorithms from those datasets. This dissertation describes Widening as a metaheuristic and places it in context to other metaheuristics. Widening is inherently parallel and can take advantage of the rapid growth of parallel resources available today. The key element of Widening is its use of diversity to ensure that the parallel workers explore different regions of the hypothesis space. Widening can be realized by explicitly enforcing diversity among the parallel workers at each iterative step, called here Diverse Top-k Widening. It can also be realized by methods where the hypothesis space is partitioned and each parallel worker is restricted to a specific partition, thus enabling the parallel workers to proceed without exchanging information among themselves, called here Communication-free Widening. This dissertation covers one method of Diverse Top-k Widening, where a diversity measure, p-min-sum, is used in conjunction with a distance metric, the Frobenius Norm of the difference of the graphs’ Laplacian. It also covers two different approaches to Communication-free Widening, where in one case two different methods of using an eigenvector from the matrix representations of Bayesian networks (the Fiedler vector from the graph’s Laplacian, and the eigenvector associated with the second largest eigenvalue of the skew-symmetric adjacency matrix) are discussed. A second method of Communication-free Widening is covered using diverse orders, where an order is a strict restriction on what nodes may be parents of other nodes in the Bayesian network. In all three cases, Widening applied to learning the structure of Bayesian networks for use as classifiers was able to outperform the reference algorithms in a majority of the datasets covered in the experiments, indicating that Widening is a metaheuristic ready for broader application to other greedy algorithms. |

Examination date (for dissertations): | Apr 20, 2020 |

Dissertation note: | Doctoral dissertation, University of Konstanz |

Subject (DDC): | 004 Computer Science |

Keywords: | machine learning, bayesian networks, parallel computing |

Link to License: | Attribution-ShareAlike 4.0 International |

Checksum:
MD5:e6d65d241c0c5493069cbb7be0c477cb

SAMPSON, Oliver R., 2020. Widened Machine Learning with Application to Bayesian Networks [Dissertation]. Konstanz: University of Konstanz

@phdthesis{Sampson2020Widen-49257, title={Widened Machine Learning with Application to Bayesian Networks}, year={2020}, author={Sampson, Oliver R.}, address={Konstanz}, school={Universität Konstanz} }

Sampson_2-io1j65o755xm9.pdf | 840 |