Publication:

The Induction of Phonological Structure

Files

Mayer_262292.pdf (size: 15.11 MB; downloads: 334)

Date

2012

Authors

Mayer, Thomas

Open access publication
Open Access Green

Publication type
Dissertation
Publication status
Published

Abstract

This dissertation explores to what extent phonological structure can be inferred from the distribution of sounds within words. For this purpose, a typologically oriented computational approach is pursued, which rests on techniques from the fields of computational linguistics, data mining and visual analytics. The methods that are presented are considered to be procedural universals which can be applied to any natural language in the same way even though they yield different results for individual languages.
The basic assumption that underlies all methods is that the co-occurrence of sounds in relevant contexts within words of a language is constrained. The restrictions of combinations of sounds lead to a given distribution, which in turn can be used to induce a distinction in the sounds of the language that can be related to natural classes and features in phonological theory. The focus of the present approach is not so much on the statistical methods that are necessary to induce the latent structures, but on the linguistically motivated contexts which manifest the existing constraints most clearly.


The induction of phonological structure from language data is an interesting research topic for various reasons. First of all, it is remarkable that phonological features, which are mostly defined in terms of articulatory or acoustic properties, are also reflected in the distribution of sounds in a language. In this thesis, I complement previous work on learning phonological categories (e.g., Ellison 1994; Goldsmith and Xanthos 2009) with an approach to infer place of articulation distinctions in consonants. The method is based on the principle of similar place avoidance (SPA; Pozdniakov and Segerer 2007), which states that consonants in CVC sequences tend to exhibit different place features. I contribute to earlier work in this research area by showing that this principle is not only active in Semitic languages (with a study of Maltese verbal roots) but also holds for West Germanic languages (with an investigation of the entries in the CELEX database for English, German and Dutch) and a worldwide sample of word forms from the ASJP dataset (Dryer test for universality), leading to the conclusion that it is a statistical universal. Using this principle to infer place distinctions in consonants yields almost perfect results for the ASJP data and the list of Maltese verbal roots. The automatically generated dendrograms closely correspond to the hierarchical structures for natural classes that have been postulated in the phonological literature (e.g., Rice 1994; McCarthy 1994).
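As a rough illustration of how similar place avoidance could be measured, the sketch below counts consonant pairs in CVC windows and compares each observed count with the count expected if the two consonants were independent. The observed/expected ratio, the toy vowel set, and the function names are assumptions made for this example; the abstract does not state which statistic the dissertation uses.

```python
from collections import Counter

VOWELS = set("aeiou")  # assumption: toy orthographic vowel inventory


def cvc_pairs(words, vowels=VOWELS):
    """Yield (C1, C2) for every consonant-vowel-consonant window in each word."""
    for word in words:
        for c1, v, c2 in zip(word, word[1:], word[2:]):
            if c1 not in vowels and v in vowels and c2 not in vowels:
                yield c1, c2


def observed_over_expected(words, vowels=VOWELS):
    """Return {(C1, C2): O/E} for attested pairs; E assumes independence."""
    pairs = Counter(cvc_pairs(words, vowels))
    total = sum(pairs.values())
    first, second = Counter(), Counter()
    for (c1, c2), n in pairs.items():
        first[c1] += n   # how often c1 opens a CVC window
        second[c2] += n  # how often c2 closes a CVC window
    return {
        (c1, c2): n / (first[c1] * second[c2] / total)
        for (c1, c2), n in pairs.items()
    }


if __name__ == "__main__":
    toy_words = ["pat", "pat", "tap", "bat"]  # hypothetical data
    for pair, ratio in sorted(observed_over_expected(toy_words).items()):
        print(pair, round(ratio, 2))
```

Ratios below 1 mark consonant pairs that co-occur less often than chance would predict. Pairs that never occur are simply absent from the result. Under similar place avoidance, one would expect low or missing values mostly for consonants that share a place of articulation.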


In addition, the present thesis complements previous work on the machine learning of phonological structure with a novel method for automatically discriminating vowels and consonants in a language that is not based on N-gram statistics. The substitution approach relies on how frequently sounds occur as the discriminating segments in minimal pairs. Although the method does not achieve the same level of accuracy as earlier approaches in this area (e.g., Sukhotin 1962; Ellison 1994; Goldsmith and Xanthos 2009; Kim and Snyder 2013), it shows that a distinction between vowels and consonants can also be inferred from the relation of sounds in absentia.
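The counting step behind such a substitution approach can be sketched as follows. The code finds all minimal pairs in a word list and counts which two segments distinguish them. The function name and the toy words are hypothetical. How the dissertation turns these counts into a vowel/consonant split is not described in the abstract, so that step is left out.

```python
from collections import Counter, defaultdict
from itertools import combinations


def substitution_counts(words):
    """Count how often two segments are the only difference between two words."""
    # Group segments by the frame (prefix, suffix) they occur in; two words
    # sharing a frame differ in exactly one position, i.e. form a minimal pair.
    frames = defaultdict(set)
    for word in set(words):
        for i, segment in enumerate(word):
            frames[(word[:i], word[i + 1:])].add(segment)

    counts = Counter()
    for segments in frames.values():
        # Each unordered pair of segments in a frame is one minimal pair.
        for a, b in combinations(sorted(segments), 2):
            counts[(a, b)] += 1
    return counts


if __name__ == "__main__":
    toy_words = ["pat", "bat", "cat", "pit", "pet"]  # hypothetical data
    for pair, n in sorted(substitution_counts(toy_words).items()):
        print(pair, n)
```

In this toy list, consonants substitute only for consonants and vowels only for vowels. That is the kind of regularity a clustering step could use, assuming it carries over to real data.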


Second, the induction of phonological structure is considered in the present work as a way to explore a large amount of language data in search of phonotactic constraints. To this end, I present a visual analytics approach for the detection of vowel harmony patterns that is intended as a proof of concept that a graphically enhanced statistical analysis can make potentially interesting patterns in the data more accessible to human perception. As the matrix visualizations show, languages exhibiting patterns of vowel harmony (or similar phenomena) can be distinguished from languages without such constraints at a glance. The visualization approach can easily be extended to other related phenomena, e.g., consonant harmony (Hansson 2010), synharmonism (Trubetzkoy 1939 [1967]) or any kind of (statistical) phonotactic constraint. The statistical measure on which the vowel harmony visualizations are based can also serve as a typological measure on the basis of which languages can be compared. The ranking of languages according to this measure approximately reflects the intuition about which languages show conspicuous harmony patterns.
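To make the idea of a vowel-by-vowel association matrix concrete, here is a minimal sketch that extracts successive vowels from each word (skipping intervening consonants) and scores each attested vowel pair with pointwise mutual information. The choice of PMI, the vowel inventory, and all names are assumptions for illustration only; the abstract does not name the measure behind the dissertation's visualizations.

```python
import math
from collections import Counter


def vowel_successions(word, vowels):
    """Return pairs of neighbouring vowels on the vowel tier of a word."""
    tier = [segment for segment in word if segment in vowels]
    return list(zip(tier, tier[1:]))


def association_matrix(words, vowels):
    """Return {(V1, V2): PMI in bits} for every attested vowel succession."""
    pairs = Counter(p for w in words for p in vowel_successions(w, vowels))
    total = sum(pairs.values())
    row, col = Counter(), Counter()
    for (v1, v2), n in pairs.items():
        row[v1] += n  # v1 as the first vowel of a succession
        col[v2] += n  # v2 as the second vowel of a succession
    return {
        (v1, v2): math.log2(n * total / (row[v1] * col[v2]))
        for (v1, v2), n in pairs.items()
    }


if __name__ == "__main__":
    toy_words = ["kala", "kele", "tata", "pepe"]  # hypothetical harmonic data
    for pair, pmi in sorted(association_matrix(toy_words, set("aeiou")).items()):
        print(pair, round(pmi, 2))
```

Laid out as a grid and colored by score, such values would make a harmonic language show strongly positive cells for agreeing vowels, with disagreeing pairs weak or unattested. A language without harmony would look comparatively flat.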

Subject (DDC)
400 Language, Linguistics


Cite

ISO 690
MAYER, Thomas, 2012. The Induction of Phonological Structure [Dissertation]. Konstanz: University of Konstanz
BibTeX
@phdthesis{Mayer2012Induc-26229,
  year={2012},
  title={The Induction of Phonological Structure},
  author={Mayer, Thomas},
  address={Konstanz},
  school={Universität Konstanz}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/26229">
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/45"/>
    <dc:contributor>Mayer, Thomas</dc:contributor>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dcterms:title>The Induction of Phonological Structure</dcterms:title>
    <bibo:uri rdf:resource="http://kops.uni-konstanz.de/handle/123456789/26229"/>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/26229/1/Mayer_262292.pdf"/>
    <dcterms:issued>2012</dcterms:issued>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2014-02-05T06:58:21Z</dcterms:available>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/26229/1/Mayer_262292.pdf"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/45"/>
    <dc:language>eng</dc:language>
    <dc:rights>terms-of-use</dc:rights>
    <dc:creator>Mayer, Thomas</dc:creator>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2014-02-05T06:58:21Z</dc:date>
  </rdf:Description>
</rdf:RDF>

Date of dissertation examination

February 9, 2012