Publication:

Deciphering Undersegmented Ancient Scripts Using Phonetic Prior

Files

Luo_2-74mf4c553nsp9.pdf (size: 678.46 KB; downloads: 115)

Date

2021

Authors

Luo, Jiaming
Hartmann, Frederik
Santus, Enrico
Barzilay, Regina
Cao, Yuan

Open access publication
Open Access Gold

Publication type
Journal article
Publication status
Published

Published in

Transactions of the Association for Computational Linguistics. MIT Press. 2021, 9, pp. 69-81. eISSN 2307-387X. Available under: doi: 10.1162/tacl_a_00354

Abstract

Most undeciphered lost languages exhibit two characteristics that pose significant decipherment challenges: (1) the scripts are not fully segmented into words; (2) the closest known language is not determined. We propose a decipherment model that handles both of these challenges by building on rich linguistic constraints reflecting consistent patterns in historical sound change. We capture the natural phonological geometry by learning character embeddings based on the International Phonetic Alphabet (IPA). The resulting generative framework jointly models word segmentation and cognate alignment, informed by phonological constraints. We evaluate the model on both deciphered languages (Gothic, Ugaritic) and an undeciphered one (Iberian). The experiments show that incorporating phonetic geometry leads to clear and consistent gains. Additionally, we propose a measure for language closeness which correctly identifies related languages for Gothic and Ugaritic. For Iberian, the method does not show strong evidence supporting Basque as a related language, concurring with the favored position by the current scholarship.

Subject (DDC)
400 Language, Linguistics

Cite

ISO 690
LUO, Jiaming, Frederik HARTMANN, Enrico SANTUS, Regina BARZILAY, Yuan CAO, 2021. Deciphering Undersegmented Ancient Scripts Using Phonetic Prior. In: Transactions of the Association for Computational Linguistics. MIT Press. 2021, 9, pp. 69-81. eISSN 2307-387X. Available under: doi: 10.1162/tacl_a_00354
BibTeX
@article{Luo2021Decip-57257,
  year={2021},
  doi={10.1162/tacl_a_00354},
  title={Deciphering Undersegmented Ancient Scripts Using Phonetic Prior},
  volume={9},
  journal={Transactions of the Association for Computational Linguistics},
  pages={69--81},
  author={Luo, Jiaming and Hartmann, Frederik and Santus, Enrico and Barzilay, Regina and Cao, Yuan}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/57257">
    <dc:contributor>Luo, Jiaming</dc:contributor>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/57257/1/Luo_2-74mf4c553nsp9.pdf"/>
    <dc:creator>Cao, Yuan</dc:creator>
    <dcterms:issued>2021</dcterms:issued>
    <dc:language>eng</dc:language>
    <dc:contributor>Hartmann, Frederik</dc:contributor>
    <dcterms:title>Deciphering Undersegmented Ancient Scripts Using Phonetic Prior</dcterms:title>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/45"/>
    <dcterms:abstract xml:lang="eng">Most undeciphered lost languages exhibit two characteristics that pose significant decipherment challenges: (1) the scripts are not fully segmented into words; (2) the closest known language is not determined. We propose a decipherment model that handles both of these challenges by building on rich linguistic constraints reflecting consistent patterns in historical sound change. We capture the natural phonological geometry by learning character embeddings based on the International Phonetic Alphabet (IPA). The resulting generative framework jointly models word segmentation and cognate alignment, informed by phonological constraints. We evaluate the model on both deciphered languages (Gothic, Ugaritic) and an undeciphered one (Iberian). The experiments show that incorporating phonetic geometry leads to clear and consistent gains. Additionally, we propose a measure for language closeness which correctly identifies related languages for Gothic and Ugaritic. For Iberian, the method does not show strong evidence supporting Basque as a related language, concurring with the favored position by the current scholarship.</dcterms:abstract>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/57257/1/Luo_2-74mf4c553nsp9.pdf"/>
    <dc:contributor>Barzilay, Regina</dc:contributor>
    <dc:contributor>Cao, Yuan</dc:contributor>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2022-04-11T14:42:07Z</dc:date>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/57257"/>
    <dc:creator>Barzilay, Regina</dc:creator>
    <dc:rights>terms-of-use</dc:rights>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dc:contributor>Santus, Enrico</dc:contributor>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:creator>Hartmann, Frederik</dc:creator>
    <dc:creator>Santus, Enrico</dc:creator>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2022-04-11T14:42:07Z</dcterms:available>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/45"/>
    <dc:creator>Luo, Jiaming</dc:creator>
  </rdf:Description>
</rdf:RDF>

University bibliography
Yes
Peer-reviewed
Yes