Publication:

Using Meta-Morph Rules to develop Morphological Analysers : A case study concerning Tamil

Files

Sarveswaran_2-1cjlssg7n7j0h9.pdf (Size: 248.97 KB, Downloads: 92)

Date

2019

Authors

Sarveswaran, Kengatharaiyer
Dias, Gihan
Butt, Miriam

Open Access publication
Open Access Bookpart

Publication type
Contribution to conference proceedings
Publication status
Published

Published in

VOGLER, Heiko, ed., Andreas MALETTI, ed. Proceedings of the 14th International Conference on Finite-State Methods and Natural Language Processing. Stroudsburg, PA: ACL, 2019, pp. 76-86. Available under: doi: 10.18653/v1/W19-3111

Abstract

This paper describes a new and larger coverage Finite-State Morphological Analyser (FSM) and Generator for the Dravidian language Tamil. The FSM has been developed in the context of computational grammar engineering, adhering to the standards of the ParGram effort. Tamil is a morphologically rich language and the interaction between linguistic analysis and formal implementation is complex, resulting in a challenging task. In order to allow the development of the FSM to focus more on the linguistic analysis and less on the formal details, we have developed a system of meta-morph(ology) rules along with a script which translates these rules into FSM processable representations. The introduction of meta-morph rules makes it possible for computationally naive linguists to interact with the system and to expand it in future work. We found that the meta-morph rules help to express linguistic generalisations and reduce the manual effort of writing lexical classes for morphological analysis. Our Tamil FSM currently handles mainly the inflectional morphology of 3,300 verb roots and their 260 forms. Further, it also has a lexicon of approximately 100,000 nouns along with a guesser to handle out-of-vocabulary items. Although the Tamil FSM was primarily developed to be part of a computational grammar, it can also be used as a web or stand-alone application for other NLP tasks, as per general ParGram practice.

Subject (DDC)
400 Language, Linguistics

Conference

14th International Conference on Finite-State Methods and Natural Language Processing, 23 Sept. 2019 - 25 Sept. 2019, Dresden

Cite

ISO 690
SARVESWARAN, Kengatharaiyer, Gihan DIAS, Miriam BUTT, 2019. Using Meta-Morph Rules to develop Morphological Analysers : A case study concerning Tamil. 14th International Conference on Finite-State Methods and Natural Language Processing. Dresden, 23 Sept. 2019 - 25 Sept. 2019. In: VOGLER, Heiko, ed., Andreas MALETTI, ed. Proceedings of the 14th International Conference on Finite-State Methods and Natural Language Processing. Stroudsburg, PA: ACL, 2019, pp. 76-86. Available under: doi: 10.18653/v1/W19-3111
BibTeX
@inproceedings{Sarveswaran2019Using-59695,
  year={2019},
  doi={10.18653/v1/W19-3111},
  title={Using Meta-Morph Rules to develop Morphological Analysers : A case study concerning Tamil},
  publisher={ACL},
  address={Stroudsburg, PA},
  booktitle={Proceedings of the 14th International Conference on Finite-State Methods and Natural Language Processing},
  pages={76--86},
  editor={Vogler, Heiko and Maletti, Andreas},
  author={Sarveswaran, Kengatharaiyer and Dias, Gihan and Butt, Miriam}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/59695">
    <dcterms:title>Using Meta-Morph Rules to develop Morphological Analysers : A case study concerning Tamil</dcterms:title>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/59695/1/Sarveswaran_2-1cjlssg7n7j0h9.pdf"/>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2023-01-12T13:38:34Z</dc:date>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/45"/>
    <dcterms:issued>2019</dcterms:issued>
    <dcterms:abstract xml:lang="eng">This paper describes a new and larger coverage Finite-State Morphological Analyser (FSM) and Generator for the Dravidian language Tamil. The FSM has been developed in the context of computational grammar engineering, adhering to the standards of the ParGram effort. Tamil is a morphologically rich language and the interaction between linguistic analysis and formal implementation is complex, resulting in a challenging task. In order to allow the development of the FSM to focus more on the linguistic analysis and less on the formal details, we have developed a system of meta-morph(ology) rules along with a script which translates these rules into FSM processable representations. The introduction of meta-morph rules makes it possible for computationally naive linguists to interact with the system and to expand it in future work. We found that the meta-morph rules help to express linguistic generalisations and reduce the manual effort of writing lexical classes for morphological analysis. Our Tamil FSM currently handles mainly the inflectional morphology of 3,300 verb roots and their 260 forms. Further, it also has a lexicon of approximately 100,000 nouns along with a guesser to handle out-of-vocabulary items. Although the Tamil FSM was primarily developed to be part of a computational grammar, it can also be used as a web or stand-alone application for other NLP tasks, as per general ParGram practice.</dcterms:abstract>
    <dc:contributor>Butt, Miriam</dc:contributor>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2023-01-12T13:38:34Z</dcterms:available>
    <dc:contributor>Dias, Gihan</dc:contributor>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/45"/>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dc:rights>terms-of-use</dc:rights>
    <dc:creator>Dias, Gihan</dc:creator>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/59695"/>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/59695/1/Sarveswaran_2-1cjlssg7n7j0h9.pdf"/>
    <dc:creator>Butt, Miriam</dc:creator>
    <dc:contributor>Sarveswaran, Kengatharaiyer</dc:contributor>
    <dc:creator>Sarveswaran, Kengatharaiyer</dc:creator>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:language>eng</dc:language>
  </rdf:Description>
</rdf:RDF>

University bibliography
Yes