Using Meta-Morph Rules to develop Morphological Analysers : A case study concerning Tamil
Date
2019
Authors
Sarveswaran, Kengatharaiyer; Dias, Gihan; Butt, Miriam
Editors
Vogler, Heiko; Maletti, Andreas
URI (citable link)
https://kops.uni-konstanz.de/handle/123456789/59695
DOI (citable link)
10.18653/v1/W19-3111
Publication type
Contribution to a conference collection
Publication status
Published
Published in
Proceedings of the 14th International Conference on Finite-State Methods and Natural Language Processing / Vogler, Heiko; Maletti, Andreas (ed.). - Stroudsburg, PA : ACL, 2019. - pp. 76-86
Abstract
This paper describes a new and larger coverage Finite-State Morphological Analyser (FSM) and Generator for the Dravidian language Tamil. The FSM has been developed in the context of computational grammar engineering, adhering to the standards of the ParGram effort. Tamil is a morphologically rich language and the interaction between linguistic analysis and formal implementation is complex, resulting in a challenging task. In order to allow the development of the FSM to focus more on the linguistic analysis and less on the formal details, we have developed a system of meta-morph(ology) rules along with a script which translates these rules into FSM processable representations. The introduction of meta-morph rules makes it possible for computationally naive linguists to interact with the system and to expand it in future work. We found that the meta-morph rules help to express linguistic generalisations and reduce the manual effort of writing lexical classes for morphological analysis. Our Tamil FSM currently handles mainly the inflectional morphology of 3,300 verb roots and their 260 forms. Further, it also has a lexicon of approximately 100,000 nouns along with a guesser to handle out-of-vocabulary items. Although the Tamil FSM was primarily developed to be part of a computational grammar, it can also be used as a web or stand-alone application for other NLP tasks, as per general ParGram practice.
Subject (DDC)
400 Philology, Linguistics
Conference
14th International Conference on Finite-State Methods and Natural Language Processing, Sep 23, 2019 - Sep 25, 2019, Dresden
Cite This
ISO 690
SARVESWARAN, Kengatharaiyer, Gihan DIAS, Miriam BUTT, 2019. Using Meta-Morph Rules to develop Morphological Analysers : A case study concerning Tamil. 14th International Conference on Finite-State Methods and Natural Language Processing. Dresden, Sep 23, 2019 - Sep 25, 2019. In: VOGLER, Heiko, ed., Andreas MALETTI, ed. Proceedings of the 14th International Conference on Finite-State Methods and Natural Language Processing. Stroudsburg, PA: ACL, pp. 76-86. Available under: doi: 10.18653/v1/W19-3111
BibTex
@inproceedings{Sarveswaran2019Using-59695,
  year={2019},
  doi={10.18653/v1/W19-3111},
  title={Using Meta-Morph Rules to develop Morphological Analysers : A case study concerning Tamil},
  publisher={ACL},
  address={Stroudsburg, PA},
  booktitle={Proceedings of the 14th International Conference on Finite-State Methods and Natural Language Processing},
  pages={76--86},
  editor={Vogler, Heiko and Maletti, Andreas},
  author={Sarveswaran, Kengatharaiyer and Dias, Gihan and Butt, Miriam}
}
RDF
<rdf:RDF xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:bibo="http://purl.org/ontology/bibo/" xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:void="http://rdfs.org/ns/void#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/59695"> <dcterms:title>Using Meta-Morph Rules to develop Morphological Analysers : A case study concerning Tamil</dcterms:title> <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/59695/1/Sarveswaran_2-1cjlssg7n7j0h9.pdf"/> <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2023-01-12T13:38:34Z</dc:date> <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/45"/> <dcterms:issued>2019</dcterms:issued> <dcterms:abstract xml:lang="eng">This paper describes a new and larger coverage Finite-State Morphological Analyser (FSM) and Generator for the Dravidian language Tamil. The FSM has been developed in the context of computational grammar engineering, adhering to the standards of the ParGram effort. Tamil is a morphologically rich language and the interaction between linguistic analysis and formal implementation is complex, resulting in a challenging task. In order to allow the development of the FSM to focus more on the linguistic analysis and less on the formal details, we have developed a system of meta-morph(ology) rules along with a script which translates these rules into FSM processable representations. The introduction of meta-morph rules makes it possible for computationally naive linguists to interact with the system and to expand it in future work. 
We found that the meta-morph rules help to express linguistic generalisations and reduce the manual effort of writing lexical classes for morphological analysis. Our Tamil FSM currently handles mainly the inflectional morphology of 3,300 verb roots and their 260 forms. Further, it also has a lexicon of approximately 100,000 nouns along with a guesser to handle out-of-vocabulary items. Although the Tamil FSM was primarily developed to be part of a computational grammar, it can also be used as a web or stand-alone application for other NLP tasks, as per general ParGram practice.</dcterms:abstract> <dc:contributor>Butt, Miriam</dc:contributor> <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2023-01-12T13:38:34Z</dcterms:available> <dc:contributor>Dias, Gihan</dc:contributor> <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/> <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/45"/> <foaf:homepage rdf:resource="http://localhost:8080/"/> <dc:rights>terms-of-use</dc:rights> <dc:creator>Dias, Gihan</dc:creator> <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/59695"/> <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/59695/1/Sarveswaran_2-1cjlssg7n7j0h9.pdf"/> <dc:creator>Butt, Miriam</dc:creator> <dc:contributor>Sarveswaran, Kengatharaiyer</dc:contributor> <dc:creator>Sarveswaran, Kengatharaiyer</dc:creator> <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/> <dc:language>eng</dc:language> </rdf:Description> </rdf:RDF>
Bibliography of Konstanz
Yes