Publication: Using Meta-Morph Rules to develop Morphological Analysers : A case study concerning Tamil
Abstract
This paper describes a new and larger coverage Finite-State Morphological Analyser (FSM) and Generator for the Dravidian language Tamil. The FSM has been developed in the context of computational grammar engineering, adhering to the standards of the ParGram effort. Tamil is a morphologically rich language and the interaction between linguistic analysis and formal implementation is complex, resulting in a challenging task. In order to allow the development of the FSM to focus more on the linguistic analysis and less on the formal details, we have developed a system of meta-morph(ology) rules along with a script which translates these rules into FSM processable representations. The introduction of meta-morph rules makes it possible for computationally naive linguists to interact with the system and to expand it in future work. We found that the meta-morph rules help to express linguistic generalisations and reduce the manual effort of writing lexical classes for morphological analysis. Our Tamil FSM currently handles mainly the inflectional morphology of 3,300 verb roots and their 260 forms. Further, it also has a lexicon of approximately 100,000 nouns along with a guesser to handle out-of-vocabulary items. Although the Tamil FSM was primarily developed to be part of a computational grammar, it can also be used as a web or stand-alone application for other NLP tasks, as per general ParGram practice.
Cite
ISO 690
SARVESWARAN, Kengatharaiyer, Gihan DIAS, Miriam BUTT, 2019. Using Meta-Morph Rules to develop Morphological Analysers : A case study concerning Tamil. 14th International Conference on Finite-State Methods and Natural Language Processing. Dresden, 23 Sept. 2019 - 25 Sept. 2019. In: VOGLER, Heiko, ed., Andreas MALETTI, ed. Proceedings of the 14th International Conference on Finite-State Methods and Natural Language Processing. Stroudsburg, PA: ACL, 2019, pp. 76-86. Available under: doi: 10.18653/v1/W19-3111
BibTeX
@inproceedings{Sarveswaran2019Using-59695,
  year={2019},
  doi={10.18653/v1/W19-3111},
  title={Using Meta-Morph Rules to develop Morphological Analysers : A case study concerning Tamil},
  publisher={ACL},
  address={Stroudsburg, PA},
  booktitle={Proceedings of the 14th International Conference on Finite-State Methods and Natural Language Processing},
  pages={76--86},
  editor={Vogler, Heiko and Maletti, Andreas},
  author={Sarveswaran, Kengatharaiyer and Dias, Gihan and Butt, Miriam}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#">
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/59695">
    <dcterms:title>Using Meta-Morph Rules to develop Morphological Analysers : A case study concerning Tamil</dcterms:title>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/59695/1/Sarveswaran_2-1cjlssg7n7j0h9.pdf"/>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2023-01-12T13:38:34Z</dc:date>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/45"/>
    <dcterms:issued>2019</dcterms:issued>
    <dcterms:abstract xml:lang="eng">This paper describes a new and larger coverage Finite-State Morphological Analyser (FSM) and Generator for the Dravidian language Tamil. The FSM has been developed in the context of computational grammar engineering, adhering to the standards of the ParGram effort. Tamil is a morphologically rich language and the interaction between linguistic analysis and formal implementation is complex, resulting in a challenging task. In order to allow the development of the FSM to focus more on the linguistic analysis and less on the formal details, we have developed a system of meta-morph(ology) rules along with a script which translates these rules into FSM processable representations. The introduction of meta-morph rules makes it possible for computationally naive linguists to interact with the system and to expand it in future work. We found that the meta-morph rules help to express linguistic generalisations and reduce the manual effort of writing lexical classes for morphological analysis. Our Tamil FSM currently handles mainly the inflectional morphology of 3,300 verb roots and their 260 forms. Further, it also has a lexicon of approximately 100,000 nouns along with a guesser to handle out-of-vocabulary items. Although the Tamil FSM was primarily developed to be part of a computational grammar, it can also be used as a web or stand-alone application for other NLP tasks, as per general ParGram practice.</dcterms:abstract>
    <dc:contributor>Butt, Miriam</dc:contributor>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2023-01-12T13:38:34Z</dcterms:available>
    <dc:contributor>Dias, Gihan</dc:contributor>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/45"/>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dc:rights>terms-of-use</dc:rights>
    <dc:creator>Dias, Gihan</dc:creator>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/59695"/>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/59695/1/Sarveswaran_2-1cjlssg7n7j0h9.pdf"/>
    <dc:creator>Butt, Miriam</dc:creator>
    <dc:contributor>Sarveswaran, Kengatharaiyer</dc:contributor>
    <dc:creator>Sarveswaran, Kengatharaiyer</dc:creator>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:language>eng</dc:language>
  </rdf:Description>
</rdf:RDF>