A multilingual approach to question classification

dc.contributor.author: Kalouli, Aikaterini-Lida
dc.contributor.author: Kaiser, Katharina
dc.contributor.author: Hautli-Janisz, Annette
dc.contributor.author: Kaiser, Georg A.
dc.contributor.author: Butt, Miriam
dc.date.accessioned: 2018-10-23T11:55:34Z
dc.date.available: 2018-10-23T11:55:34Z
dc.date.issued: 2018 [eng]
dc.description.abstract: In this paper we present the Konstanz Resource of Questions (KRoQ), the first dependency-parsed, parallel multilingual corpus of information-seeking and non-information-seeking questions. In creating the corpus, we employ a linguistically motivated rule-based system that uses linguistic cues from one language to help classify and annotate questions across other languages. Our current corpus includes German, French, Spanish and Koine Greek. Based on the linguistically motivated heuristics we identify, a two-step scoring mechanism assigns intra- and inter-language scores to each question. Based on these scores, each question is classified as either information-seeking or non-information-seeking. An evaluation shows that this mechanism correctly classifies questions in 79% of cases. We release our corpus as a basis for further work in the area of question classification. It can be used as training and testing data for machine-learning algorithms, as corpus data for theoretical linguistic questions or as a resource for further rule-based approaches to question identification. [eng]
dc.description.version: published [eng]
dc.identifier.ppn: 512205701
dc.identifier.uri: https://kops.uni-konstanz.de/handle/123456789/43593
dc.language.iso: eng [eng]
dc.rights: terms-of-use
dc.rights.uri: https://rightsstatements.org/page/InC/1.0/
dc.subject: Question Answering, Multilinguality, Corpus (Creation, Annotation, etc.) [eng]
dc.subject.ddc: 400 [eng]
dc.title: A multilingual approach to question classification [eng]
dc.type: INPROCEEDINGS [eng]
dspace.entity.type: Publication
kops.citation.bibtex
@inproceedings{Kalouli2018multi-43593,
  year={2018},
  title={A multilingual approach to question classification},
  url={http://www.lrec-conf.org/proceedings/lrec2018/pdf/13.pdf},
  publisher={ELRA},
  address={Paris},
  booktitle={Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  pages={2715--2720},
  editor={Calzolari, Nicoletta},
  author={Kalouli, Aikaterini-Lida and Kaiser, Katharina and Hautli-Janisz, Annette and Kaiser, Georg A. and Butt, Miriam},
  note={ISBN: 979-10-95546-00-9}
}
kops.citation.iso690: KALOULI, Aikaterini-Lida, Katharina KAISER, Annette HAUTLI-JANISZ, Georg A. KAISER, Miriam BUTT, 2018. A multilingual approach to question classification. Eleventh International Conference on Language Resources and Evaluation (LREC 2018). Miyazaki, Japan, May 7, 2018 - May 12, 2018. In: CALZOLARI, Nicoletta, ed. and others. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). Paris: ELRA, 2018, pp. 2715-2720 [eng]
kops.citation.rdf
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/43593">
    <dcterms:issued>2018</dcterms:issued>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/43593/3/Kalouli_2-96dkhplnq3224.pdf"/>
    <dc:rights>terms-of-use</dc:rights>
    <dc:creator>Kaiser, Katharina</dc:creator>
    <dc:contributor>Kaiser, Katharina</dc:contributor>
    <dc:contributor>Kaiser, Georg A.</dc:contributor>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/43593"/>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:language>eng</dc:language>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/45"/>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2018-10-23T11:55:34Z</dcterms:available>
    <dc:contributor>Kalouli, Aikaterini-Lida</dc:contributor>
    <dc:creator>Kalouli, Aikaterini-Lida</dc:creator>
    <dc:creator>Butt, Miriam</dc:creator>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/45"/>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/43593/3/Kalouli_2-96dkhplnq3224.pdf"/>
    <dc:contributor>Hautli-Janisz, Annette</dc:contributor>
    <dc:creator>Hautli-Janisz, Annette</dc:creator>
    <dc:contributor>Butt, Miriam</dc:contributor>
    <dc:creator>Kaiser, Georg A.</dc:creator>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2018-10-23T11:55:34Z</dc:date>
    <dcterms:abstract xml:lang="eng">In this paper we present the Konstanz Resource of Questions (KRoQ), the first dependency-parsed, parallel multilingual corpus of information-seeking and non-information-seeking questions. In creating the corpus, we employ a linguistically motivated rule-based system that uses linguistic cues from one language to help classify and annotate questions across other languages. Our current corpus includes German, French, Spanish and Koine Greek. Based on the linguistically motivated heuristics we identify, a two-step scoring mechanism assigns intra- and inter-language scores to each question. Based on these scores, each question is classified as either information-seeking or non-information-seeking. An evaluation shows that this mechanism correctly classifies questions in 79% of cases. We release our corpus as a basis for further work in the area of question classification. It can be used as training and testing data for machine-learning algorithms, as corpus data for theoretical linguistic questions or as a resource for further rule-based approaches to question identification.</dcterms:abstract>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dcterms:title>A multilingual approach to question classification</dcterms:title>
  </rdf:Description>
</rdf:RDF>
kops.conferencefield: Eleventh International Conference on Language Resources and Evaluation (LREC 2018), May 7, 2018 - May 12, 2018, Miyazaki, Japan
kops.date.conferenceEnd: 2018-05-12 [eng]
kops.date.conferenceStart: 2018-05-07 [eng]
kops.description.comment: ISBN: 979-10-95546-00-9 [eng]
kops.description.openAccess: openaccessgreen
kops.flag.knbibliography: true
kops.identifier.nbn: urn:nbn:de:bsz:352-2-96dkhplnq3224
kops.location.conference: Miyazaki, Japan [eng]
kops.relation.uniknProjectTitle: Wortstellungsvariation in wh-Fragen: Evidenz aus dem Romanischen FOR 2111 TP 2 (Biezma)
kops.sourcefield: CALZOLARI, Nicoletta, ed. and others. <i>Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)</i>. Paris: ELRA, 2018, pp. 2715-2720 [deu]
kops.sourcefield.plain: CALZOLARI, Nicoletta, ed. and others. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). Paris: ELRA, 2018, pp. 2715-2720 [deu]
kops.sourcefield.plain: CALZOLARI, Nicoletta, ed. and others. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). Paris: ELRA, 2018, pp. 2715-2720 [eng]
kops.title.conference: Eleventh International Conference on Language Resources and Evaluation (LREC 2018) [eng]
kops.url: http://www.lrec-conf.org/proceedings/lrec2018/pdf/13.pdf [eng]
kops.urlDate: 2018-10-23 [eng]
relation.isAuthorOfPublication: 81637d9e-323d-4f17-bf75-5cfd69ba87b3
relation.isAuthorOfPublication: fd61d332-561d-4401-a7ab-3029d7716bc1
relation.isAuthorOfPublication: 23234ecf-5310-49dc-844e-dce0635fe8b4
relation.isAuthorOfPublication: d167e11d-18cd-48fb-8b7e-19025f38ee5f
relation.isAuthorOfPublication: 8bb66e1d-4b9c-4c7a-8ce1-b4007086d236
relation.isAuthorOfPublication.latestForDiscovery: 81637d9e-323d-4f17-bf75-5cfd69ba87b3
source.bibliographicInfo.fromPage: 2715 [eng]
source.bibliographicInfo.toPage: 2720 [eng]
source.contributor.editor: Calzolari, Nicoletta
source.flag.etalEditor: true [eng]
source.publisher: ELRA [eng]
source.publisher.location: Paris [eng]
source.title: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) [eng]

Files

Original bundle

Name: Kalouli_2-96dkhplnq3224.pdf
Size: 153.73 KB
Format: Adobe Portable Document Format
Description: Kalouli_2-96dkhplnq3224.pdf
Downloads: 382

License bundle

Name: license.txt
Size: 3.88 KB
Format: Item-specific license agreed upon to submission
Description: license.txt
Downloads: 0