Part of Speech Based Term Weighting for Information Retrieval
| dc.contributor.author | Lioma, Christina | deu |
| dc.contributor.author | Blanco, Roi | deu |
| dc.date.accessioned | 2011-03-23T09:58:43Z | deu |
| dc.date.available | 2011-03-23T09:58:43Z | deu |
| dc.date.issued | 2009 | |
| dc.description.abstract | Automatic language processing tools typically assign to terms so-called 'weights' corresponding to the contribution of terms to information content. Traditionally, term weights are computed from lexical statistics, e.g., term frequencies. We propose a new type of term weight that is computed from part of speech (POS) n-gram statistics. The proposed POS-based term weight represents how informative a term is in general, based on the 'POS contexts' in which it generally occurs in language. We suggest five different computations of POS-based term weights by extending existing statistical approximations of term information measures. We apply these POS-based term weights to information retrieval by integrating them into the model that matches documents to queries. Experiments with two TREC collections and 300 queries, using TF-IDF and BM25 as baselines, show that integrating our POS-based term weights into retrieval always leads to gains (up to +33.7% over the baseline). Additional experiments with a different retrieval model as baseline (Language Model with Dirichlet priors smoothing) and our best-performing POS-based term weight show consistent retrieval gains across the whole smoothing range of the baseline. | eng |
| dc.description.version | published | |
| dc.identifier.citation | Publ. in: Advances in information retrieval: 31st European Conference on IR Research, ECIR 2009, Toulouse, France, April 6 - 9, 2009; proceedings / Mohand Boughanem ... (eds.). (= LNCS ; 5478) Berlin: Springer, 2009, pp. 412-423 | deu |
| dc.identifier.doi | 10.1007/978-3-642-00958-7_37 | |
| dc.identifier.uri | http://kops.uni-konstanz.de/handle/123456789/2664 | |
| dc.language.iso | eng | deu |
| dc.legacy.dateIssued | 2010 | deu |
| dc.rights | terms-of-use | deu |
| dc.rights.uri | https://rightsstatements.org/page/InC/1.0/ | deu |
| dc.subject.ddc | 400 | deu |
| dc.title | Part of Speech Based Term Weighting for Information Retrieval | eng |
| dc.type | INPROCEEDINGS | deu |
| dspace.entity.type | Publication | |
| kops.citation.bibtex | @inproceedings{Lioma2009Speec-2664,
year={2009},
doi={10.1007/978-3-642-00958-7_37},
title={Part of Speech Based Term Weighting for Information Retrieval},
number={5478},
isbn={978-3-642-00957-0},
publisher={Springer},
address={Berlin},
series={Lecture Notes in Computer Science},
booktitle={Advances in Information Retrieval},
pages={412--423},
editor={Boughanem, Mohand and Berrut, Catherine and Mothe, Josiane and Soule-Dupuy, Chantal},
author={Lioma, Christina and Blanco, Roi}
} | |
| kops.citation.iso690 | LIOMA, Christina, Roi BLANCO, 2009. Part of Speech Based Term Weighting for Information Retrieval. In: BOUGHANEM, Mohand, ed., Catherine BERRUT, ed., Josiane MOTHE, ed., Chantal SOULE-DUPUY, ed. Advances in Information Retrieval. Berlin: Springer, 2009, pp. 412-423. Lecture Notes in Computer Science. 5478. ISBN 978-3-642-00957-0. Available under: doi: 10.1007/978-3-642-00958-7_37 | deu |
| kops.citation.iso690 | LIOMA, Christina, Roi BLANCO, 2009. Part of Speech Based Term Weighting for Information Retrieval. In: BOUGHANEM, Mohand, ed., Catherine BERRUT, ed., Josiane MOTHE, ed., Chantal SOULE-DUPUY, ed. Advances in Information Retrieval. Berlin: Springer, 2009, pp. 412-423. Lecture Notes in Computer Science. 5478. ISBN 978-3-642-00957-0. Available under: doi: 10.1007/978-3-642-00958-7_37 | eng |
| kops.citation.rdf | <rdf:RDF
xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:bibo="http://purl.org/ontology/bibo/"
xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:void="http://rdfs.org/ns/void#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#" >
<rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/2664">
<dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/45"/>
<dc:contributor>Lioma, Christina</dc:contributor>
<dc:language>eng</dc:language>
<dc:rights>terms-of-use</dc:rights>
<dc:contributor>Blanco, Roi</dc:contributor>
<dcterms:abstract xml:lang="eng">Automatic language processing tools typically assign to terms so-called 'weights' corresponding to the contribution of terms to information content. Traditionally, term weights are computed from lexical statistics, e.g., term frequencies. We propose a new type of term weight that is computed from part of speech (POS) n-gram statistics. The proposed POS-based term weight represents how informative a term is in general, based on the 'POS contexts' in which it generally occurs in language. We suggest five different computations of POS-based term weights by extending existing statistical approximations of term information measures. We apply these POS-based term weights to information retrieval by integrating them into the model that matches documents to queries. Experiments with two TREC collections and 300 queries, using TF-IDF and BM25 as baselines, show that integrating our POS-based term weights into retrieval always leads to gains (up to +33.7% over the baseline). Additional experiments with a different retrieval model as baseline (Language Model with Dirichlet priors smoothing) and our best-performing POS-based term weight show consistent retrieval gains across the whole smoothing range of the baseline.</dcterms:abstract>
<dcterms:title>Part of Speech Based Term Weighting for Information Retrieval</dcterms:title>
<dcterms:issued>2009</dcterms:issued>
<dc:creator>Blanco, Roi</dc:creator>
<void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
<dcterms:bibliographicCitation>Publ. in: Advances in information retrieval: 31st European Conference on IR Research, ECIR 2009, Toulouse, France, April 6 - 9, 2009; proceedings / Mohand Boughanem ... (eds.). (= LNCS ; 5478) Berlin: Springer, 2009, pp. 412-423</dcterms:bibliographicCitation>
<bibo:uri rdf:resource="http://kops.uni-konstanz.de/handle/123456789/2664"/>
<dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
<dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/45"/>
<dc:creator>Lioma, Christina</dc:creator>
<dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2011-03-23T09:58:43Z</dc:date>
<foaf:homepage rdf:resource="http://localhost:8080/"/>
<dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2011-03-23T09:58:43Z</dcterms:available>
</rdf:Description>
</rdf:RDF> | |
| kops.identifier.nbn | urn:nbn:de:bsz:352-opus-110483 | deu |
| kops.opus.id | 11048 | deu |
| kops.sourcefield | BOUGHANEM, Mohand, ed., Catherine BERRUT, ed., Josiane MOTHE, ed., Chantal SOULE-DUPUY, ed. <i>Advances in Information Retrieval</i>. Berlin: Springer, 2009, pp. 412-423. Lecture Notes in Computer Science. 5478. ISBN 978-3-642-00957-0. Available under: doi: 10.1007/978-3-642-00958-7_37 | deu |
| kops.sourcefield.plain | BOUGHANEM, Mohand, ed., Catherine BERRUT, ed., Josiane MOTHE, ed., Chantal SOULE-DUPUY, ed. Advances in Information Retrieval. Berlin: Springer, 2009, pp. 412-423. Lecture Notes in Computer Science. 5478. ISBN 978-3-642-00957-0. Available under: doi: 10.1007/978-3-642-00958-7_37 | deu |
| kops.sourcefield.plain | BOUGHANEM, Mohand, ed., Catherine BERRUT, ed., Josiane MOTHE, ed., Chantal SOULE-DUPUY, ed. Advances in Information Retrieval. Berlin: Springer, 2009, pp. 412-423. Lecture Notes in Computer Science. 5478. ISBN 978-3-642-00957-0. Available under: doi: 10.1007/978-3-642-00958-7_37 | eng |
| source.bibliographicInfo.fromPage | 412 | |
| source.bibliographicInfo.seriesNumber | 5478 | |
| source.bibliographicInfo.toPage | 423 | |
| source.contributor.editor | Boughanem, Mohand | |
| source.contributor.editor | Berrut, Catherine | |
| source.contributor.editor | Mothe, Josiane | |
| source.contributor.editor | Soule-Dupuy, Chantal | |
| source.identifier.isbn | 978-3-642-00957-0 | |
| source.publisher | Springer | |
| source.publisher.location | Berlin | |
| source.relation.ispartofseries | Lecture Notes in Computer Science | |
| source.title | Advances in Information Retrieval |