KOPS - The Institutional Repository of the University of Konstanz

Mixed monolingual homepage finding in 34 languages : the role of language script and search domain

BLANCO, Roi, Christina LIOMA, 2009. Mixed monolingual homepage finding in 34 languages : the role of language script and search domain. In: Information retrieval. 12(3), pp. 324-351. ISSN 1386-4564. eISSN 1573-7659. Available under: doi: 10.1007/s10791-008-9082-8

@article{Blanco2009Mixed-2720,
  title={Mixed monolingual homepage finding in 34 languages : the role of language script and search domain},
  year={2009},
  doi={10.1007/s10791-008-9082-8},
  number={3},
  volume={12},
  issn={1386-4564},
  journal={Information retrieval},
  pages={324--351},
  author={Blanco, Roi and Lioma, Christina}
}

<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#">
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/rdf/resource/123456789/2720">
    <dcterms:bibliographicCitation>Publ. in: Information retrieval 12 (2009), 3, pp. 324-351</dcterms:bibliographicCitation>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/rdf/resource/123456789/45"/>
    <bibo:uri rdf:resource="http://kops.uni-konstanz.de/handle/123456789/2720"/>
    <dc:language>eng</dc:language>
    <dcterms:abstract xml:lang="eng">The information that is available or sought on the World Wide Web (Web) is increasingly multilingual. Information Retrieval systems, such as the freely available search engines on the Web, need to provide fair and equal access to this information, regardless of the language in which a query is written or where the query is posted from. In this work, we ask two questions: How do existing state of the art search engines deal with languages written in different alphabets (scripts)? Do local language-based search domains actually facilitate access to information? We conduct a thorough study on the effect of multilingual queries for homepage finding, where the aim of the retrieval system is to return only one document, namely the homepage described in the query. We evaluate the effect of multilingual queries in retrieval performance with regard to (i) the alphabet in which the queries are written (e.g., Latin, Russian, Arabic), and (ii) the language domain where the queries are posted (e.g., google.com, google.fr).
We query four major freely available search engines with 764 queries in 34 different languages, and look for the correct homepage in the top retrieved results. In order to have fair multilingual experimental settings, we use an ontology that is comparable across languages and also representative of realistic Web searches: football premier leagues in different countries; the official team name represents our query, and the official team homepage represents the document to be retrieved. A series of thorough experiments involving over 10,000 runs, with queries both in their correct and in Latin characters, and also using both global-domain and local-domain searches, reveal that queries issued in the correct script of a language are more likely to be found and ranked in the top 3, while queries in non-Latin script languages which are however issued in Latin script are less likely to be found; also, queries issued to the correct local domain of a search engine, e.g., French queries to yahoo.fr, are likely to have better retrieval performance than queries issued to the global domain of a search engine.
To our knowledge, this is the first Web retrieval study that uses such a wide range of languages.</dcterms:abstract>
    <dcterms:issued>2009</dcterms:issued>
    <dc:creator>Lioma, Christina</dc:creator>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2011-03-23T09:58:54Z</dc:date>
    <dc:rights>terms-of-use</dc:rights>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <foaf:homepage rdf:resource="http://localhost:8080/jspui"/>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/rdf/resource/123456789/45"/>
    <dcterms:title>Mixed monolingual homepage finding in 34 languages : the role of language script and search domain</dcterms:title>
    <dc:creator>Blanco, Roi</dc:creator>
    <dc:contributor>Lioma, Christina</dc:contributor>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:contributor>Blanco, Roi</dc:contributor>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2011-03-23T09:58:54Z</dcterms:available>
  </rdf:Description>
</rdf:RDF>