TF-IDuF : A Novel Term-Weighting Scheme for User Modeling based on Users’ Personal Document Collections

Date
2017
Authors
Beel, Joeran
Langer, Stefan
Gipp, Bela
Publication type
Contribution to a conference collection
Publication status
Published
Published in
Proceedings of the iConference 2017, Wuhan, China, 2017. - Urbana-Champaign : University of Illinois, 2017. - pp. 452-459
Abstract
TF-IDF is one of the most popular term-weighting schemes and is applied by search engines, recommender systems, and user-modeling engines. With regard to user modeling and recommender systems, we see two shortcomings of TF-IDF. First, calculating IDF requires access to the document corpus from which recommendations are made, and such access is not always available in a user-modeling or recommender system. Second, TF-IDF ignores information from a user’s personal document collection, which, we hypothesize, could enhance the user-modeling process. In this paper, we introduce TF-IDuF, a term-weighting scheme that does not require access to the general document corpus and that considers information from the users’ personal document collections. We evaluated the effectiveness of TF-IDuF against TF-IDF and TF-Only and found that TF-IDF and TF-IDuF perform similarly (click-through rates (CTR) of 5.09% vs. 5.14%), and both are around 25% more effective than TF-Only (CTR of 4.06%) for recommending research papers. Consequently, we conclude that TF-IDuF could be a promising term-weighting scheme, especially when access to the recommendation corpus is not possible and classic IDF therefore cannot be computed. Notably, TF-IDuF and TF-IDF are not mutually exclusive, so the two metrics may be combined into a more effective term-weighting scheme.
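The abstract does not state the weighting formula, so the following is only a minimal sketch of the idea: term frequency multiplied by an inverse document frequency computed over the user's personal document collection instead of the recommendation corpus. The function name `tf_iduf`, the unsmoothed `log(N / df)` form, and the choice to sum term frequency over the whole personal collection are assumptions made for illustration, not the authors' published definition.

```python
import math
from collections import Counter


def tf_iduf(user_docs):
    """Weight terms using only a user's personal document collection.

    user_docs: list of token lists, one per document in the collection.
    Returns a dict mapping each term to tf * log(N_u / n_u(term)), where
    N_u is the number of documents in the collection and n_u(term) is the
    number of those documents containing the term (assumed formula).
    """
    n_docs = len(user_docs)
    # Term frequency: total occurrences across the user's documents.
    tf = Counter(term for doc in user_docs for term in doc)
    # Document frequency inside the user's own collection (not the corpus).
    df = Counter(term for doc in user_docs for term in set(doc))
    return {term: tf[term] * math.log(n_docs / df[term]) for term in tf}


# Hypothetical personal collection of three tokenized documents.
example_docs = [
    ["tf", "idf", "user"],
    ["user", "model"],
    ["user", "tf", "tf"],
]
weights = tf_iduf(example_docs)
# "user" occurs in every document, so its weight is 0.0.
```

In this sketch a term that appears in every one of the user's documents receives weight zero, mirroring how classic IDF discounts terms that are common across the corpus. No access to the recommendation corpus is needed at any point.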
Subject (DDC)
004 Computer Science
Keywords
term weighting, user modeling, tf-idf, tf-iduf, recommender systems
Conference
iConference 2017 : March 22-25, 2017, Wuhan, China : Effect, Expand, Evolve : Global Collaboration across the Information Community, Mar 22, 2017 - Mar 25, 2017, Wuhan, China
Cite This
ISO 690
BEEL, Joeran, Stefan LANGER, Bela GIPP, 2017. TF-IDuF : A Novel Term-Weighting Scheme for User Modeling based on Users’ Personal Document Collections. iConference 2017 : March 22-25, 2017, Wuhan, China : Effect, Expand, Evolve : Global Collaboration across the Information Community. Wuhan, China, Mar 22, 2017 - Mar 25, 2017. In: Proceedings of the iConference 2017, Wuhan, China, 2017. Urbana-Champaign: University of Illinois, pp. 452-459. Available under: doi: 10.9776/17217
BibTex
@inproceedings{Beel2017TFIDu-41879,
  year={2017},
  doi={10.9776/17217},
  title={TF-IDuF : A Novel Term-Weighting Scheme for User Modeling based on Users’ Personal Document Collections},
  url={http://hdl.handle.net/2142/96756},
  publisher={University of Illinois},
  address={Urbana-Champaign},
  booktitle={Proceedings of the iConference 2017, Wuhan, China, 2017},
  pages={452--459},
  author={Beel, Joeran and Langer, Stefan and Gipp, Bela}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/41879">
    <dc:contributor>Langer, Stefan</dc:contributor>
    <dcterms:issued>2017</dcterms:issued>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/41879/1/Beel_2-nd5ei0v2m07d0.pdf"/>
    <dc:creator>Beel, Joeran</dc:creator>
    <dcterms:abstract xml:lang="eng">TF-IDF is one of the most popular term-weighting schemes, and is applied by search engines, recommender systems, and user modeling engines. With regard to user modeling and recommender systems, we see two shortcomings of TF-IDF. First, calculating IDF requires access to the document corpus from which recommendations are made. Such access is not always given in a user-modeling or recommender system. Second, TF-IDF ignores information from a user’s personal document collection, which could – so we hypothesize – enhance the user modeling process. In this paper, we introduce TF-IDuF as a term-weighting scheme that does not require access to the general document corpus and that considers information from the users’ personal document collections. We evaluated the effectiveness of TF-IDuF compared to TF-IDF and TF-Only and found that TF-IDF and TF-IDuF perform similarly (click-through rates (CTR) of 5.09% vs. 5.14%), and both are around 25% more effective than TF-Only (CTR of 4.06%) for recommending research papers. Consequently, we conclude that TF-IDuF could be a promising term-weighting scheme, especially when access to the document corpus for recommendations is not possible, and thus classic IDF cannot be computed. It is also notable that TF-IDuF and TF-IDF are not exclusive, so that both metrics may be combined to a more effective term-weighting scheme.</dcterms:abstract>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dc:rights>terms-of-use</dc:rights>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/41879/1/Beel_2-nd5ei0v2m07d0.pdf"/>
    <dc:contributor>Beel, Joeran</dc:contributor>
    <dc:creator>Gipp, Bela</dc:creator>
    <dc:creator>Langer, Stefan</dc:creator>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/41879"/>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2018-03-21T11:18:08Z</dcterms:available>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:language>eng</dc:language>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dc:contributor>Gipp, Bela</dc:contributor>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2018-03-21T11:18:08Z</dc:date>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dcterms:title>TF-IDuF : A Novel Term-Weighting Scheme for User Modeling based on Users’ Personal Document Collections</dcterms:title>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
  </rdf:Description>
</rdf:RDF>
Test date of URL
2018-03-20
Bibliography of Konstanz
Yes
Refereed