Mining fault-tolerant item sets using subset size occurrence distributions

Date
2011
Project
BISON, RTD Forschungsprojekt
Publication type
Contribution to a conference collection
Published in
Advances in Intelligent Data Analysis X / Gama, João; Bradley, Elizabeth; Hollmén, Jaakko (ed.). - Berlin, Heidelberg : Springer Berlin Heidelberg, 2011. - (Lecture Notes in Computer Science ; 7014). - pp. 43-54. - ISBN 978-3-642-24799-6
Abstract
Mining fault-tolerant (or approximate or fuzzy) item sets means to allow for errors in the underlying transaction data in the sense that actually present items may not be recorded due to noise or measurement errors. In order to cope with such missing items, transactions that do not contain all items of a given set are still allowed to support it. However, either the number of missing items must be limited, or the transaction's contribution to the item set's support is reduced in proportion to the number of missing items, or both. In this paper we present an algorithm that efficiently computes the subset size occurrence distribution of item sets, evaluates this distribution to find fault-tolerant item sets, and exploits intermediate data to remove pseudo (or spurious) item sets. We demonstrate the usefulness of our algorithm by applying it to a concept detection task on the 2008/2009 Wikipedia Selection for schools.
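The two ideas in the abstract, counting how many items of a set each transaction contains and then letting incomplete transactions contribute to the support, can be illustrated with a small sketch. This is a naive, direct computation and not the efficient algorithm presented in the paper. The function names, the `max_missing` limit, and the linear `penalty` weighting are assumptions chosen for illustration only.

```python
from collections import Counter


def subset_size_distribution(transactions, item_set):
    """For each k, count the transactions that contain exactly k items of item_set."""
    items = frozenset(item_set)
    return Counter(len(items & set(t)) for t in transactions)


def fault_tolerant_support(dist, set_size, max_missing, penalty=0.5):
    """Evaluate a subset size occurrence distribution.

    Transactions missing more than max_missing items are ignored.
    A transaction missing m items contributes max(0, 1 - penalty * m);
    this linear weighting is a hypothetical choice, not taken from the paper.
    """
    support = 0.0
    for k, count in dist.items():
        missing = set_size - k
        if missing <= max_missing:
            support += count * max(0.0, 1.0 - penalty * missing)
    return support


# Toy transaction database (made-up data).
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c", "d"},
    {"b"},
    {"d"},
]
item_set = {"a", "b", "c"}

dist = subset_size_distribution(transactions, item_set)
exact = fault_tolerant_support(dist, len(item_set), max_missing=0)
tolerant = fault_tolerant_support(dist, len(item_set), max_missing=1)
print(dict(dist), exact, tolerant)
```

Setting `max_missing=0` reduces the computation to the standard (exact) support, while `penalty=0.0` lets every admitted transaction count fully, which corresponds to limiting the number of missing items without down-weighting.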
Subject (DDC)
004 Computer Science
Keywords
Data Mining, Frequent Pattern Mining
Cite This
ISO 690
BORGELT, Christian, Tobias KÖTTER, 2011. Mining fault-tolerant item sets using subset size occurrence distributions. In: GAMA, João, ed., Elizabeth BRADLEY, ed., Jaakko HOLLMÉN, ed. Advances in Intelligent Data Analysis X. Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 43-54. ISBN 978-3-642-24799-6. Available under: doi: 10.1007/978-3-642-24800-9_7
BibTex
@inproceedings{Borgelt2011Minin-15342,
  year={2011},
  doi={10.1007/978-3-642-24800-9_7},
  title={Mining fault-tolerant item sets using subset size occurrence distributions},
  number={7014},
  isbn={978-3-642-24799-6},
  publisher={Springer Berlin Heidelberg},
  address={Berlin, Heidelberg},
  series={Lecture Notes in Computer Science},
  booktitle={Advances in Intelligent Data Analysis X},
  pages={43--54},
  editor={Gama, João and Bradley, Elizabeth and Hollmén, Jaakko},
  author={Borgelt, Christian and Kötter, Tobias}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/15342">
    <dcterms:abstract xml:lang="eng">Mining fault-tolerant (or approximate or fuzzy) item sets means to allow for errors in the underlying transaction data in the sense that actually present items may not be recorded due to noise or measurement errors. In order to cope with such missing items, transactions that do not contain all items of a given set are still allowed to support it. However, either the number of missing items must be limited, or the transaction's contribution to the item set's support is reduced in proportion to the number of missing items, or both. In this paper we present an algorithm that efficiently computes the subset size occurrence distribution of item sets, evaluates this distribution to find fault-tolerant item sets, and exploits intermediate data to remove pseudo (or spurious) item sets. We demonstrate the usefulness of our algorithm by applying it to a concept detection task on the 2008/2009 Wikipedia Selection for schools.</dcterms:abstract>
    <dc:contributor>Kötter, Tobias</dc:contributor>
    <dc:language>eng</dc:language>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-10-31T23:25:04Z</dcterms:available>
    <dc:creator>Kötter, Tobias</dc:creator>
    <dc:contributor>Borgelt, Christian</dc:contributor>
    <dcterms:title>Mining fault-tolerant item sets using subset size occurrence distributions</dcterms:title>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dcterms:bibliographicCitation>First publ. in: 10th international symposium, IDA 2011, Porto, Portugal, October 29 - 31, 2011; proceedings / João Gama ... (eds.). - Berlin : Springer, 2011. - pp. 43-54. - (Lecture notes in computer science ; 7014). - ISBN 978-3-642-24799-6</dcterms:bibliographicCitation>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/15342/2/Borgelt_Final.pdf"/>
    <dc:creator>Borgelt, Christian</dc:creator>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <bibo:uri rdf:resource="http://kops.uni-konstanz.de/handle/123456789/15342"/>
    <dcterms:issued>2011</dcterms:issued>
    <dc:rights>terms-of-use</dc:rights>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2011-12-14T08:44:15Z</dc:date>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/15342/2/Borgelt_Final.pdf"/>
  </rdf:Description>
</rdf:RDF>
Bibliography of Konstanz
Yes