Mining fault-tolerant item sets using subset size occurrence distributions
Abstract
Mining fault-tolerant (or approximate or fuzzy) item sets means to allow for errors in the underlying transaction data in the sense that actually present items may not be recorded due to noise or measurement errors. In order to cope with such missing items, transactions that do not contain all items of a given set are still allowed to support it. However, either the number of missing items must be limited, or the transaction's contribution to the item set's support is reduced in proportion to the number of missing items, or both. In this paper we present an algorithm that efficiently computes the subset size occurrence distribution of item sets, evaluates this distribution to find fault-tolerant item sets, and exploits intermediate data to remove pseudo (or spurious) item sets. We demonstrate the usefulness of our algorithm by applying it to a concept detection task on the 2008/2009 Wikipedia Selection for schools.
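The two ideas named in the abstract, the subset size occurrence distribution and a support measure that tolerates missing items, can be illustrated with a small sketch. This is a naive illustration and not the efficient algorithm presented in the paper. The function names, the linear weighting scheme and the `penalty` parameter are assumptions made for this example only.

```python
from collections import Counter


def subset_size_distribution(transactions, item_set):
    """Return a list d where d[k] is the number of transactions that
    contain exactly k items of item_set (k = 0 .. len(item_set))."""
    items = frozenset(item_set)
    counts = Counter(len(items & set(t)) for t in transactions)
    return [counts.get(k, 0) for k in range(len(items) + 1)]


def fault_tolerant_support(distribution, max_missing=1, penalty=0.5):
    """Evaluate a subset size occurrence distribution.

    Transactions missing more than max_missing items are ignored.
    The remaining ones contribute a weight that shrinks linearly with
    the number of missing items (assumed scheme, not from the paper).
    """
    n = len(distribution) - 1  # size of the item set
    support = 0.0
    for missing in range(min(max_missing, n) + 1):
        weight = max(0.0, 1.0 - penalty * missing)
        support += weight * distribution[n - missing]
    return support


# Hypothetical toy data: five transactions over the items a, b, c, d.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c", "d"}, {"b"}, {"d"}]
dist = subset_size_distribution(transactions, {"a", "b", "c"})
print(dist)                                         # [1, 1, 2, 1]
print(fault_tolerant_support(dist, max_missing=1))  # 2.0
print(fault_tolerant_support(dist, max_missing=0))  # 1.0
```

With `max_missing=0` the measure reduces to the ordinary support (only the one transaction containing all three items counts). Allowing one missing item adds the two transactions that contain two of the three items, each at half weight.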
Cite
ISO 690
BORGELT, Christian, Tobias KÖTTER, 2011. Mining fault-tolerant item sets using subset size occurrence distributions. In: GAMA, João, ed., Elizabeth BRADLEY, ed., Jaakko HOLLMÉN, ed. Advances in Intelligent Data Analysis X. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011, pp. 43-54. Lecture Notes in Computer Science. 7014. ISBN 978-3-642-24799-6. Available under: doi: 10.1007/978-3-642-24800-9_7
BibTeX
@inproceedings{Borgelt2011Minin-15342,
  year={2011},
  doi={10.1007/978-3-642-24800-9_7},
  title={Mining fault-tolerant item sets using subset size occurrence distributions},
  number={7014},
  isbn={978-3-642-24799-6},
  publisher={Springer Berlin Heidelberg},
  address={Berlin, Heidelberg},
  series={Lecture Notes in Computer Science},
  booktitle={Advances in Intelligent Data Analysis X},
  pages={43--54},
  editor={Gama, João and Bradley, Elizabeth and Hollmén, Jaakko},
  author={Borgelt, Christian and Kötter, Tobias}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" >
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/15342">
    <dcterms:abstract xml:lang="eng">Mining fault-tolerant (or approximate or fuzzy) item sets means to allow for errors in the underlying transaction data in the sense that actually present items may not be recorded due to noise or measurement errors. In order to cope with such missing items, transactions that do not contain all items of a given set are still allowed to support it. However, either the number of missing items must be limited, or the transaction's contribution to the item set's support is reduced in proportion to the number of missing items, or both. In this paper we present an algorithm that efficiently computes the subset size occurrence distribution of item sets, evaluates this distribution to find fault-tolerant item sets, and exploits intermediate data to remove pseudo (or spurious) item sets. We demonstrate the usefulness of our algorithm by applying it to a concept detection task on the 2008/2009 Wikipedia Selection for schools.</dcterms:abstract>
    <dc:contributor>Kötter, Tobias</dc:contributor>
    <dc:language>eng</dc:language>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2012-10-31T23:25:04Z</dcterms:available>
    <dc:creator>Kötter, Tobias</dc:creator>
    <dc:contributor>Borgelt, Christian</dc:contributor>
    <dcterms:title>Mining fault-tolerant item sets using subset size occurrence distributions</dcterms:title>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dcterms:bibliographicCitation>First publ. in: 10th international symposium, IDA 2011, Porto, Portugal, October 29 - 31, 2011; proceedings / João Gama ... (eds.). - Berlin : Springer, 2011. - pp. 43-54. - (Lecture notes in computer science ; 7014). - ISBN 978-3-642-24799-6</dcterms:bibliographicCitation>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/15342/2/Borgelt_Final.pdf"/>
    <dc:creator>Borgelt, Christian</dc:creator>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <bibo:uri rdf:resource="http://kops.uni-konstanz.de/handle/123456789/15342"/>
    <dcterms:issued>2011</dcterms:issued>
    <dc:rights>terms-of-use</dc:rights>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2011-12-14T08:44:15Z</dc:date>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/15342/2/Borgelt_Final.pdf"/>
  </rdf:Description>
</rdf:RDF>