Reducing computational effort for plagiarism detection by using citation characteristics to limit retrieval space
| dc.contributor.author | Meuschke, Norman | |
| dc.contributor.author | Gipp, Bela | |
| dc.date.accessioned | 2015-03-16T15:48:08Z | |
| dc.date.available | 2015-03-16T15:48:08Z | |
| dc.date.issued | 2014 | eng |
| dc.description.abstract | This paper proposes a hybrid approach to plagiarism detection in academic documents that integrates detection methods using citations, semantic argument structure, and semantic word similarity with character-based methods to achieve a higher detection performance for disguised plagiarism forms. Currently available software for plagiarism detection exclusively performs text string comparisons. These systems find copies, but fail to identify disguised plagiarism, such as paraphrases, translations, or idea plagiarism. Detection approaches that consider semantic similarity on word and sentence level exist and have consistently achieved higher detection accuracy for disguised plagiarism forms compared to character-based approaches. However, the high computational effort of these semantic approaches makes them infeasible for use in real-world plagiarism detection scenarios. The proposed hybrid approach uses citation-based methods as a preliminary heuristic to reduce the retrieval space with a relatively low loss in detection accuracy. This preliminary step can then be followed by a computationally more expensive semantic and character-based analysis. We show that such a hybrid approach allows semantic plagiarism detection to become feasible even on large collections for the first time. | eng |
| dc.description.version | published | |
| dc.identifier.doi | 10.1109/JCDL.2014.6970168 | eng |
| dc.identifier.uri | http://kops.uni-konstanz.de/handle/123456789/30317 | |
| dc.language.iso | eng | eng |
| dc.subject | Citation Analysis, Disguised Plagiarism, Information Retrieval, Large Scale Collections, Plagiarism Detection, Semantic Analysis | eng |
| dc.subject.ddc | 004 | eng |
| dc.title | Reducing computational effort for plagiarism detection by using citation characteristics to limit retrieval space | eng |
| dc.type | INPROCEEDINGS | eng |
| dspace.entity.type | Publication | |
| kops.citation.bibtex | @inproceedings{Meuschke2014Reduc-30317,
year={2014},
doi={10.1109/JCDL.2014.6970168},
title={Reducing computational effort for plagiarism detection by using citation characteristics to limit retrieval space},
isbn={978-1-4799-5569-5},
publisher={IEEE},
address={Piscataway, NJ},
booktitle={Proceedings of the 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL) : 8th - 12th September 2014, City University London, United Kingdom},
pages={197--200},
author={Meuschke, Norman and Gipp, Bela}
} | |
| kops.citation.iso690 | MEUSCHKE, Norman, Bela GIPP, 2014. Reducing computational effort for plagiarism detection by using citation characteristics to limit retrieval space. 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL). London, 8. Sept. 2014 - 12. Sept. 2014. In: Proceedings of the 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL) : 8th - 12th September 2014, City University London, United Kingdom. Piscataway, NJ: IEEE, 2014, pp. 197-200. ISBN 978-1-4799-5569-5. Available under: doi: 10.1109/JCDL.2014.6970168 | deu |
| kops.citation.iso690 | MEUSCHKE, Norman, Bela GIPP, 2014. Reducing computational effort for plagiarism detection by using citation characteristics to limit retrieval space. 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL). London, Sep 8, 2014 - Sep 12, 2014. In: Proceedings of the 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL) : 8th - 12th September 2014, City University London, United Kingdom. Piscataway, NJ: IEEE, 2014, pp. 197-200. ISBN 978-1-4799-5569-5. Available under: doi: 10.1109/JCDL.2014.6970168 | eng |
| kops.citation.rdf | <rdf:RDF
xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:bibo="http://purl.org/ontology/bibo/"
xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:void="http://rdfs.org/ns/void#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#" >
<rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/30317">
<foaf:homepage rdf:resource="http://localhost:8080/"/>
<dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2015-03-16T15:48:08Z</dc:date>
<dc:creator>Meuschke, Norman</dc:creator>
<dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2015-03-16T15:48:08Z</dcterms:available>
<dc:language>eng</dc:language>
<dcterms:abstract xml:lang="eng">This paper proposes a hybrid approach to plagiarism detection in academic documents that integrates detection methods using citations, semantic argument structure, and semantic word similarity with character-based methods to achieve a higher detection performance for disguised plagiarism forms. Currently available software for plagiarism detection exclusively performs text string comparisons. These systems find copies, but fail to identify disguised plagiarism, such as paraphrases, translations, or idea plagiarism. Detection approaches that consider semantic similarity on word and sentence level exist and have consistently achieved higher detection accuracy for disguised plagiarism forms compared to character-based approaches. However, the high computational effort of these semantic approaches makes them infeasible for use in real-world plagiarism detection scenarios. The proposed hybrid approach uses citation-based methods as a preliminary heuristic to reduce the retrieval space with a relatively low loss in detection accuracy. This preliminary step can then be followed by a computationally more expensive semantic and character-based analysis. We show that such a hybrid approach allows semantic plagiarism detection to become feasible even on large collections for the first time.</dcterms:abstract>
<void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
<dc:contributor>Gipp, Bela</dc:contributor>
<dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
<dcterms:issued>2014</dcterms:issued>
<dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
<bibo:uri rdf:resource="http://kops.uni-konstanz.de/handle/123456789/30317"/>
<dc:creator>Gipp, Bela</dc:creator>
<dc:contributor>Meuschke, Norman</dc:contributor>
<dcterms:title>Reducing computational effort for plagiarism detection by using citation characteristics to limit retrieval space</dcterms:title>
</rdf:Description>
</rdf:RDF> | |
| kops.conferencefield | 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL), 8. Sept. 2014 - 12. Sept. 2014, London | deu |
| kops.date.conferenceEnd | 2014-09-12 | eng |
| kops.date.conferenceStart | 2014-09-08 | eng |
| kops.flag.knbibliography | false | |
| kops.location.conference | London | eng |
| kops.sourcefield | <i>Proceedings of the 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL) : 8th - 12th September 2014, City University London, United Kingdom</i>. Piscataway, NJ: IEEE, 2014, pp. 197-200. ISBN 978-1-4799-5569-5. Available under: doi: 10.1109/JCDL.2014.6970168 | deu |
| kops.sourcefield.plain | Proceedings of the 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL) : 8th - 12th September 2014, City University London, United Kingdom. Piscataway, NJ: IEEE, 2014, pp. 197-200. ISBN 978-1-4799-5569-5. Available under: doi: 10.1109/JCDL.2014.6970168 | deu |
| kops.sourcefield.plain | Proceedings of the 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL) : 8th - 12th September 2014, City University London, United Kingdom. Piscataway, NJ: IEEE, 2014, pp. 197-200. ISBN 978-1-4799-5569-5. Available under: doi: 10.1109/JCDL.2014.6970168 | eng |
| kops.title.conference | 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL) | eng |
| relation.isAuthorOfPublication | e3f81adb-a670-4c4c-bade-6781b8f996b0 | |
| relation.isAuthorOfPublication | 358ad52f-dab7-4582-bf8e-8adcf477a2d4 | |
| relation.isAuthorOfPublication.latestForDiscovery | e3f81adb-a670-4c4c-bade-6781b8f996b0 | |
| source.bibliographicInfo.fromPage | 197 | eng |
| source.bibliographicInfo.toPage | 200 | eng |
| source.identifier.isbn | 978-1-4799-5569-5 | eng |
| source.publisher | IEEE | eng |
| source.publisher.location | Piscataway, NJ | eng |
| source.title | Proceedings of the 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL) : 8th - 12th September 2014, City University London, United Kingdom | eng |
| temp.internal.duplicates | No duplicates found. Last check: 05.03.2015 11:11:02 | eng |