Reducing computational effort for plagiarism detection by using citation characteristics to limit retrieval space

dc.contributor.author: Meuschke, Norman
dc.contributor.author: Gipp, Bela
dc.date.accessioned: 2015-03-16T15:48:08Z
dc.date.available: 2015-03-16T15:48:08Z
dc.date.issued: 2014 [eng]
dc.description.abstract: This paper proposes a hybrid approach to plagiarism detection in academic documents that integrates detection methods using citations, semantic argument structure, and semantic word similarity with character-based methods to achieve a higher detection performance for disguised plagiarism forms. Currently available software for plagiarism detection exclusively performs text string comparisons. These systems find copies, but fail to identify disguised plagiarism, such as paraphrases, translations, or idea plagiarism. Detection approaches that consider semantic similarity on word and sentence level exist and have consistently achieved higher detection accuracy for disguised plagiarism forms compared to character-based approaches. However, the high computational effort of these semantic approaches makes them infeasible for use in real-world plagiarism detection scenarios. The proposed hybrid approach uses citation-based methods as a preliminary heuristic to reduce the retrieval space with a relatively low loss in detection accuracy. This preliminary step can then be followed by a computationally more expensive semantic and character-based analysis. We show that such a hybrid approach allows semantic plagiarism detection to become feasible even on large collections for the first time. [eng]
dc.description.version: published
dc.identifier.doi: 10.1109/JCDL.2014.6970168 [eng]
dc.identifier.uri: http://kops.uni-konstanz.de/handle/123456789/30317
dc.language.iso: eng [eng]
dc.subject: Citation Analysis, Disguised Plagiarism, Information Retrieval, Large Scale Collections, Plagiarism Detection, Semantic Analysis [eng]
dc.subject.ddc: 004 [eng]
dc.title: Reducing computational effort for plagiarism detection by using citation characteristics to limit retrieval space [eng]
dc.type: INPROCEEDINGS [eng]
dspace.entity.type: Publication
kops.citation.bibtex:
@inproceedings{Meuschke2014Reduc-30317,
  year={2014},
  doi={10.1109/JCDL.2014.6970168},
  title={Reducing computational effort for plagiarism detection by using citation characteristics to limit retrieval space},
  isbn={978-1-4799-5569-5},
  publisher={IEEE},
  address={Piscataway, NJ},
  booktitle={Proceedings of the 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL) : 8th - 12th September 2014, City University London, United Kingdom},
  pages={197--200},
  author={Meuschke, Norman and Gipp, Bela}
}
kops.citation.iso690: MEUSCHKE, Norman, Bela GIPP, 2014. Reducing computational effort for plagiarism detection by using citation characteristics to limit retrieval space. 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL). London, Sep 8, 2014 - Sep 12, 2014. In: Proceedings of the 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL) : 8th - 12th September 2014, City University London, United Kingdom. Piscataway, NJ: IEEE, 2014, pp. 197-200. ISBN 978-1-4799-5569-5. Available under: doi: 10.1109/JCDL.2014.6970168 [eng]
kops.citation.rdf:
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/30317">
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2015-03-16T15:48:08Z</dc:date>
    <dc:creator>Meuschke, Norman</dc:creator>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2015-03-16T15:48:08Z</dcterms:available>
    <dc:language>eng</dc:language>
    <dcterms:abstract xml:lang="eng">This paper proposes a hybrid approach to plagiarism detection in academic documents that integrates detection methods using citations, semantic argument structure, and semantic word similarity with character-based methods to achieve a higher detection performance for disguised plagiarism forms. Currently available software for plagiarism detection exclusively performs text string comparisons. These systems find copies, but fail to identify disguised plagiarism, such as paraphrases, translations, or idea plagiarism. Detection approaches that consider semantic similarity on word and sentence level exist and have consistently achieved higher detection accuracy for disguised plagiarism forms compared to character-based approaches. However, the high computational effort of these semantic approaches makes them infeasible for use in real-world plagiarism detection scenarios. The proposed hybrid approach uses citation-based methods as a preliminary heuristic to reduce the retrieval space with a relatively low loss in detection accuracy. This preliminary step can then be followed by a computationally more expensive semantic and character-based analysis. We show that such a hybrid approach allows semantic plagiarism detection to become feasible even on large collections for the first time.</dcterms:abstract>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:contributor>Gipp, Bela</dc:contributor>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:issued>2014</dcterms:issued>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <bibo:uri rdf:resource="http://kops.uni-konstanz.de/handle/123456789/30317"/>
    <dc:creator>Gipp, Bela</dc:creator>
    <dc:contributor>Meuschke, Norman</dc:contributor>
    <dcterms:title>Reducing computational effort for plagiarism detection by using citation characteristics to limit retrieval space</dcterms:title>
  </rdf:Description>
</rdf:RDF>
kops.conferencefield: 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL), 8. Sept. 2014 - 12. Sept. 2014, London [deu]
kops.date.conferenceEnd: 2014-09-12 [eng]
kops.date.conferenceStart: 2014-09-08 [eng]
kops.flag.knbibliography: false
kops.location.conference: London [eng]
kops.sourcefield: <i>Proceedings of the 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL) : 8th - 12th September 2014, City University London, United Kingdom</i>. Piscataway, NJ: IEEE, 2014, pp. 197-200. ISBN 978-1-4799-5569-5. Available under: doi: 10.1109/JCDL.2014.6970168 [deu]
kops.sourcefield.plain: Proceedings of the 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL) : 8th - 12th September 2014, City University London, United Kingdom. Piscataway, NJ: IEEE, 2014, pp. 197-200. ISBN 978-1-4799-5569-5. Available under: doi: 10.1109/JCDL.2014.6970168 [deu]
kops.sourcefield.plain: Proceedings of the 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL) : 8th - 12th September 2014, City University London, United Kingdom. Piscataway, NJ: IEEE, 2014, pp. 197-200. ISBN 978-1-4799-5569-5. Available under: doi: 10.1109/JCDL.2014.6970168 [eng]
kops.title.conference: 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL) [eng]
relation.isAuthorOfPublication: e3f81adb-a670-4c4c-bade-6781b8f996b0
relation.isAuthorOfPublication: 358ad52f-dab7-4582-bf8e-8adcf477a2d4
relation.isAuthorOfPublication.latestForDiscovery: e3f81adb-a670-4c4c-bade-6781b8f996b0
source.bibliographicInfo.fromPage: 197 [eng]
source.bibliographicInfo.toPage: 200 [eng]
source.identifier.isbn: 978-1-4799-5569-5 [eng]
source.publisher: IEEE [eng]
source.publisher.location: Piscataway, NJ [eng]
source.title: Proceedings of the 2014 IEEE/ACM Joint Conference on Digital Libraries (JCDL) : 8th - 12th September 2014, City University London, United Kingdom [eng]
temp.internal.duplicates: No duplicates found. Last check: 05.03.2015 11:11:02
