Publication:

Evaluation of header metadata extraction approaches and tools for scientific PDF documents

Files

Lipinski_0-285622.pdf (size: 690.58 KB, downloads: 1384)

Date

2013

Authors

Lipinski, Mario
Yao, Kevin
Breitinger, Corinna
Beel, Joeran
Gipp, Bela

Open access publication
Open Access Green

Publication type
Contribution to conference proceedings
Publication status
Published

Published in

J. STEPHEN DOWNIE, ed. Proceedings of the 13th ACM/IEEE-CS joint conference on Digital libraries. New York: ACM, 2013, pp. 385-386. ISBN 978-1-4503-2077-1. Available under: doi: 10.1145/2467696.2467753

Abstract

This paper evaluates the performance of tools for the extraction of metadata from scientific articles. Accurate metadata extraction is an important task for automating the management of digital libraries. This comparative study is a guide for developers looking to integrate the most suitable and effective metadata extraction tool into their software. We shed light on the strengths and weaknesses of seven tools in common use. In our evaluation using papers from the arXiv collection, GROBID delivered the best results, followed by Mendeley Desktop. SciPlore Xtract, PDFMeat, and SVMHeaderParse also delivered good results depending on the metadata type to be extracted.

Subject (DDC)
004 Computer science

Keywords

Information Retrieval, Metadata Extraction, Evaluation, PDF

Conference

JCDL '13, July 22-26, 2013, Indianapolis

Cite

ISO 690
LIPINSKI, Mario, Kevin YAO, Corinna BREITINGER, Joeran BEEL, Bela GIPP, 2013. Evaluation of header metadata extraction approaches and tools for scientific PDF documents. JCDL '13. Indianapolis, July 22-26, 2013. In: J. STEPHEN DOWNIE, ed. Proceedings of the 13th ACM/IEEE-CS joint conference on Digital libraries. New York: ACM, 2013, pp. 385-386. ISBN 978-1-4503-2077-1. Available under: doi: 10.1145/2467696.2467753
BibTeX
@inproceedings{Lipinski2013Evalu-30950,
  year={2013},
  doi={10.1145/2467696.2467753},
  title={Evaluation of header metadata extraction approaches and tools for scientific PDF documents},
  isbn={978-1-4503-2077-1},
  publisher={ACM},
  address={New York},
  booktitle={Proceedings of the 13th ACM/IEEE-CS joint conference on Digital libraries},
  pages={385--386},
  editor={J. Stephen Downie},
  author={Lipinski, Mario and Yao, Kevin and Breitinger, Corinna and Beel, Joeran and Gipp, Bela}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/30950">
    <dc:contributor>Yao, Kevin</dc:contributor>
    <dcterms:abstract xml:lang="eng">This paper evaluates the performance of tools for the extraction of metadata from scientific articles. Accurate metadata extraction is an important task for automating the management of digital libraries. This comparative study is a guide for developers looking to integrate the most suitable and effective metadata extraction tool into their software. We shed light on the strengths and weaknesses of seven tools in common use. In our evaluation using papers from the arXiv collection, GROBID delivered the best results, followed by Mendeley Desktop. SciPlore Xtract, PDFMeat, and SVMHeaderParse also delivered good results depending on the metadata type to be extracted.</dcterms:abstract>
    <dcterms:issued>2013</dcterms:issued>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dc:rights>terms-of-use</dc:rights>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/30950/1/Lipinski_0-285622.pdf"/>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2015-05-13T09:26:12Z</dcterms:available>
    <dc:contributor>Beel, Joeran</dc:contributor>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:title>Evaluation of header metadata extraction approaches and tools for scientific PDF documents</dcterms:title>
    <bibo:uri rdf:resource="http://kops.uni-konstanz.de/handle/123456789/30950"/>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2015-05-13T09:26:12Z</dc:date>
    <dc:contributor>Gipp, Bela</dc:contributor>
    <dc:contributor>Breitinger, Corinna</dc:contributor>
    <dc:contributor>Lipinski, Mario</dc:contributor>
    <dc:creator>Breitinger, Corinna</dc:creator>
    <dc:creator>Gipp, Bela</dc:creator>
    <dc:language>eng</dc:language>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/30950/1/Lipinski_0-285622.pdf"/>
    <dc:creator>Beel, Joeran</dc:creator>
    <dc:creator>Yao, Kevin</dc:creator>
    <dc:creator>Lipinski, Mario</dc:creator>
  </rdf:Description>
</rdf:RDF>

University bibliography
No