Improving the Representation and Conversion of Mathematical Formulae by Considering their Textual Context

dc.contributor.author: Schubotz, Moritz
dc.contributor.author: Greiner-Petter, André
dc.contributor.author: Scharpf, Philipp
dc.contributor.author: Meuschke, Norman
dc.contributor.author: Cohl, Howard S.
dc.contributor.author: Gipp, Bela
dc.date.accessioned: 2018-09-17T13:53:57Z
dc.date.available: 2018-09-17T13:53:57Z
dc.date.issued: 2018-04-13 [eng]
dc.description.abstract: Mathematical formulae represent complex semantic information in a concise form. Especially in Science, Technology, Engineering, and Mathematics, mathematical formulae are crucial to communicate information, e.g., in scientific papers, and to perform computations using computer algebra systems. Enabling computers to access the information encoded in mathematical formulae requires machine-readable formats that can represent both the presentation and content, i.e., the semantics, of formulae. Exchanging such information between systems additionally requires conversion methods for mathematical representation formats. We analyze how the semantic enrichment of formulae improves the format conversion process and show that considering the textual context of formulae reduces the error rate of such conversions. Our main contributions are: (1) providing an openly available benchmark dataset for the mathematical format conversion task consisting of a newly created test collection, an extensive, manually curated gold standard and task-specific evaluation metrics; (2) performing a quantitative evaluation of state-of-the-art tools for mathematical format conversions; (3) presenting a new approach that considers the textual context of formulae to reduce the error rate for mathematical format conversions. Our benchmark dataset facilitates future research on mathematical format conversions as well as research on many problems in mathematical information retrieval. Because we annotated and linked all components of formulae, e.g., identifiers, operators and other entities, to Wikidata entries, the gold standard can, for instance, be used to train methods for formula concept discovery and recognition. Such methods can then be applied to improve mathematical information retrieval systems, e.g., for semantic formula search, recommendation of mathematical content, or detection of mathematical plagiarism. [eng]
dc.description.version: published [eng]
dc.identifier.arxiv: 1804.04956 [eng]
dc.identifier.doi: 10.1145/3197026.3197058 [eng]
dc.identifier.ppn: 1664963669
dc.identifier.uri: https://kops.uni-konstanz.de/handle/123456789/43286
dc.language.iso: eng [eng]
dc.rights: terms-of-use
dc.rights.uri: https://rightsstatements.org/page/InC/1.0/
dc.subject.ddc: 004 [eng]
dc.title: Improving the Representation and Conversion of Mathematical Formulae by Considering their Textual Context [eng]
dc.type: INPROCEEDINGS [eng]
dspace.entity.type: Publication
kops.citation.bibtex
@inproceedings{Schubotz2018-04-13Impro-43286,
  year={2018},
  doi={10.1145/3197026.3197058},
  title={Improving the Representation and Conversion of Mathematical Formulae by Considering their Textual Context},
  isbn={978-1-4503-5178-2},
  publisher={ACM Press},
  address={New York},
  booktitle={Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries - JCDL '18},
  pages={233--242},
  editor={Chen, Jiangping},
  author={Schubotz, Moritz and Greiner-Petter, André and Scharpf, Philipp and Meuschke, Norman and Cohl, Howard S. and Gipp, Bela}
}
kops.citation.iso690: SCHUBOTZ, Moritz, André GREINER-PETTER, Philipp SCHARPF, Norman MEUSCHKE, Howard S. COHL, Bela GIPP, 2018. Improving the Representation and Conversion of Mathematical Formulae by Considering their Textual Context. 18th ACM/IEEE on Joint Conference on Digital Libraries. Fort Worth, USA, Jun 3, 2018 - Jun 7, 2018. In: CHEN, Jiangping, ed. and others. Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries - JCDL '18. New York: ACM Press, 2018, pp. 233-242. ISBN 978-1-4503-5178-2. Available under: doi: 10.1145/3197026.3197058 [eng]
kops.citation.rdf
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/43286">
    <dc:creator>Scharpf, Philipp</dc:creator>
    <dc:creator>Gipp, Bela</dc:creator>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dc:contributor>Gipp, Bela</dc:contributor>
    <dcterms:issued>2018-04-13</dcterms:issued>
    <dc:creator>Greiner-Petter, André</dc:creator>
    <dc:contributor>Scharpf, Philipp</dc:contributor>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/43286"/>
    <dcterms:abstract xml:lang="eng">Mathematical formulae represent complex semantic information in a concise form. Especially in Science, Technology, Engineering, and Mathematics, mathematical formulae are crucial to communicate information, e.g., in scientific papers, and to perform computations using computer algebra systems. Enabling computers to access the information encoded in mathematical formulae requires machine-readable formats that can represent both the presentation and content, i.e., the semantics, of formulae. Exchanging such information between systems additionally requires conversion methods for mathematical representation formats. We analyze how the semantic enrichment of formulae improves the format conversion process and show that considering the textual context of formulae reduces the error rate of such conversions. Our main contributions are: (1) providing an openly available benchmark dataset for the mathematical format conversion task consisting of a newly created test collection, an extensive, manually curated gold standard and task-specific evaluation metrics; (2) performing a quantitative evaluation of state-of-the-art tools for mathematical format conversions; (3) presenting a new approach that considers the textual context of formulae to reduce the error rate for mathematical format conversions. Our benchmark dataset facilitates future research on mathematical format conversions as well as research on many problems in mathematical information retrieval. Because we annotated and linked all components of formulae, e.g., identifiers, operators and other entities, to Wikidata entries, the gold standard can, for instance, be used to train methods for formula concept discovery and recognition. Such methods can then be applied to improve mathematical information retrieval systems, e.g., for semantic formula search, recommendation of mathematical content, or detection of mathematical plagiarism.</dcterms:abstract>
    <dc:creator>Cohl, Howard S.</dc:creator>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2018-09-17T13:53:57Z</dc:date>
    <dc:creator>Schubotz, Moritz</dc:creator>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/43286/1/Schubotz_2-jm8xs95avu406.pdf"/>
    <dc:language>eng</dc:language>
    <dc:contributor>Cohl, Howard S.</dc:contributor>
    <dc:rights>terms-of-use</dc:rights>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/43286/1/Schubotz_2-jm8xs95avu406.pdf"/>
    <dcterms:title>Improving the Representation and Conversion of Mathematical Formulae by Considering their Textual Context</dcterms:title>
    <dc:contributor>Schubotz, Moritz</dc:contributor>
    <dc:creator>Meuschke, Norman</dc:creator>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2018-09-17T13:53:57Z</dcterms:available>
    <dc:contributor>Greiner-Petter, André</dc:contributor>
    <dc:contributor>Meuschke, Norman</dc:contributor>
  </rdf:Description>
</rdf:RDF>
kops.conferencefield: 18th ACM/IEEE on Joint Conference on Digital Libraries, Jun 3, 2018 - Jun 7, 2018, Fort Worth, USA
kops.date.conferenceEnd: 2018-06-07 [eng]
kops.date.conferenceStart: 2018-06-03 [eng]
kops.description.openAccess: openaccessgreen
kops.flag.knbibliography: true
kops.identifier.nbn: urn:nbn:de:bsz:352-2-jm8xs95avu406
kops.location.conference: Fort Worth, USA [eng]
kops.sourcefield: CHEN, Jiangping, ed. and others. <i>Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries - JCDL '18</i>. New York: ACM Press, 2018, pp. 233-242. ISBN 978-1-4503-5178-2. Available under: doi: 10.1145/3197026.3197058 [deu]
kops.sourcefield.plain: CHEN, Jiangping, ed. and others. Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries - JCDL '18. New York: ACM Press, 2018, pp. 233-242. ISBN 978-1-4503-5178-2. Available under: doi: 10.1145/3197026.3197058 [eng]
kops.title.conference: 18th ACM/IEEE on Joint Conference on Digital Libraries [eng]
relation.isAuthorOfPublication: 63951a1b-b477-40b3-acbb-e5d19d711255
relation.isAuthorOfPublication: 077de6dd-ae51-4cbf-9ae7-169d00444835
relation.isAuthorOfPublication: a686b647-9b67-4b55-ad38-d21312e357ca
relation.isAuthorOfPublication: e3f81adb-a670-4c4c-bade-6781b8f996b0
relation.isAuthorOfPublication: 358ad52f-dab7-4582-bf8e-8adcf477a2d4
relation.isAuthorOfPublication.latestForDiscovery: 63951a1b-b477-40b3-acbb-e5d19d711255
source.bibliographicInfo.fromPage: 233 [eng]
source.bibliographicInfo.toPage: 242 [eng]
source.contributor.editor: Chen, Jiangping
source.flag.etalEditor: true [eng]
source.identifier.isbn: 978-1-4503-5178-2 [eng]
source.publisher: ACM Press [eng]
source.publisher.location: New York [eng]
source.title: Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries - JCDL '18 [eng]

Files

Original bundle

Name: Schubotz_2-jm8xs95avu406.pdf
Size: 600.03 KB
Format: Adobe Portable Document Format
Description: Schubotz_2-jm8xs95avu406.pdf
Downloads: 372