Dataset: Wikipedia Edit Event Data 2021 (WikiEvent.2021)
Date of first publication
Authors
Other contributors
Repository of first publication
Dataset version
DOI (link to the data)
Link to the license
Research funding information
Project
Core Facility of the University of Konstanz
Title in another language
Publication status
Abstract
The "Wikipedia Edit Event Data 2021 (WikiEvent.2021)" gives the time, user name, and article title of every edit that any registered and logged-in Wikipedia user performed on any article in the English-language edition of Wikipedia from January 15th, 2001 (the launch of Wikipedia) to January 2021. This dataset extends the older version WikiEvent.2018 (https://zenodo.org/record/1626323). The edit event data has been extracted from the file 'enwiki-20210101-stub-meta-history.xml.gz', which was at that time linked from 'https://dumps.wikimedia.org/enwiki/20210101/'. These files get deleted some months after data collection; however, the information is still available in any file 'enwiki-<date>-stub-meta-history.xml.gz' where <date> is 20210101 or later. These data are provided by the Wikimedia Foundation and are licensed under the GNU Free Documentation License (GFDL) and the Creative Commons Attribution-Share-Alike 3.0 License. The Wikipedia Edit Event Data 2021 consists of the file 'WikiEvent.2021.csv', which contains a table with three columns and more than 450 million rows in CSV format. The cell delimiter is a semicolon (';') and strings are enclosed in double quotes ('"'). The first row is a header, and the three columns are labeled 'time', 'user', and 'article', respectively. The uncompressed size of the file is about 23 GB.
How to analyze the WikiEvent Data with relational event models is explained in the eventnet tutorial at: https://github.com/juergenlerner/eventnet/wiki/Large-event-networks-(tutorial).
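Because the uncompressed file is about 23 GB, it is usually more practical to stream it row by row than to load it into memory at once. The following is a minimal sketch of that approach, assuming only the layout described above (semicolon delimiter, double-quoted strings, header row with 'time', 'user', and 'article'). The function name `count_edits_per_user` and the sample rows are hypothetical, and the time values are placeholders because the description does not specify the time format.

```python
import csv
import io
from collections import Counter


def count_edits_per_user(lines, top_n=3):
    """Stream CSV lines and return the top_n (user, edit count) pairs.

    `lines` is any iterable of text lines, e.g. an open file object.
    Only one row is held in memory at a time, plus the per-user counters.
    """
    reader = csv.DictReader(lines, delimiter=";", quotechar='"')
    counts = Counter()
    for row in reader:
        counts[row["user"]] += 1
    return counts.most_common(top_n)


# Made-up sample in the documented layout (not real data from the file).
# The second row shows that a semicolon inside a quoted string is not
# treated as a cell delimiter.
sample = io.StringIO(
    'time;user;article\n'
    '1;"Alice";"Konstanz"\n'
    '2;"Bob";"Lake; Constance"\n'
    '3;"Alice";"Wikipedia"\n'
)
top_users = count_edits_per_user(sample)
print(top_users)
```

To run this on the real file, one would pass an open file object instead of the sample, for example `open("WikiEvent.2021.csv", newline="", encoding="utf-8")`; the UTF-8 encoding is an assumption, as the description does not state it.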
Abstract in another language
Subject area (DDC)
Keywords
Cite
ISO 690
LERNER, Jürgen, 2021. Wikipedia Edit Event Data 2021 (WikiEvent.2021)
RDF
<rdf:RDF
xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:bibo="http://purl.org/ontology/bibo/"
xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:void="http://rdfs.org/ns/void#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#" >
<rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/73793">
<dc:language>eng</dc:language>
<dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/71925"/>
<dc:rights>Creative Commons Attribution 4.0 International</dc:rights>
<dcterms:title>Wikipedia Edit Event Data 2021 (WikiEvent.2021)</dcterms:title>
<bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/73793"/>
<dc:creator>Lerner, Jürgen</dc:creator>
<dcterms:abstract>The "Wikipedia Edit Event Data 2021 (WikiEvent.2021)" gives the time, user name, and article title of every edit that any registered and logged-in Wikipedia user performed on any article in the English-language edition of Wikipedia from January 15th, 2001 (the launch of Wikipedia) to January 2021.
This dataset extends the older version WikiEvent.2018 (https://zenodo.org/record/1626323).
The edit event data has been extracted from the file 'enwiki-20210101-stub-meta-history.xml.gz', which was at that time linked from 'https://dumps.wikimedia.org/enwiki/20210101/'. These files get deleted some months after data collection; however, the information is still available in any file 'enwiki-&lt;date&gt;-stub-meta-history.xml.gz' where &lt;date&gt; is 20210101 or later. These data are provided by the Wikimedia Foundation and are licensed under the GNU Free Documentation License (GFDL) and the Creative Commons Attribution-Share-Alike 3.0 License. The Wikipedia Edit Event Data 2021 consists of the file 'WikiEvent.2021.csv', which contains a table with three columns and more than 450 million rows in CSV format. The cell delimiter is a semicolon (';') and strings are enclosed in double quotes ('"'). The first row is a header, and the three columns are labeled 'time', 'user', and 'article', respectively. The uncompressed size of the file is about 23 GB.
How to analyze the WikiEvent Data with relational event models is explained in the eventnet tutorial at: https://github.com/juergenlerner/eventnet/wiki/Large-event-networks-(tutorial).</dcterms:abstract>
<dcterms:rights rdf:resource="https://creativecommons.org/licenses/by/4.0/legalcode"/>
<dcterms:issued>2021</dcterms:issued>
<dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2025-07-03T10:45:46Z</dcterms:available>
<void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
<dcterms:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2021-02-09T09:28:05Z</dcterms:created>
<dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2025-07-03T10:45:46Z</dc:date>
<foaf:homepage rdf:resource="http://localhost:8080/"/>
<dc:contributor>Lerner, Jürgen</dc:contributor>
<dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/71925"/>
</rdf:Description>
</rdf:RDF>
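The RDF/XML record can be read with any namespace-aware XML parser, provided it is well-formed. As a rough sketch, the following uses Python's standard `xml.etree.ElementTree` on an abridged excerpt of the record (title, creator, license, and issue year only). The helper name `summarize` and the constant `RECORD` are hypothetical, and the namespace prefixes are taken from the record's own `xmlns` declarations.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as declared on the rdf:RDF element of the record.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Abridged excerpt of the record; the full record has more properties.
RECORD = """<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/73793">
    <dcterms:title>Wikipedia Edit Event Data 2021 (WikiEvent.2021)</dcterms:title>
    <dc:creator>Lerner, Jürgen</dc:creator>
    <dcterms:rights rdf:resource="https://creativecommons.org/licenses/by/4.0/legalcode"/>
    <dcterms:issued>2021</dcterms:issued>
  </rdf:Description>
</rdf:RDF>"""


def summarize(rdf_xml):
    """Return title, creator, issue year, and license URL of the first description."""
    root = ET.fromstring(rdf_xml)
    desc = root.find("rdf:Description", NS)
    rights = desc.find("dcterms:rights", NS)
    return {
        "title": desc.findtext("dcterms:title", namespaces=NS),
        "creator": desc.findtext("dc:creator", namespaces=NS),
        "issued": desc.findtext("dcterms:issued", namespaces=NS),
        # Attributes in a namespace are keyed as "{namespace-uri}localname".
        "license": rights.get("{%s}resource" % NS["rdf"]),
    }


info = summarize(RECORD)
print(info)
```

For queries that go beyond single property lookups, a dedicated RDF library would likely be a better fit than plain XML parsing; the standard-library approach is used here only to keep the sketch dependency-free.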