Improving the selection of news reports for event coding using ensemble classification

Lade...
Vorschaubild
Dateien
Croicu_0-309570.pdf
Croicu_0-309570.pdfGröße: 544.6 KBDownloads: 333
Datum
2015
Autor:innen
Croicu, Mihai
Herausgeber:innen
Kontakt
ISSN der Zeitschrift
Electronic ISSN
ISBN
Bibliografische Daten
Verlag
Schriftenreihe
Auflagebezeichnung
ArXiv-ID
Internationale Patentnummer
Angaben zur Forschungsförderung
Projekt
Sofja Kovalevskaja-Preis: The Web as a Curse or Blessing? Ethnic Mobilization in the Information Age
Open Access-Veröffentlichung
Open Access Gold
Core Facility der Universität Konstanz
Gesperrt bis
Titel in einer weiteren Sprache
Publikationstyp
Zeitschriftenartikel
Publikationsstatus
Published
Erschienen in
Research and Politics. 2015, 2(4). eISSN 2053-1680. Available under: doi: 10.1177/2053168015615596
Zusammenfassung

Manual coding of political events from news reports is extremely expensive and time-consuming, whereas completely automatic coding has limitations when it comes to the precision and granularity of the data collected. In this paper, we introduce an alternative strategy by establishing a semi-automatic pipeline, where an automatic classification system eliminates irrelevant source material before further coding is done by humans. Our pipeline relies on a high-performance supervised heterogeneous ensemble classifier working on extremely unbalanced training classes. Deployed to the Mass Mobilization on Autocracies database on protest, the system is able to reduce the number of source articles to be human-coded by more than half, while keeping over 90% of the relevant material.

Zusammenfassung in einer weiteren Sprache
Fachgebiet (DDC)
320 Politik
Schlagwörter
Konferenz
Rezension
undefined / . - undefined, undefined
Forschungsvorhaben
Organisationseinheiten
Zeitschriftenheft
Datensätze
Zitieren
ISO 690CROICU, Mihai, Nils B. WEIDMANN, 2015. Improving the selection of news reports for event coding using ensemble classification. In: Research and Politics. 2015, 2(4). eISSN 2053-1680. Available under: doi: 10.1177/2053168015615596
BibTex
@article{Croicu2015Impro-32815,
  year={2015},
  doi={10.1177/2053168015615596},
  title={Improving the selection of news reports for event coding using ensemble classification},
  number={4},
  volume={2},
  journal={Research and Politics},
  author={Croicu, Mihai and Weidmann, Nils B.}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/32815">
    <dc:creator>Croicu, Mihai</dc:creator>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2016-02-02T14:58:41Z</dc:date>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/42"/>
    <dcterms:abstract xml:lang="eng">Manual coding of political events from news reports is extremely expensive and time-consuming, whereas completely automatic coding has limitations when it comes to the precision and granularity of the data collected. In this paper, we introduce an alternative strategy by establishing a semi-automatic pipeline, where an automatic classification system eliminates irrelevant source material before further coding is done by humans. Our pipeline relies on a high-performance supervised heterogeneous ensemble classifier working on extremely unbalanced training classes. Deployed to the Mass Mobilization on Autocracies database on protest, the system is able to reduce the number of source articles to be human-coded by more than half, while keeping over 90% of the relevant material.</dcterms:abstract>
    <dc:creator>Weidmann, Nils B.</dc:creator>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/32815/3/Croicu_0-309570.pdf"/>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/42"/>
    <dcterms:issued>2015</dcterms:issued>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/32815/3/Croicu_0-309570.pdf"/>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:contributor>Croicu, Mihai</dc:contributor>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/52"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/52"/>
    <dcterms:title>Improving the selection of news reports for event coding using ensemble classification</dcterms:title>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/32815"/>
    <dc:rights>Attribution-NonCommercial 3.0 Unported</dc:rights>
    <dc:language>eng</dc:language>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2016-02-02T14:58:41Z</dcterms:available>
    <dc:contributor>Weidmann, Nils B.</dc:contributor>
    <dcterms:rights rdf:resource="http://creativecommons.org/licenses/by-nc/3.0/"/>
  </rdf:Description>
</rdf:RDF>
Interner Vermerk
xmlui.Submission.submit.DescribeStep.inputForms.label.kops_note_fromSubmitter
Kontakt
URL der Originalveröffentl.
Prüfdatum der URL
Prüfungsdatum der Dissertation
Finanzierungsart
Kommentar zur Publikation
Allianzlizenz
Corresponding Authors der Uni Konstanz vorhanden
Internationale Co-Autor:innen
Universitätsbibliographie
Ja
Begutachtet
Diese Publikation teilen