Publication:

Automated identification of bias inducing words in news articles using linguistic and context-oriented features

Files

Spinde_2-5futcux6vkvc4.pdf (Size: 1.3 MB, Downloads: 577)

Date

2021

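Authors

Spinde, Timo; Rudnitckaia, Lada; Mitrović, Jelena; Hamborg, Felix; Granitzer, Michael; Gipp, Bela; Donnay, Karsten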


Open access publication
Open Access Hybrid

Publication type
Journal article
Publication status
Published

Published in

Information Processing & Management. Elsevier. 2021, 58(3), 102505. ISSN 0306-4573. eISSN 1873-5371. Available under: doi: 10.1016/j.ipm.2021.102505

Abstract

Media has a substantial impact on public perception of events, and, accordingly, the way media presents events can potentially alter the beliefs and views of the public. One of the ways in which bias in news articles can be introduced is by altering word choice. Such a form of bias is very challenging to identify automatically due to the high context-dependence and the lack of a large-scale gold-standard data set. In this paper, we present a prototypical yet robust and diverse data set for media bias research. It consists of 1,700 statements representing various media bias instances and contains labels for media bias identification on the word and sentence level. In contrast to existing research, our data incorporate background information on the participants’ demographics, political ideology, and their opinion about media in general. Based on our data, we also present a way to detect bias-inducing words in news articles automatically. Our approach is feature-oriented, which provides a strong descriptive and explanatory power compared to deep learning techniques. We identify and engineer various linguistic, lexical, and syntactic features that can potentially be media bias indicators. Our resource collection is the most complete within the media bias research area to the best of our knowledge. We evaluate all of our features in various combinations and retrieve their possible importance both for future research and for the task in general. We also evaluate various possible Machine Learning approaches with all of our features. XGBoost, a decision tree implementation, yields the best results. Our approach achieves an F1-score of 0.43, a precision of 0.29, a recall of 0.77, and a ROC AUC of 0.79, which outperforms current media bias detection methods based on features. We propose future improvements, discuss the perspectives of the feature-based approach and a combination of neural networks and deep learning with our current system.
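
The abstract reports precision, recall, and F1 together. As a quick consistency check, the sketch below recomputes F1 as the harmonic mean of the rounded precision and recall quoted above. The function name and variables are illustrative and not taken from the paper, and the abstract does not say how the scores were aggregated, so this is only an approximation based on the rounded figures.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values as quoted in the abstract (two decimal places).
reported_precision = 0.29
reported_recall = 0.77

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(f"{f1:.3f}")  # prints 0.421
```

The recomputed value of about 0.42 is close to the reported 0.43. The small gap is compatible with rounding of the inputs, and could also arise if the published scores were averaged over several evaluation runs (an assumption, since the abstract gives no detail).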

Subject (DDC)
004 Computer science

Keywords

Media bias, Feature engineering, Text analysis, Context analysis, News analysis, Bias data set


Cite

ISO 690
SPINDE, Timo, Lada RUDNITCKAIA, Jelena MITROVIĆ, Felix HAMBORG, Michael GRANITZER, Bela GIPP, Karsten DONNAY, 2021. Automated identification of bias inducing words in news articles using linguistic and context-oriented features. In: Information Processing & Management. Elsevier. 2021, 58(3), 102505. ISSN 0306-4573. eISSN 1873-5371. Available under: doi: 10.1016/j.ipm.2021.102505
BibTeX
@article{Spinde2021-05Autom-52980,
  year={2021},
  doi={10.1016/j.ipm.2021.102505},
  title={Automated identification of bias inducing words in news articles using linguistic and context-oriented features},
  number={3},
  volume={58},
  issn={0306-4573},
  journal={Information Processing \& Management},
  author={Spinde, Timo and Rudnitckaia, Lada and Mitrović, Jelena and Hamborg, Felix and Granitzer, Michael and Gipp, Bela and Donnay, Karsten},
  note={Article Number: 102505}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/52980">
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/52980/1/Spinde_2-5futcux6vkvc4.pdf"/>
    <dc:contributor>Donnay, Karsten</dc:contributor>
    <dcterms:title>Automated identification of bias inducing words in news articles using linguistic and context-oriented features</dcterms:title>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/52980/1/Spinde_2-5futcux6vkvc4.pdf"/>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:creator>Donnay, Karsten</dc:creator>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/52980"/>
    <dc:contributor>Granitzer, Michael</dc:contributor>
    <dc:creator>Hamborg, Felix</dc:creator>
    <dc:contributor>Hamborg, Felix</dc:contributor>
    <dc:creator>Granitzer, Michael</dc:creator>
    <dc:contributor>Mitrović, Jelena</dc:contributor>
    <dc:contributor>Gipp, Bela</dc:contributor>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dc:contributor>Spinde, Timo</dc:contributor>
    <dcterms:rights rdf:resource="http://creativecommons.org/licenses/by-nc-nd/4.0/"/>
    <dc:creator>Spinde, Timo</dc:creator>
    <dcterms:abstract xml:lang="eng">Media has a substantial impact on public perception of events, and, accordingly, the way media presents events can potentially alter the beliefs and views of the public. One of the ways in which bias in news articles can be introduced is by altering word choice. Such a form of bias is very challenging to identify automatically due to the high context-dependence and the lack of a large-scale gold-standard data set. In this paper, we present a prototypical yet robust and diverse data set for media bias research. It consists of 1,700 statements representing various media bias instances and contains labels for media bias identification on the word and sentence level. In contrast to existing research, our data incorporate background information on the participants’ demographics, political ideology, and their opinion about media in general. Based on our data, we also present a way to detect bias-inducing words in news articles automatically. Our approach is feature-oriented, which provides a strong descriptive and explanatory power compared to deep learning techniques. We identify and engineer various linguistic, lexical, and syntactic features that can potentially be media bias indicators. Our resource collection is the most complete within the media bias research area to the best of our knowledge. We evaluate all of our features in various combinations and retrieve their possible importance both for future research and for the task in general. We also evaluate various possible Machine Learning approaches with all of our features. XGBoost, a decision tree implementation, yields the best results. Our approach achieves an F&lt;sub&gt;1&lt;/sub&gt;-score of 0.43, a precision of 0.29, a recall of 0.77, and a ROC AUC of 0.79, which outperforms current media bias detection methods based on features. 
We propose future improvements, discuss the perspectives of the feature-based approach and a combination of neural networks and deep learning with our current system.</dcterms:abstract>
    <dc:contributor>Rudnitckaia, Lada</dc:contributor>
    <dc:creator>Mitrović, Jelena</dc:creator>
    <dc:creator>Gipp, Bela</dc:creator>
    <dc:language>eng</dc:language>
    <dc:rights>Attribution-NonCommercial-NoDerivatives 4.0 International</dc:rights>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2021-02-24T13:40:37Z</dcterms:available>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2021-02-24T13:40:37Z</dc:date>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dcterms:issued>2021-05</dcterms:issued>
    <dc:creator>Rudnitckaia, Lada</dc:creator>
  </rdf:Description>
</rdf:RDF>

University bibliography
No
Peer reviewed
Unknown