Hamborg, Felix

Last name: Hamborg
First name: Felix

Publication search results

Showing 1 - 10 of 32

Towards Automated Frame Analysis : Natural Language Processing Techniques to Reveal Media Bias in News Articles

2022, Hamborg, Felix

News articles serve as a highly relevant source for individuals to inform themselves on current topics and salient political issues. How the news covers an issue decisively affects public opinion and our collective decision-making. Although the news is meant not only to communicate "objective facts" but also to assess events and their implications, biased coverage can be problematic. Especially when news consumers are not aware of the often subtle yet powerful slants present in the news, or when coverage is systematically slanted to alter public opinion, media bias poses a severe problem to society. Empowering newsreaders to critically assess the news is an essential means to face the issues caused by media bias. On the one hand, non-technical means such as media literacy practices or analysis approaches devised in political science are highly effective. However, they often require immense effort, such as researching and contrasting relevant news articles. Ultimately, this effort can represent an insurmountable barrier for these manual techniques to be applied during daily news consumption. On the other hand, automated data analysis methods are available and could enable timely bias analysis. However, automated approaches largely neglect the sophisticated models and analysis approaches devised in decade-long bias research in the social sciences. Compared to them, the automated approaches often yield superficial or inconclusive results. To enable effective and efficient bias identification, the thesis at hand proposes an interdisciplinary approach to reveal biases in English news articles reporting on a given political event. To this end, the approach identifies the coverage's different perspectives on the event. The approach's so-called person-oriented frames represent how articles portray the persons involved in the event. In contrast to prior automated approaches, the identified frames are meaningful and substantially present in the news coverage.
In particular, this thesis makes the following research contributions. The thesis presents the first interdisciplinary literature review on approaches for analyzing media bias, thereby contrasting studies and models from the social sciences with automated approaches such as those devised in computer science. A key finding is that research in either discipline could benefit from integrating the other's expertise and methods. To facilitate such interdisciplinary research, the thesis establishes a shared conceptual understanding by mapping the state of the art from the social sciences to a framework that automated approaches can target. To address the weaknesses of prior work, the thesis then proposes person-oriented framing analysis (PFA). The approach integrates methodology that has been applied in practice in social science research and identifies specific in-text means of narrowly defined bias forms. In contrast to prior automated approaches, which tend to treat bias as a holistic, vague concept, PFA detects article groups representing meaningful frames. Such frames could previously only be identified through manual content analysis or expert knowledge on the analyzed topic. Afterward, the thesis proposes methods for the PFA approach, investigates their suitability concerning PFA, and evaluates their technical effectiveness. For example, the thesis introduces the first method to classify sentiment in news articles. The thesis also lays out essential preparatory work for other tasks. For example, a method is proposed that resolves highly event-specific coreferences, which may even be of contradictory meanings in other contexts, such as "freedom fighters" and "terrorists." To demonstrate the effectiveness of the PFA approach, a prototype system to reveal person-oriented framing in event coverage is presented and evaluated. The results of a user study (n=160) demonstrate the effectiveness of the interdisciplinary approach devised in this thesis.
In the study's single-blind setting, the PFA approach is most effective in increasing respondents' bias-awareness. Moreover, the study results confirm the findings of the literature review. They suggest that prior bias identification and communication approaches identify biases that are technically significant but often meaningfully irrelevant. In practical terms, prior work facilitates the visibility of potential biases, whereas the PFA approach identifies meaningful biases indeed present in the coverage. This thesis is motivated by my vision to mitigate media bias's severely adverse effects on societies. Outside the academic context, this vision entails that popular news aggregators and news apps will integrate effective approaches for bias identification, such as PFA, to help news readers critically assess news coverage in a practical, effortless way during daily news consumption.

Media Bias in German News Articles : A Combined Approach

2021-02-02, Spinde, Timo, Hamborg, Felix, Gipp, Bela

Slanted news coverage, also called media bias, can heavily influence how news consumers interpret and react to the news. Models to identify and describe biases have been proposed across various scientific fields, focusing mostly on English media. In this paper, we propose a method for analyzing media bias in German media. We test different natural language processing techniques and combinations thereof. Specifically, we combine an IDF-based component, a specially created bias lexicon, and a linguistic lexicon. We also flexibly extend our lexica using word embeddings. We evaluate the system and methods in a survey (N = 46), comparing the bias words our system detected to human annotations. So far, the best component combination achieves an F1 score of 0.31 when the words our system identified as biased are compared with those identified by our study participants. The low performance shows that the analysis of media bias is still a difficult task, but using fewer resources, we achieved the same performance on the same task as recent research on English. We summarize the next steps in improving the resources and the overall results.
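
The abstract does not spell out how the IDF-based component and the lexica are combined. As a rough illustration only, the following Python sketch computes inverse document frequency over a toy corpus and flags a word as a bias candidate if it appears in a lexicon or has a high IDF. The function names, the threshold, the toy sentences, the tiny lexicon, and the OR-combination are all assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def inverse_document_frequency(documents):
    """Return {term: idf} with idf = log(N / df) over whitespace tokens."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        # Count each term once per document.
        doc_freq.update(set(doc.lower().split()))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def candidate_bias_words(sentence, idf, lexicon, idf_threshold):
    """Flag tokens that are in the lexicon or whose IDF reaches the threshold.

    Hypothetical combination rule: tokens unseen in the corpus get IDF 0.0
    here, so only the lexicon can flag them.
    """
    flagged = []
    for token in sentence.lower().split():
        if token in lexicon or idf.get(token, 0.0) >= idf_threshold:
            flagged.append(token)
    return flagged


# Toy German corpus and lexicon (invented for illustration).
corpus = [
    "die regierung plant reform",
    "die regierung plant steuer",
    "die opposition nennt reform katastrophal",
]
bias_lexicon = {"skandal"}

idf = inverse_document_frequency(corpus)
flagged = candidate_bias_words(
    "die reform ist katastrophal und ein skandal",
    idf,
    bias_lexicon,
    idf_threshold=1.0,
)
print(flagged)
```

In this toy setting, frequent words such as "die" receive an IDF of zero and are never flagged, while rare or lexicon-listed words are. A real system would need proper tokenization and a much larger reference corpus.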

Newsalyze : Effective Communication of Person-Targeting Biases in News Articles

2021, Hamborg, Felix, Heinser, Kim, Zhukova, Anastasia, Donnay, Karsten, Gipp, Bela

Media bias and its extreme form, fake news, can decisively affect public opinion. Especially when reporting on policy issues, slanted news coverage may strongly influence societal decisions, e.g., in democratic elections. Our paper makes three contributions to address this issue. First, we present a system for bias identification, which combines state-of-the-art methods from natural language understanding. Second, we devise bias-sensitive visualizations to communicate bias in news articles to non-expert news consumers. Third, our main contribution is a large-scale user study that measures bias-awareness in a setting that approximates daily news consumption, e.g., we present respondents with a news overview and individual articles. We not only measure the visualizations' effect on respondents' bias-awareness, but we can also pinpoint the effects on individual components of the visualizations by employing a conjoint design. Our bias-sensitive overviews strongly and significantly increase bias-awareness in respondents. Our study further suggests that our content-driven identification method detects groups of similarly slanted news articles due to substantial biases present in individual news articles. In contrast, the reviewed prior work rather only facilitates the visibility of biases, e.g., by distinguishing left- and right-wing outlets.

The POLUSA Dataset : 0.9M Political News Articles Balanced by Time and Outlet Popularity

2020-05-27, Gebhard, Lukas, Hamborg, Felix

News articles covering policy issues are an essential source of information in the social sciences and are also frequently used for other use cases, e.g., to train NLP language models. To derive meaningful insights from the analysis of news, large datasets are required that represent real-world distributions, e.g., with respect to the popularity of the included outlets, topics, or time. Information on the political leanings of media publishers is often needed, e.g., to study differences in news reporting across the political spectrum, which is one of the prime use cases in the social sciences when studying media bias and related societal issues. Concerning these requirements, existing datasets have major flaws, resulting in redundant and cumbersome effort in the research community for dataset creation. To fill this gap, we present POLUSA, a dataset that represents the online media landscape as perceived by an average US news consumer. The dataset contains 0.9M articles covering policy topics published between Jan. 2017 and Aug. 2019 by 18 news outlets representing the political spectrum. Each outlet is labeled by its political leaning, which we derive using a systematic aggregation of eight data sources. The news dataset is balanced with respect to publication date and outlet popularity. POLUSA enables studying a variety of subjects, e.g., media effects and political partisanship. Due to its size, the dataset allows the use of data-intensive deep learning methods.

XCoref: Cross-document Coreference Resolution in the Wild

2022, Zhukova, Anastasia, Hamborg, Felix, Donnay, Karsten, Gipp, Bela

Datasets and methods for cross-document coreference resolution (CDCR) focus on events or entities with strict coreference relations. However, they do not annotate or resolve coreferential mentions with more abstract or loose relations that may occur when news articles report on controversial and polarized events. Bridging and loose coreference relations trigger associations that may expose news readers to bias by word choice and labeling. For example, coreferential mentions of “direct talks between U.S. President Donald Trump and Kim” such as “an extraordinary meeting following months of heated rhetoric” or “great chance to solve a world problem” form a more positive perception of this event. A step towards bringing awareness of bias by word choice and labeling is the reliable resolution of coreferences with high lexical diversity. We propose an unsupervised method named XCoref, which is a CDCR method that capably resolves not only previously prevalent entities, such as persons, e.g., “Donald Trump,” but also abstractly defined concepts, such as groups of persons, e.g., “caravan of immigrants,” and events and actions, e.g., “marching to the U.S. border.” In an extensive evaluation, we compare the proposed XCoref to a state-of-the-art CDCR method and a previous method, TCA, that resolves such complex coreference relations and find that XCoref outperforms these methods. Outperforming an established CDCR model shows that new CDCR models need to be evaluated on semantically complex mentions with looser coreference relations to indicate their applicability to resolving mentions in the “wild” of political news articles.

Do You Think It's Biased? : How To Ask For The Perception Of Media Bias

2021, Spinde, Timo, Kreuter, Christina, Gaissmaier, Wolfgang, Hamborg, Felix, Gipp, Bela, Giese, Helge

Media coverage has a substantial effect on the public perception of events. The way media frames events can significantly alter the beliefs and perceptions of our society. Nevertheless, nearly all media outlets are known to report news in a biased way. While such bias can be introduced by altering the word choice or omitting information, the perception of bias also varies largely depending on a reader's personal background. Therefore, media bias is a very complex construct to identify and analyze. Even though media bias has been the subject of many studies, previous assessment strategies are oversimplified and lack overlap and empirical evaluation. Thus, this study aims to develop a scale that can be used as a reliable standard to evaluate article bias. For example, to measure bias in a news article, should we ask, “How biased is the article?” or should we instead ask, “How did the article treat the American president?” We conducted a literature search to find 824 relevant questions about text perception in previous research on the topic. In a multi-iterative process, we summarized and condensed these questions semantically to arrive at a complete and representative set of possible question types about bias. The final set consisted of 25 questions with varying answering formats, 17 questions using semantic differentials, and six ratings of feelings. We tested each of the questions on 190 articles with overall 663 participants to identify how well the questions measure an article's perceived bias. Our results show that 21 final items are suitable and reliable for measuring the perception of media bias. We publish the final set of questions on http://bias-guestion-tree.gipplab.org/.

ANEA: Automated (Named) Entity Annotation for German Domain-Specific Texts

2021, Zhukova, Anastasia, Hamborg, Felix, Gipp, Bela

Named entity recognition (NER) is an important task that aims to resolve universal categories of named entities, e.g., persons, locations, organizations, and times. Despite its common and viable use in many applications, NER is barely applicable in domains where general categories are suboptimal, such as engineering or medicine. To facilitate NER of domain-specific types, we propose ANEA, an automated (named) entity annotator to assist human annotators in creating domain-specific NER corpora for German text collections when given a set of domain-specific texts. In our evaluation, we find that ANEA automatically identifies terms that best represent the texts’ content, identifies groups of coherent terms, and extracts and assigns descriptive labels to these groups, i.e., annotates text datasets with domain-specific (named) entities.

Automated identification of bias inducing words in news articles using linguistic and context-oriented features

2021-05, Spinde, Timo, Rudnitckaia, Lada, Mitrović, Jelena, Hamborg, Felix, Granitzer, Michael, Gipp, Bela, Donnay, Karsten

Media has a substantial impact on public perception of events, and, accordingly, the way media presents events can potentially alter the beliefs and views of the public. One of the ways in which bias in news articles can be introduced is by altering word choice. Such a form of bias is very challenging to identify automatically due to the high context-dependence and the lack of a large-scale gold-standard data set. In this paper, we present a prototypical yet robust and diverse data set for media bias research. It consists of 1,700 statements representing various media bias instances and contains labels for media bias identification on the word and sentence level. In contrast to existing research, our data incorporate background information on the participants’ demographics, political ideology, and their opinion about media in general. Based on our data, we also present a way to detect bias-inducing words in news articles automatically. Our approach is feature-oriented, which provides a strong descriptive and explanatory power compared to deep learning techniques. We identify and engineer various linguistic, lexical, and syntactic features that can potentially be media bias indicators. Our resource collection is the most complete within the media bias research area to the best of our knowledge. We evaluate all of our features in various combinations and assess their potential importance both for future research and for the task in general. We also evaluate various possible machine learning approaches with all of our features. XGBoost, a gradient-boosted decision tree implementation, yields the best results. Our approach achieves an F1-score of 0.43, a precision of 0.29, a recall of 0.77, and a ROC AUC of 0.79, which outperforms current media bias detection methods based on features. We propose future improvements, discuss the perspectives of the feature-based approach and a combination of neural networks and deep learning with our current system.

NewsMTSC : A Dataset for (Multi-)Target-dependent Sentiment Classification in Political News Articles

2021, Hamborg, Felix, Donnay, Karsten

Previous research on target-dependent sentiment classification (TSC) has mostly focused on reviews, social media, and other domains where authors tend to express sentiment explicitly. In this paper, we investigate TSC in news articles, a much less researched TSC domain despite the importance of news as an essential information source in individual and societal decision making. We introduce NewsMTSC, a high-quality dataset for TSC on news articles with key differences compared to established TSC datasets, including, for example, different means to express sentiment, longer texts, and a second test-set to measure the influence of multi-target sentences. We also propose a model that uses a BiGRU to interact with multiple embeddings, e.g., from a language model and external knowledge sources. The proposed model improves the performance of the prior state-of-the-art from F1_m=81.7 to 83.1 (real-world sentiment distribution) and from F1_m=81.2 to 82.5 (multi-target sentences).
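
The abstract reports F1_m without defining it. Assuming it denotes the macro-averaged F1 score, i.e., the unweighted mean of the per-class F1 scores, the following minimal Python sketch shows how such a value could be computed for a three-class sentiment task. The function name, the class labels, and the toy predictions are hypothetical and serve only to illustrate the metric.

```python
def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 over all labels seen in gold or predicted."""
    labels = sorted(set(gold) | set(predicted))
    pairs = list(zip(gold, predicted))
    scores = []
    for label in labels:
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            # A class that is never predicted correctly contributes 0.
            scores.append(0.0)
    return sum(scores) / len(scores)


# Toy example (invented labels): one "negative" target is misclassified as "neutral".
gold = ["positive", "negative", "neutral", "neutral"]
predicted = ["positive", "neutral", "neutral", "neutral"]
score = macro_f1(gold, predicted)
print(round(score, 3))
```

Because every class counts equally, a rare class that is never recognized pulls the macro average down sharply, which matters when sentiment classes are imbalanced.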

Bias-aware news analysis using matrix-based news aggregation

2020-06, Hamborg, Felix, Meuschke, Norman, Gipp, Bela

Media bias describes differences in the content or presentation of news. It is a ubiquitous phenomenon in news coverage that can have severely negative effects on individuals and society. Identifying media bias is a challenging problem, for which current information systems offer little support. News aggregators are the most important class of systems to support users in coping with the large amount of news that is published nowadays. These systems focus on identifying and presenting important, common information in news articles, but do not reveal different perspectives on the same topic. Due to this analysis approach, current news aggregators cannot effectively reveal media bias. To address this problem, we present matrix-based news aggregation, a novel approach for news exploration that helps users gain a broad and diverse news understanding by presenting various perspectives on the same news topic. Additionally, we present NewsBird, an open-source news aggregator that implements matrix-based news aggregation for international news topics. The results of a user study showed that NewsBird more effectively broadens the user’s news understanding than the list-based visualization approach employed by established news aggregators, while achieving comparable effectiveness and efficiency for the two main use cases of news consumption: getting an overview of and finding details on current news topics.