Profiling News Media for Factuality and Bias Using LLMs and the Fact-Checking Methodology of Human Experts

Zain Muhammad Mujahid, Dilshod Azizov, Maha Tufail Agro, Preslav Nakov

📰 Abstract

In an age characterized by the proliferation of mis- and disinformation online, it is critical to empower readers to understand the content they are reading. Important efforts in this direction rely on manual or automatic fact-checking, which can be challenging for emerging claims with limited information. Such scenarios can be handled by assessing the reliability and the political bias of the source of the claim, i.e., characterizing entire news outlets rather than individual claims or articles. This is an important but understudied research direction. While prior work has looked into linguistic and social contexts, we do not analyze individual articles or information in social media. Instead, we propose a novel methodology that mimics the criteria that professional fact-checkers use to assess the factuality and political bias of an entire outlet. Specifically, we design a variety of prompts based on these criteria, and we elicit responses from large language models (LLMs), which we aggregate to make predictions. In addition to demonstrating sizable improvements over strong baselines via extensive experiments with multiple LLMs, we provide an in-depth error analysis of the effect of media popularity and region on model performance. We further conduct an ablation study to highlight the key components of our dataset that contribute to these improvements.

GitHub, Paper

📊 Methodology

We adopt two main prompt strategies to elicit outlet-level insights from LLMs:

Handcrafted Prompts

  • 18 manually curated prompts (sketched below) covering:
    • Stance on public figures/topics
    • Stance on current popular issues
    • Trustworthiness & factuality
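
As a rough illustration, here is a minimal sketch of how such prompts could be instantiated per outlet. The templates, placeholder topics, and outlet name are hypothetical; the actual 18 prompts ship with this repository.

# Hypothetical templates in the spirit of the handcrafted prompts;
# the real 18 prompts are provided in this repository.
HANDCRAFTED_TEMPLATES = [
    "What is the stance of {outlet} on {figure}?",                # public figures/topics
    "How does {outlet} cover {issue}?",                           # current popular issues
    "How trustworthy and factual is the reporting of {outlet}?",  # trustworthiness & factuality
]

def build_handcrafted_prompts(outlet, figure="a prominent politician",
                              issue="climate change"):
    """Instantiate every template for a single media outlet."""
    # str.format ignores unused keyword arguments, so each template
    # picks up only the placeholders it actually contains.
    return [t.format(outlet=outlet, figure=figure, issue=issue)
            for t in HANDCRAFTED_TEMPLATES]

for prompt in build_handcrafted_prompts("example-news.com"):
    print(prompt)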

Systematic Prompts Based on Expert Criteria

  • We mimic the methodology that fact-checking journalists employ, covering 16 policy areas (sketched below).
  • For each policy area, the LLM is asked to indicate a left/right leaning and to justify it.
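
A minimal sketch of this setup, assuming a generic query_llm placeholder (the listed policy areas are an illustrative subset of the 16, and the template wording is hypothetical):

# Illustrative subset of policy areas; the paper covers 16 of them.
POLICY_AREAS = ["immigration", "healthcare", "gun control", "taxation"]

SYSTEMATIC_TEMPLATE = (
    "Based on the reporting of {outlet} on {area}, does the outlet lean "
    "left or right on this policy area? Answer 'left' or 'right' and "
    "briefly explain your reasoning."
)

def query_llm(prompt):
    """Placeholder: route the prompt to whichever LLM is being profiled."""
    raise NotImplementedError

def profile_outlet(outlet):
    """Collect one leaning-plus-reasoning response per policy area."""
    prompts = [SYSTEMATIC_TEMPLATE.format(outlet=outlet, area=area)
               for area in POLICY_AREAS]
    return [query_llm(p) for p in prompts]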

These LLM responses are concatenated and passed to text classification models. We also present two case studies where we obtain zero-shot predictions from the LLMs by providing the media name and some of its recently published articles.
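
As a stand-in for that classification step, here is a minimal sketch using a TF-IDF + logistic regression pipeline; the paper's actual classifiers may differ, and the training examples below are invented:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Each training example is the concatenation of one outlet's LLM responses;
# labels are the gold annotations (here: political bias).
train_texts = [
    "leans left on immigration ... leans left on healthcare ...",
    "leans right on taxation ... leans right on gun control ...",
]
train_labels = ["left", "right"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(train_texts, train_labels)

# Predict for a new outlet from its concatenated responses.
new_outlet_text = " ".join(["leans right on immigration ...",
                            "leans right on healthcare ..."])
print(clf.predict([new_outlet_text])[0])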

🗃️ Dataset

  • 4,192 media outlets annotated for factuality of reporting
  • 3,649 outlets annotated for political bias
  • Factuality: low, mixed, high
  • Bias: left, left-center, center, right-center, right (see the label sketch below)
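
Since the results report both 3- and 5-class bias, here is a small sketch of the label schemes with one plausible 5-to-3-class collapse; the exact mapping used in the paper is an assumption here and may differ:

FACTUALITY_LABELS = ["low", "mixed", "high"]
BIAS_5 = ["left", "left-center", "center", "right-center", "right"]

# ASSUMPTION: one plausible collapse to 3 classes; check the paper/repo
# for the mapping actually used.
BIAS_5_TO_3 = {
    "left": "left", "left-center": "left",
    "center": "center",
    "right-center": "right", "right": "right",
}
assert set(BIAS_5_TO_3) == set(BIAS_5)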

📈 Results

Factuality Prediction

Evaluation results on our dataset for predicting the factuality of news media reporting, grouped by modeling methodology.

Political Bias Prediction

Evaluation results for the political bias prediction task (3- and 5-class), grouped by modeling methodology. Models marked with $\dagger$ were trained on data from the systematic prompts.

Impact of Media Popularity

Best model performance vs. media outlet popularity: (a) political bias labels and (b) factuality labels, plotted against Alexa Rank (log scale). Each point represents a media outlet with its original label; green markers indicate correct predictions, and red markers indicate errors. A lower Alexa Rank means a more popular outlet.

Correct vs. incorrect predictions for U.S. and non-U.S. media outlets, highlighting higher accuracy for U.S.-based outlets.

📌 Citation

Please cite us if you use our data or methodology.

@misc{mujahid2025profilingnewsmediafactuality,
      title={Profiling News Media for Factuality and Bias Using LLMs and the Fact-Checking Methodology of Human Experts}, 
      author={Zain Muhammad Mujahid and Dilshod Azizov and Maha Tufail Agro and Preslav Nakov},
      year={2025},
      eprint={2506.12552},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2506.12552}, 
}
