Addressing Systematic Non-response Bias with Supervised Fine-Tuning of Large Language Models: A Case Study on German Voting Behaviour
Authors: Tobias Holtdirk, Dennis Assenmacher, Arnim Bleier, Claudia Wagner
This repository contains the code for the paper.
Preprint DOI: 10.31219/osf.io/udz28
A major challenge for survey researchers is dealing with missing data, which restricts the scope of analysis and the reliability of inferences that can be drawn. Recently, researchers have started investigating the potential of Large Language Models (LLMs) to role-play a pre-defined set of "characters" and simulate their survey responses with little or no additional training data or cost. Previous research has mostly focused on zero-shot LLM predictions; however, other survey responses are often at least partially available. This work investigates the viability and robustness of supervised fine-tuning on these available responses to simulate answers affected by systematic and random item-level non-response, in the context of German voting behaviour. Our results show that, when systematic item non-responses are present, fine-tuned LLMs outperform traditional classification approaches on survey data. Fine-tuned LLMs also appear more robust to changes in the set of features available to the model for prediction. Finally, we find that fine-tuned LLMs match the performance of traditional classification methods when survey responses are missing completely at random.
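The exact prompt construction and fine-tuning setup are defined in the evaluation scripts. Purely as an illustration of the general idea (not the repository's actual format), a respondent's available answers might be serialized into a prompt/completion pair for supervised fine-tuning roughly as follows; all field names and question wording here are assumptions:

    # Illustrative sketch only: field names, question wording, and prompt layout
    # are assumptions, not the format used by the evaluation scripts.
    def respondent_to_example(respondent: dict) -> dict:
        """Serialize one respondent's known answers into a prompt/completion pair."""
        profile = "\n".join(
            f"- {question}: {answer}" for question, answer in respondent["features"].items()
        )
        prompt = (
            "You are a German survey respondent with the following profile:\n"
            f"{profile}\n"
            "Which party would you vote for?"
        )
        return {"prompt": prompt, "completion": respondent["vote"]}

    example = respondent_to_example(
        {
            "features": {"age": 45, "education": "university degree", "state": "Bavaria"},
            "vote": "SPD",
        }
    )
    print(example["prompt"])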
To set up the environment, install the uv package manager and run:
uv sync
Download the GLES 2017 dataset from GESIS (https://doi.org/10.4232/1.13648) and place it in the datasets folder.
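GESIS typically distributes survey data as Stata (.dta) or SPSS (.sav) exports. A minimal loading sketch with pandas, where the file name is a placeholder for whichever export you download:

    # Minimal loading sketch; the file name below is a placeholder for whichever
    # GLES 2017 export you download from GESIS and place in the datasets folder.
    import pandas as pd

    df = pd.read_stata("datasets/gles_2017.dta", convert_categoricals=False)
    print(df.shape)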
To reproduce the experiments from the paper, run:
python evaluation/rq1_gles2017_vote_cls.py
python evaluation/rq1_gles2017_vote_cls_no_party_id.py
python evaluation/rq2_gles2017_vote_cls_uni_and_school.py
python evaluation/rq2_gles2017_vote_cls_uni_and_school_no_id.py
python evaluation/rq2_gles2017_vote_cls_party_id_exclusion.py
(To enable logging, add your wandb entity and project name to the evaluation scripts.)
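For reference, wandb logging is usually configured with a call along these lines; the entity and project names below are placeholders to replace with your own:

    # Placeholder entity/project; replace with your own wandb account and project.
    import wandb

    wandb.init(entity="your-wandb-username", project="your-project-name")
    wandb.log({"accuracy": 0.0})  # example metric logging
    wandb.finish()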
This work received funding from the German Research Foundation (DFG) under project no. 504226141.