Kasper Fyhn Borg

Position: Collective causal reasoning: extracting and modeling networks of causal relations in Danish and English
Categories: Fellows, PhD Fellows 2024
Location: Aarhus University

Abstract:

The early 2020s have been defined by the concurrent global crises of the pandemic and climate change, characterized by complex interplays of causes, effects, and (real and potential) interventions. Communication about these crises reflects rich causal and counterfactual reasoning over collectively negotiated models of the world. At present, the argumentative structure of collective discourse can only be studied qualitatively, which limits the generalizability and scalability of research findings, largely because the task of Causal Relation Extraction (CRE) at scale is underdeveloped in NLP, and non-existent for low-resource languages like Danish.

The project leverages state-of-the-art large language models (LLMs) and few-shot prompt-based training to implement a groundbreaking, computationally assisted approach to CRE at scale: modeling collectively constructed causal models via causal linguistic reports in texts. It represents the first NLP implementation of causal modeling at scale and is developed with multilingual support for both English and Danish. By developing methods to automate the extraction of collective causal models from corpora and to produce interpretable graphs of their underlying structure, we allow causal relations to be investigated empirically and at the scale of public discourse.
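To make the graph-building step concrete, the following is a minimal, hypothetical sketch (not the project's actual implementation) of how cause-effect pairs extracted from texts could be aggregated into a weighted directed graph. The function name, the example relations, and the edge-count weighting scheme are all illustrative assumptions.

```python
# Hypothetical sketch: aggregating extracted causal relations into a
# weighted directed graph. Relations and names are illustrative only.
from collections import defaultdict


def build_causal_graph(relations):
    """Build an adjacency map from (cause, effect) pairs, counting how
    often each directed edge is reported in the corpus."""
    graph = defaultdict(lambda: defaultdict(int))
    for cause, effect in relations:
        graph[cause][effect] += 1
    # Convert nested defaultdicts to plain dicts for easy inspection
    return {cause: dict(effects) for cause, effects in graph.items()}


# Example cause-effect pairs, one per extracted causal linguistic report
relations = [
    ("lockdowns", "reduced transmission"),
    ("emissions", "warming"),
    ("warming", "extreme weather"),
    ("emissions", "warming"),
]

graph = build_causal_graph(relations)
print(graph["emissions"])  # edge weight reflects two separate reports
```

In practice the edge weights could feed downstream analyses, e.g. identifying which causal claims are most frequently asserted across a public debate.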

Causal language is a window into how humans reason causally and counterfactually, a capacity widely held to be the hallmark of human intelligence, and a key topic in research on science and crisis communication, mis/dis-information, and public trust and solidarity. Unlike many computational methodologies, our models and tools will be developed and fine-tuned through application to social scientific research questions. This integrated, research-guided approach ensures that model performance will be evaluated for explainability, interpretability, and robustness by domain experts at every step. Our open-source models and published results will have broad applicability for researchers across disciplines, as well as for external stakeholders like policymakers and public health officials.