Measuring Persuasive Language in Chatbots

The development of chatbots and large language models (LLMs) has progressed rapidly in recent years. Models such as ChatGPT are now widely used in academia, in the media, and by the general public. This development places new demands on our understanding of how these models generate text and what linguistic properties that text exhibits.

This understanding has been the focus of Amalie Brogaard Pauli’s PhD work at Aarhus University over the past three years, funded by the Danish Data Science Academy. Her research examines how persuasive language in text can be measured in a methodologically sound and reliable way, including in text generated by large language models.

“LLMs have made it possible to generate text in highly nuanced styles. This makes it relevant to ask research questions about how we analyse style properties in text, such as persuasive language – and how we do so in a methodologically valid way,” says Amalie Brogaard Pauli, who is affiliated with the Department of Computer Science at Aarhus University.

Language use and misinformation

The interest in this topic arises from a broader societal focus on misinformation and propaganda, which are often highlighted as threats to people’s judgement and decision-making – for example in the domains of health, politics, and news consumption. The COVID-19 pandemic clearly demonstrated how the spread of misinformation can have societal consequences.

With an academic background in mathematics-economics and data science from Aarhus University and the University of Edinburgh, respectively, and experience working with applied AI at the Alexandra Institute, Amalie Brogaard Pauli has approached the issue from a strongly methodological perspective.

“It is not only the content of information that matters, but also the language used to convey it. For that reason, it is relevant to analyse language and style features when we talk about persuasion as a phenomenon,” she says.

In her PhD dissertation, she therefore treats persuasive language as a stylistic, data-driven phenomenon that can be analysed across domains and communicative intentions – without assuming that the language necessarily has a specific effect on the reader.

We are skilled at evaluating persuasive language

Methodologically, Amalie Brogaard Pauli works with comparative evaluations. Rather than being asked to assign an absolute score to how persuasive a text is, people compare texts with similar content and judge which one appears more persuasive.

“Persuasive language is a nuanced and subjective concept. It is difficult to place it on a fixed scale. However, people are relatively good at comparing two texts and judging which one contains more persuasive language,” she explains.

These human judgements are used as training data for models that can analyse differences in persuasive language. The methods make it possible to investigate linguistic patterns across text types and domains.
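To illustrate how pairwise judgements of this kind can be turned into a ranking, here is a minimal sketch that fits a Bradley-Terry model, a standard technique for comparison data. It is an illustration only and is not taken from the dissertation: the function name `fit_bradley_terry`, the text labels "A", "B" and "C", and the judgements themselves are hypothetical.

```python
from collections import defaultdict


def fit_bradley_terry(comparisons, n_iter=200):
    """Estimate a relative score per text from (winner, loser) pairs.

    Uses the classic iterative (minorise-maximise) update for the
    Bradley-Terry model. Illustrative sketch, not the author's method.
    """
    items = sorted({text for pair in comparisons for text in pair})
    wins = defaultdict(int)          # number of comparisons each text won
    pair_counts = defaultdict(int)   # number of times each pair was compared
    for winner, loser in comparisons:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1

    strength = {text: 1.0 for text in items}
    for _ in range(n_iter):
        updated = {}
        for i in items:
            denom = 0.0
            for j in items:
                if i == j:
                    continue
                n_ij = pair_counts.get(frozenset((i, j)), 0)
                if n_ij:
                    denom += n_ij / (strength[i] + strength[j])
            # Small floor keeps a text with zero wins from collapsing to 0.
            updated[i] = max(wins[i] / denom, 1e-9) if denom > 0 else strength[i]
        # Rescale so the scores average to 1; only ratios are meaningful.
        total = sum(updated.values())
        strength = {t: v * len(items) / total for t, v in updated.items()}
    return strength


# Hypothetical judgements: each tuple is (more persuasive, less persuasive).
judgements = [
    ("A", "B"), ("A", "B"), ("B", "A"),
    ("A", "C"), ("A", "C"),
    ("B", "C"), ("C", "B"), ("B", "C"),
]
scores = fit_bradley_terry(judgements)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

The resulting scores only have meaning relative to each other, which matches the idea that persuasive language is easier to compare than to place on a fixed scale.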

Chatbots may reflect gender stereotypes

The methods are also applied to analyse text generated by large language models. Here, Amalie Brogaard Pauli examines, among other things, how the degree and type of persuasive language vary depending on model choice, instructions, and context. For example, her research indicates that LLMs can systematically generate gender-stereotypical linguistic patterns when, under specific experimental setups, they are instructed to formulate persuasive messages and arguments directed at women and men.

“I study how the language produced by the models can be described and measured – not the actual effect of persuasion in terms of influencing people’s actions or beliefs,” she says.

“In the longer term, however, a relevant research question may be whether such analyses of the language use can serve as a data basis for studies that investigate the effects of persuasion.”

Amalie Brogaard Pauli’s PhD dissertation consists of five scientific articles and contributes new methods for evaluating stylistic and subjective aspects of language in natural language processing (NLP). At the same time, her research points to a growing need for more nuanced and methodologically well-founded evaluation frameworks as language models take on an increasingly prominent role in society.
