New paper by TWON researcher Simon Münker: Fingerprinting LLMs through Survey Item Factor Correlation: A Case Study on Humor Style Questionnaire

We are proud to announce that our researcher Simon Münker has published a new paper titled “Fingerprinting LLMs through Survey Item Factor Correlation: A Case Study on Humor Style Questionnaire”. It appears in the Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, and the results will be presented in Shanghai on 5 November.

LLMs increasingly engage with psychological instruments, yet how they represent psychological constructs internally remains poorly understood. Simon Münker introduces a novel approach to “fingerprinting” LLMs through their factor correlation patterns on standardized psychological assessments, deepening our understanding of how LLMs represent such constructs. Using the Humor Style Questionnaire as a case study, he analyzes how six LLMs represent and correlate humor-related constructs, compared with human survey participants. His results show that the models exhibit little similarity to human response patterns, whereas subsamples of human participants demonstrate remarkably high internal consistency. Exploratory graph analysis further confirms that no LLM successfully recovers the four constructs of the Humor Style Questionnaire. These findings suggest that, despite advances in natural language capabilities, current LLMs represent psychological constructs in fundamentally different ways than humans do, calling into question the validity of their use as human simulacra.
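The fingerprinting idea can be illustrated with a small sketch: score each factor as the mean of its items, correlate the factor scores, and compare the resulting correlation matrices between respondent groups (or models). This is an illustrative reconstruction, not the paper's actual pipeline; the factor names, the 7-point scale, and the Frobenius distance are assumptions.

```python
import numpy as np

def factor_scores(responses, factor_items):
    """Average each respondent's item ratings within each factor."""
    return np.column_stack(
        [responses[:, items].mean(axis=1) for items in factor_items.values()]
    )

def factor_correlation_fingerprint(responses, factor_items):
    """Correlation matrix between factor scores -- the 'fingerprint'."""
    return np.corrcoef(factor_scores(responses, factor_items), rowvar=False)

def fingerprint_distance(fp_a, fp_b):
    """Frobenius distance between two fingerprints (0 = identical)."""
    return float(np.linalg.norm(fp_a - fp_b))

# Toy data: 100 simulated respondents, 8 Likert items (1-7),
# grouped into two hypothetical 4-item factors.
rng = np.random.default_rng(0)
responses = rng.integers(1, 8, size=(100, 8)).astype(float)
factors = {"affiliative": [0, 1, 2, 3], "self-defeating": [4, 5, 6, 7]}

fp = factor_correlation_fingerprint(responses, factors)
print(fp.shape)  # (2, 2)
```

In this toy setup, a model whose fingerprint sits far from every human subsample's fingerprint would count as representing the constructs differently.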

It’s a wrap: CitizenLab 2025 in Chemnitz

On 8 October, we hosted another CitizenLab in the Stadthallenpark in Chemnitz, where we got to speak with citizens about our research on Online Social Networks.

We presented our demonstrators MicroTWONY, MacroTWONY, and TWONderland to interested citizens and had inspiring conversations about the impact of Online Social Networks on society and democracy, as well as about possibilities for regulation and ethical design. We are glad that so many participants enjoyed experimenting with the demonstrators and exploring how digital dynamics become tangible!

In the evening, we joined an interesting event on memory culture in digital spaces at the NSU Documentation Center with TWON researcher Jonas Fegert, journalist Nhi Le and Susanne Siegert from the channel @keineerinnerungskultur, moderated by Benjamin Fischer. The discussion focused on the opportunities social networks offer for democratic education, especially for younger audiences, and on the limitations imposed by platform mechanisms that tend to amplify hate speech and misinformation.

A day full of dialogue, reflection, and future perspectives – thank you to everybody who was a part of it, and we’re looking forward to the next CitizenLab!

New publication: Can we use automated approaches to measure the quality of online political discussion?

We’re proud to announce that our consortium members Sjoerd Stolwijk, Damian Trilling (both University of Amsterdam) and Simon Münker (Trier University) contributed to a freshly published paper on measuring the quality of online political discussions. The paper appeared open access in the journal Communication Methods and Measures (Routledge).

Our researchers review how debate quality has been measured in communication science and systematically compare 50 automated metrics against a large set of manually coded comments. Based on these experiments, they give clear recommendations for how to (not) measure debate quality in terms of interactivity, diversity, rationality, and (in)civility, following Habermas.
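As a minimal illustration of how an automated metric can be validated against manual coding, one can compare each metric's labels to the human codes with an agreement score such as F1. This sketch is not the paper's evaluation code; the labels and data are invented, and F1 is only one of several plausible measures.

```python
def f1_binary(y_true, y_pred):
    """F1 score for a binary label (1 = e.g. 'uncivil') against human codes."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

human = [1, 0, 1, 1, 0, 0, 1, 0]      # manual incivility codes (invented)
metric_a = [1, 0, 1, 0, 0, 0, 1, 0]   # a conservative automated metric
metric_b = [1, 1, 1, 1, 1, 1, 1, 1]   # a metric that flags everything

print(f1_binary(human, metric_a), f1_binary(human, metric_b))
```

Ranking many automated metrics by such a score against the same human-coded sample is the basic logic behind comparisons of this kind.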

Their results show that transformer models and generative AI (such as Llama and the GPT models) outperform older methods, but performance varies: success depends on the concept being measured, and some concepts (e.g. rationality) remain difficult to capture even for human coders. Which measure should be preferred for future empirical applications likely depends on the objective of the study in question. For some genres, languages and communication styles (e.g. satire), it is strongly advised to test the accuracy of automated methods against human interpretation beforehand, even for widely used methods. Some approaches and implementations performed so poorly that they are not suitable for studying debate quality.

Zero-shot prompt-based classification @ACL Vienna

Simon Münker recently presented his research on the use of zero-shot, prompt-based classification for analysing political discourse on German Twitter during the European energy crisis at the 2025 Annual Meeting of the Association for Computational Linguistics (ACL) in Vienna. He gave a poster presentation and a talk about his newly published paper.

In their paper, Dr. Achim Rettinger, Kai Kugler and Simon Münker assess advancements in NLP, specifically large foundation models, for automating annotation processes on German Twitter data concerning European crises.

The study explores how recent advances in large language models (LLMs) can reduce the need for time-consuming manual work when labeling and categorizing social media content. Instead of training models on thousands of examples, LLMs can follow written prompts to classify tweets in a zero-shot setting, meaning without prior training on the specific task.
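A zero-shot setup of this kind can be sketched as a prompt template plus a thin parser around any text-completion model. The stance labels, prompt wording, and the `complete` callable below are hypothetical stand-ins, not the categories or models used in the paper.

```python
LABELS = ["supportive", "critical", "neutral"]  # illustrative labels only

PROMPT = (
    "Classify the stance of this German tweet about the energy crisis.\n"
    "Answer with exactly one word from: {labels}.\n\n"
    "Tweet: {tweet}\nStance:"
)

def classify_zero_shot(tweet, complete):
    """`complete` is any text-completion callable (an LLM API in practice)."""
    reply = complete(PROMPT.format(labels=", ".join(LABELS), tweet=tweet))
    # Take the first recognised label in the reply; fall back to 'neutral'.
    for label in LABELS:
        if label in reply.lower():
            return label
    return "neutral"

# Stubbed model for illustration -- a real run would call an LLM endpoint.
fake_llm = lambda prompt: " Critical."
print(classify_zero_shot("Die Strompreise sind unbezahlbar!", fake_llm))
```

No task-specific training data is involved; the classification behavior is specified entirely by the prompt, which is the defining property of the zero-shot setting.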

The data were collected from German Twitter, guided by survey questions from the SOSEC project about the energy crisis in winter 2022/23. Two domain experts, both native speakers, annotated a random sample of around 7,000 tweets.

The evaluated models included: a baseline Naive Bayes classifier using token counts; a fine-tuned German-specific BERT transformer (“gbert-base”), further adapted with additional pretraining on domain-specific tweets to improve domain relevance; and instruction-tuned models based on T5, which classify texts from written prompts in a zero-shot setting, without domain-specific fine-tuning.

The results show that prompt-based approaches perform almost as well as fine-tuned BERT models, leading the study to conclude that zero-shot prompting can achieve comparable performance without requiring any annotated training data.

However, the study also emphasizes limitations, such as biases inherited from (and potentially amplified by) the training data, differences in outcomes depending on the language used (German vs. English), and cultural nuances.

Automating the analysis of political and social debates raises questions about the role AI can and should play in interpreting sensitive public discourse.

Panel discussion: TWON researcher Jonas Fegert on “Who owns AI? On democratization, control and power relations”

On July 14th, TWON researcher Jonas Fegert (FZI Research Center for Information Technology), was invited as a panelist to the event “Who owns AI? On democratization, control and power relations” hosted by the House for Journalism and the Public Sphere in Berlin. The panel discussion explored how artificial intelligence can be shaped and governed democratically and what social, political and technological conditions are needed to make that possible.

At the heart of the discussion were fundamental questions about power structures in the field of AI. Today, artificial intelligence influences many areas of life, from work and education to everyday decision-making. Yet major developments in this space are often driven by large tech corporations without meaningful input from democratic institutions or the public. The panel reflected on what it could mean to democratize AI, who should have a say in its direction and what roles parliaments, research institutions and civil society can play in this process.

The event offered a valuable opportunity to engage with international experts from philosophy, social science and technology ethics. Many thanks to the organizers for the invitation and the insightful discussion.

New Publication: The Dual Impact of Virtual Reality: Examining the Addictive Potential and Therapeutic Applications of Immersive Media in the Metaverse

We are excited to share a new publication in Information, Communication & Society, titled “The Dual Impact of Virtual Reality: Examining the Addictive Potential and Therapeutic Applications of Immersive Media in the Metaverse” by Ljubiša Bojić, Jörg Matthes, Agariadne Dwinggo Samala, and Milan Čabarkapa.

As virtual reality (VR) technologies evolve rapidly and become central to the emerging metaverse, their influence on individuals and society is growing as well. The study takes a closer look at how core features of VR, such as immersion, interactivity, real-time access, and personalization, can have both harmful and helpful effects. Reviewing 44 peer-reviewed papers, the authors found that 19 studies identified these features as contributing to addictive behaviors, while 25 showed that the same features could be used to support addiction treatment, mental health care, and pain management.

This duality highlights the need to stop viewing VR as simply beneficial or harmful. Instead, it should be understood as a tool that can shape user behavior in different ways depending on design, context, and regulation. This work advances current understanding of VR by framing it as a media environment that closely mimics reality and deeply engages the senses, making it both more compelling and potentially more risky than previous technologies. It also contributes to the growing discussion on how immersive technologies may change not only health and social interactions but also norms around communication, attention, and emotional well-being.

Based on these findings, the authors advocate for stronger policy frameworks and design strategies to prevent overuse and media addiction. Suggestions include time-tracking tools, algorithmic diversity, and content moderation to avoid filter bubbles (communicative feedback loops). At the same time, the therapeutic potential of VR should be developed further in clinical settings, where immersive environments can be used safely and ethically to support well-being.

📘 Access the full article here: https://doi.org/10.1080/1369118X.2025.2520005

Paper announcement: Does GPT-4 surpass human performance in linguistic pragmatics?

We are excited to announce that our TWON colleague Ljubisa Bojić of the University of Belgrade has published an extensive study addressing a compelling question: Does GPT-4 surpass human performance in linguistic pragmatics? The paper explores whether large language models (LLMs) are capable of understanding nuanced, often implied meanings in human communication that go beyond the literal and depend on context, irony, sarcasm, or subtle conversational cues.

The study examined five LLMs (GPT-2, GPT-3, GPT-3.5, GPT-4, and Google’s Bard) alongside two groups of human participants: Serbian speakers of English as a second language and U.S. native English speakers. Each model and participant was asked to interpret a series of dialogue-based tasks specifically designed to test pragmatic understanding, drawing on Gricean communication principles such as relevance, clarity, and implicature. Their responses were evaluated using a standardized five-point scale, where a score of ‘1’ indicated poor or superficial understanding, and a ‘5’ signaled a deep and accurate interpretation of implied meaning, including the detection of sarcasm, irony, and other contextual subtleties.

The results were more than interesting. GPT-4 not only outperformed all other AI models but also exceeded the performance of the human participants, achieving an average score of 4.80 compared to the highest human score of 4.55. On average, human participants scored significantly lower (the Serbian group averaged 2.80 and the US group 2.34), while the LLMs overall averaged 3.39. GPT-4 ranked first among all 155 evaluated participants.

These findings carry important real-world implications. If AI can consistently interpret pragmatic cues better than humans, it could lead to more advanced and intuitive interactions between people and machines. For instance, this could dramatically enhance the capabilities of virtual assistants, customer service bots, and social robots, making them more adept at recognizing intent, tone, and emotion. Such improvements could prove especially valuable in fields like mental health, education, and conflict resolution, where reading between the lines is often crucial.

At the same time, these advances raise important ethical considerations. As we begin to rely more on AI for interpreting nuanced communication, there is a risk of misinterpretation or misuse, particularly in sensitive contexts. It also raises questions about accountability and the potential consequences of AI misunderstanding or manipulating human intent.

In short, while the ability of GPT-4 to surpass human performance in linguistic pragmatics marks a major milestone for AI, it also underscores the need for thoughtful, responsible integration of such technologies into society. The study offers a glimpse into the future of human–AI communication; one that is more natural, perceptive, and possibly more capable than we previously imagined.

Find the open-access paper here

Recap: The 1st Workshop on Semantic Generative Agents on the Web (SemGenAge 2025)

June 2nd, 2025 – Portorož, Slovenia | Part of ESWC 2025

The 1st Workshop on Semantic Generative Agents on the Web, held on June 2nd in Portorož, Slovenia, as part of the Extended Semantic Web Conference (ESWC 2025), marked a key milestone in disseminating the goals and findings of the TWON project to the academic community. The event brought together researchers from diverse disciplines to explore how Semantic Web technologies and Large Language Models (LLMs) can be combined to develop intelligent, interpretable, and communicative agents for the web.

Opening Keynote

The workshop opened with a keynote by Matthias Nickles (National University of Ireland, Galway), who presented a comprehensive overview of the history and recent advancements in generative agents within computer science, setting the stage for the diverse presentations to follow.

Paper Presentations

Jan Lorenz (Constructor University) kicked off the presentations with a talk on Filter Bubbles in an Agent-Based Model Where Agents Update Their Worldviews with LLMs. His work replaced abstract numerical opinion spaces with LLM-generated human-like statements to simulate opinion dynamics. The goal was to assess whether filter bubbles would still emerge in this more realistic setting and to examine the practical integration of LLMs into agent-based simulations.

Next, Martin Žust (Jožef Stefan Institute) presented a web-based negotiation agent designed to assist unskilled negotiators in real time. The agent transcribes dialogue, builds dynamic world models, and combines analytical reasoning with human-like intuition to offer context-aware negotiation support. This hybrid approach aims to enhance interpersonal outcomes through collaborative human-AI interaction.

Abdul Sittar (Jožef Stefan Institute) followed with an agent-based simulation of social media engagement during German elections. By incorporating past conversational history, motivational factors, and resource constraints, the model used fine-tuned AI to generate posts and replies, applying sentiment analysis, irony detection, and offensiveness classification. The findings highlighted how historical context shapes AI responses and how behavior shifts under different temporal constraints.

Afternoon Keynote and Talks

In the afternoon keynote, Denisa Reshef Kera (Bar-Ilan University) addressed philosophical perspectives on generative agents, focusing on bias, representation, and agency. She emphasized the role of generative agents in public policy and civic participation, highlighting their potential for enhancing digital society.

Ljubisa Bojic (University of Belgrade) presented an innovative AI-based recommender system designed to reduce echo chambers and polarization. His model incorporates emotional tone, content diversity, and political balance into the recommendation process, improving content exposure without sacrificing accuracy. The approach aligns with ethical AI principles, offering user autonomy through customizable preferences.
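A recommender of this general shape can be sketched as a weighted re-ranking over per-item scores for relevance, diversity, and balance. The field names and weights below are illustrative assumptions, not the model presented at the workshop.

```python
def rerank(candidates, weights=(0.6, 0.2, 0.2)):
    """Order feed items by a weighted sum of per-item scores in [0, 1].

    weights = (w_relevance, w_diversity, w_balance); a pure-relevance
    ranker is the special case weights=(1, 0, 0).
    """
    w_rel, w_div, w_bal = weights
    score = lambda c: (
        w_rel * c["relevance"] + w_div * c["diversity"] + w_bal * c["balance"]
    )
    return sorted(candidates, key=score, reverse=True)

# Toy feed: item "a" is highly relevant but one-sided; "b" is slightly
# less relevant but adds diversity and political balance.
feed = [
    {"id": "a", "relevance": 0.9, "diversity": 0.1, "balance": 0.2},
    {"id": "b", "relevance": 0.7, "diversity": 0.9, "balance": 0.9},
]
print([item["id"] for item in rerank(feed)])
```

With the default weights, the diverse and balanced item "b" outranks the purely relevant item "a", illustrating how such a system can improve content exposure without discarding relevance entirely.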

Denisa Reshef Kera returned with Avital Dotan to present AI Beyond Rules, Heuristics, and Dreams, introducing the concept of ergative-absolutive AI agents. Drawing on linguistic structures from languages like Basque, they proposed a new way of modeling agency in LLMs—treating them not just as predictors but as performative systems that enact grammar and interaction. Their two-step framework involves analyzing grammatical alignments and creating participatory simulations with diverse agent alignment patterns to encourage adaptive and accountable behavior.

Simon Münker (University of Trier) concluded the paper presentations with twony, a micro-simulation platform that models emotional contagion and discourse dynamics in online social networks. Using fine-tuned BERT models and LLMs to simulate politically engaged personas, twony visualizes emotional cascades in various feed algorithm scenarios—offering a powerful, open-source tool for explaining polarization and online behavior.

Closing Discussion

The workshop concluded with a fishbowl discussion featuring Achim Rettinger, Damian Trilling, Marko Grobelnik, Matthias Nickles, and Denisa Reshef Kera. The panel reflected on the interdisciplinary insights presented throughout the day and discussed future directions for generative agents in real-world applications.

Takeaways

SemGenAge 2025 fostered rich dialogue across fields including semantic web technologies, AI, computational social science, and digital media studies. Discussions emphasized the potential of generative agents in areas such as online discourse moderation, content recommendation, opinion shaping, and consumer behavior analysis.

The workshop’s insights will directly support TWON’s mission: combining empirical observations, simulation, and participatory methods to create evidence-based recommendations for improving social network regulation and enhancing digital citizenship.

For full program details, visit the official workshop page.

Fifth TWON Consortium Meeting in Portorož, Slovenia

From May 30 to June 1, all nine TWON partner institutions gathered in Portorož, Slovenia, for the fifth TWON Consortium Meeting. This in-person event offered a key opportunity to assess our progress, align goals, and prepare for the final year of the project.

The meeting began with the general assembly led by consortium leader Damian Trilling, where we reviewed project milestones, discussed ongoing challenges, and set priorities for the months ahead. A central objective was to optimize integration across TWON’s thematic and methodological strands, reinforcing the coherence of our collective efforts. We then focused on planning our large-scale simulations — from sharpening research questions to technical implementation. The day concluded with a consortium dinner.

On Sunday, we held a workshop on design features of a democracy-preserving online social network, as a step towards developing policy recommendations. Later on, we had a session for planning the upcoming Citizen Labs, where we discuss our research with citizens. The day concluded with a plenary wrap-up and an early-career researcher event, where we had the chance to discuss our PhD projects and give each other feedback.

The Portorož meeting not only advanced TWON’s agenda but also reinforced collaboration at a critical stage of the project. With renewed momentum, the consortium is well-prepared for the final project year.

Thank you, Jožef Stefan Institute, for hosting us in beautiful Slovenia!

TWON Citizen Lab #2 in Vienna: On Tackling Hate, Misinformation and Polarisation in the Age of AI and Tech-Oligarchs

In TWON, we not only want to translate our scientific results into actionable recommendations for decision-makers in politics and industry – we also want to foster digital citizenship and public debate on the role Online Social Networks should play in our society. This is why our CitizenLabs are an essential part of TWON!

The Citizen Labs are conducted by TWON consortium member “DialoguePerspectives. Discussing Religions and Worldviews e.V.”, which trains young European leaders to become experts in a new, society-oriented interreligious-worldview dialogue. The program brings together participants from diverse communities and backgrounds, encompassing individuals of 19 different religions and beliefs across 25 European countries. Through their unique perspectives and expertise, they contribute to fostering understanding, cooperation, and a pluralistic, democratic, and cohesive Europe.

Following the first Citizen Lab in Karlsruhe in September 2024, the second Citizen Lab recently took place in Vienna from May 11 to 14, 2025! 35 young leaders from various communities all over Europe came together to discuss a digital and pluralistic European future. The program was a mixture of input and discussion from our TWON researchers, impulses from external researchers and civil society organizations, reflective workshops, and a public evening event – all centered around tackling hate, misinformation and polarization in the age of AI and tech-oligarchs.

Dr. Jonas Fegert opened the discussion by introducing the group to the TWON project, its goals, and the need for it. Prof. Dr. Damian Trilling (University of Amsterdam) then opened a critical conversation about the limits of current research on social media dynamics, challenging our assumptions about echo chambers, filter bubbles, and the spread of disinformation. His interactive talk underscored how intuitive beliefs often outpace empirical evidence, and invited us to think more deeply about what we can actually measure.

Prof. Dr. Achim Rettinger (Trier University) tackled the complex intersection of AI agents and online discourse. Can AI replace us in some communicative functions — and should it? His workshop addressed both the dangers and opportunities of algorithmic content curation, especially in shaping public opinion and emotional response.

We also had the chance to present our demonstrators micro & macro TWONy to the public! Led by Simon Münker (Trier University) and Fabio Sartori (KIT), participants explored the tools with which we try to make our simulations with generative agents tangible. The hands-on experience allowed for nuanced discussions about how different ranking logics affect emotional dynamics.

FZI researcher Cosima Pfannschmidt led a workshop on envisioning a democratic online social network of the future. What would such a platform look like? Who would own and govern it? How should content moderation be organized? Which criteria should the ranking algorithm prioritize? While it is crucial to research the negative effects of online social networks, it is equally important to develop actionable, democratic alternatives.

At our public evening event in the Vienna Co-Innovation Factory, we dove deeper into the topic of digital democracy. Moderated by Dr. Jonas Fegert (FZI), Prof. Dr. Achim Rettinger (Uni Trier & FZI), Benjamin Fischer (CeMAS), Judith Peterka (TWON Advisory Board), Natascha Strobl (Expert on Right-wing Extremism and the New Right) and Dr. Sebastian Heidebrecht (EIF – Centre for European Integration Research, Vienna University) discussed “Digital Democracy and the Power of Platforms: Policy, AI, and Accountability”. In a second panel session, Alina Bricman (Director of EU Affairs at B’nai B’rith International), Rosa Jellinek (Activist, Social Media Expert, Keshet Deutschland e.V.), Selin Aydın (Programme Manager, CLAIM – Alliance Against Islamophobia and Anti-Muslim Hate) and Stefania Manca (Institute of Educational Technology, Italian National Research Council) discussed “Countering Hate and Information Manipulation – Strategies for a Safer Digital Sphere”.

Beyond this, the Citizen Lab included fantastic inputs from external researchers and civil society organizations:

  • Algorithmic Amplification of Hate Speech and Misinformation with Dr. Ing. Even Kapros (Strategic Designer and Researcher on HCI, UX, and Ethics, CEO and founder of Endurae, Strategic Advisor with Project Arc)
  • Foreign Influence: Digital Manipulation by Authoritarian States and Paths to Platform Accountability with Julia Smirnova (Senior Researcher, CeMAS)
  • What’s Working, What’s Not: Recommender Systems and Platform Accountability with Dr. Julia Neidhardt (Head of the CD Lab for Recommender Systems, UNESCO Co-Chair for Digital Humanism, TU Vienna)
  • Voices That Echo: Holocaust Memory, Digital Activism and Peacebuilding in the Social Media Age with Dr. Stefania Manca (Institute of Educational Technology, Italian National Research Council)
  • Social Media, AI, Disinformation, and Freedom of Speech with Nuriyatul Lailiyah (Assistant Professor, Communication Department, Faculty of Social and Political Sciences, Diponegoro University, Semarang, Central Java)
  • World Café Discussions on Local Realities in the Digital Sphere with IGGÖ – Islamische Glaubensgemeinschaft in Österreich, JöH – Jüdische österreichische Hochschüler:innen and SEEDS – Security Education by Empowering Democratic Strength

The Citizen Lab concluded with a call to action: How can we translate these insights into political change? To wrap up the rich discussions, we developed policy proposals aimed at regulating algorithmic systems, increasing transparency, and creating digital environments where diverse voices can thrive. With DialoguePerspectives’ participatory approach we ensure that our policy recommendations are relevant, comprehensible, linked to the ongoing public debate and reflect the lived experiences of the diverse European communities. Participants worked on an existing TWON policy brief, which was previously developed in an iterative process with input from TWON researchers and the previous CitizenLab. The Vienna Citizen Lab 2025 reminded us that digital spaces are not neutral — they are designed, and as such, can be redesigned. To build an inclusive, democratic future, we need to shape not only the rules of online discourse, but the very structures that host them.

We are deeply grateful for the Vienna Citizen Lab 2025, hosted by TWON-partner DialoguePerspectives. What an unforgettable gathering that brought together European leaders committed to shaping democratic digital spaces! A heartfelt thank you to DialoguePerspectives and all those who made this experience so meaningful. It was an honor to be part of a space where listening, questioning, and reimagining Europe is not only possible—but already happening.

Panel Discussion on Countering Hate and Information Manipulation
TWON researchers at the Citizen Lab: Damian Trilling, Fabio Sartori, Jonas Fegert, Kira Wisniewski, Cosima Pfannschmidt and Simon Münker (from left to right)