Polar@SemEval-2026
Task 9: Detecting Multilingual, Multicultural and Multievent Online Polarization

Attitude Polarization Detection in Multilingual, Multicultural and Multievent Contexts

What Is POLAR?

POLAR is a multilingual, multicultural, and interdisciplinary initiative focused on detecting and understanding attitude polarization in online discourse. From political debates and religious conflicts to ethnic tensions and gender bias, polarized language fuels misunderstanding, hate, and social fragmentation—especially in online spaces.

We created POLAR to address this growing problem by bringing together AI, linguistics, sociology, and ethics to develop responsible NLP systems that work across languages, events, and cultures.

Why POLAR Matters

Conventional NLP tasks often focus on sentiment or toxicity in a single language or event context. But real-world polarization is more complex:

It occurs in multiple languages simultaneously
It evolves with current events
It uses implicit framing, not just explicit hate
It targets identities, ideologies, and institutions
It varies across cultures and political systems

POLAR fills this gap by designing resources, models, and benchmarks that can capture these nuances, offering a more realistic view of global online discourse.

What Does POLAR Do?

Develops multilingual datasets annotated for polarization
Supports low-resource languages and cross-cultural narratives
Designs interpretable models that show how and why text is polarizing
Collaborates globally with researchers, annotators, and domain experts
Drives responsible AI for social good, policy impact, and academic insight

Who Is Behind POLAR?

POLAR is led by an interdisciplinary team of:

Computational linguists and NLP researchers
Sociologists and political scientists
Annotators and native speakers from diverse linguistic backgrounds
Collaborators and contributors from universities and institutions worldwide

Our shared commitment is to open-source, ethically informed, and impact-driven research that bridges the gap between language and society.

Motivation

In today’s digitally connected world, online polarization is rapidly rising—splitting communities, amplifying misinformation, and worsening conflicts. Social media and online discourse are filled with increasingly hostile language targeting political views, religions, ethnic groups, and even entire nations.

While NLP has made impressive strides in sentiment analysis and hate speech detection, polarization remains understudied, especially across languages, cultures, and real-world events.

What Do We Want to Achieve?

This task sets out to develop and evaluate NLP systems that can:

Detect polarization in diverse texts and languages
Understand who or what is being targeted—religions, political ideologies, races, or genders
Interpret how polarization is expressed—via framing, dehumanization, stereotyping, and more
Operate effectively across high-, mid-, and low-resource languages
Work on real-world events like elections, protests, and international conflicts
Provide interpretable outputs, not just predictions
Promote ethical, explainable AI in socially sensitive contexts

Long-Term Vision

By building robust, explainable models of online polarization, we hope to:

Enable early detection of toxic or polarizing discourse
Support moderation and peacebuilding efforts
Advance cross-linguistic and cross-cultural NLP
Facilitate interdisciplinary research at the intersection of language, society, and technology
Build datasets and benchmarks that can be reused and extended by the community

A Step Toward Responsible NLP

We believe NLP should not only understand what people say, but also why they say it and how it impacts society. This task contributes to that goal by creating:

Meaningful annotations
Interpretable models
Public datasets
Open discussions about bias, ethics, and fairness

Your Contribution Matters

Every team that participates helps push the boundaries of multilingual, socially aware AI. Together, we can build systems that do more than classify: they can help explain, caution, and connect.

Join us in making NLP not just more powerful, but also more responsible.

Task Overview

Polarization refers to the division of opinions into two sharply contrasting groups, often accompanied by hostility, intolerance, or exclusion. In today's digital era, polarization is intensifying across platforms and geographies, influencing public discourse, exacerbating conflicts, and contributing to societal fragmentation.

This shared task is the first SemEval initiative focused on polarization, aiming to advance the computational understanding of how polarization manifests in text across multiple languages, cultures, and event types. Participants will develop models capable of detecting and interpreting polarization in a variety of online contexts.

The task centers on textual data collected from real-world events such as elections, international conflicts, social protests, and ideological debates. The primary goal is to evaluate systems’ ability to identify polarized content and classify its targets.

Multilingual, Multicultural, and Multievent Scope

To promote global inclusivity and cross-cultural representation, the dataset covers multiple languages, including many mid- and low-resource languages that are underrepresented in mainstream NLP research.

Task Format and Subtasks

Participants may choose to compete in one or more of three subtasks:

Subtask 1: Polarization Detection – Binary classification to determine whether a post contains polarized content (Polarized or Not Polarized).
Subtask 2: Polarization Type Classification – Identify the target of polarization, including political groups, religious groups, racial/ethnic communities, gender identities, sexual orientations, or other domain-specific targets.
Subtask 3: Manifestation Identification – Classify how polarization is expressed. Multiple labels may apply to a single post, such as stereotyping, vilification, dehumanization, deindividuation, extreme language, lack of empathy, and invalidation.
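One way to picture a labeled instance across the three subtasks is the record sketched below. It is only an illustration: the field names (`text`, `lang`, `polarized`, `targets`, `manifestations`), the label spellings, and the rule that a non-polarized post carries no target or manifestation labels are all assumptions, not the official data format. Subtask 2 is also treated here as allowing several targets per post, which the task description does not confirm.

```python
from dataclasses import dataclass, field

# Label inventories paraphrased from the subtask descriptions; the exact
# spellings used in the released data are an assumption.
TARGETS = frozenset({
    "political", "religious", "racial_ethnic",
    "gender", "sexual_orientation", "other",
})
MANIFESTATIONS = frozenset({
    "stereotyping", "vilification", "dehumanization", "deindividuation",
    "extreme_language", "lack_of_empathy", "invalidation",
})


@dataclass(frozen=True)
class Instance:
    """Hypothetical record holding the labels for all three subtasks."""
    text: str
    lang: str
    polarized: bool                                                # Subtask 1
    targets: frozenset = field(default_factory=frozenset)          # Subtask 2
    manifestations: frozenset = field(default_factory=frozenset)   # Subtask 3

    def __post_init__(self):
        # Reject label names that are not in the assumed inventories.
        unknown = (self.targets - TARGETS) | (self.manifestations - MANIFESTATIONS)
        if unknown:
            raise ValueError(f"unknown labels: {sorted(unknown)}")
        # Assumed consistency rule: only polarized posts carry finer labels.
        if not self.polarized and (self.targets or self.manifestations):
            raise ValueError("non-polarized instance has subtask 2/3 labels")


example = Instance(
    text="(placeholder text)",
    lang="en",
    polarized=True,
    targets=frozenset({"political"}),
    manifestations=frozenset({"vilification", "extreme_language"}),
)
```

Validating labels at construction time keeps misspelled or inconsistent annotations from reaching a model silently.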

Data Description

The dataset is sourced from news websites, Reddit, blogs, Bluesky, and regional forums, covering event types like elections, conflicts, gender rights, migration, and more. Each language includes between 3,000 and 5,000 annotated instances.

Annotation tools and platforms used include Label Studio, Prolific, Potato, and Mechanical Turk.

Research Contributions

This task aims to advance socially responsible AI by supporting NLP for low-resource languages and fostering explainable and inclusive NLP systems. It will help establish multilingual benchmarks for polarization detection, promoting fair and transparent computational approaches to understanding societal divisions.

Important Dates

Keep track of the key milestones for your participation in the POLAR task.
All deadlines are 11:59 PM AoE (Anywhere on Earth).

📆 Date	📌 Event
31 March 2025	Call for Participation Opens / Task Proposals Due
8 August 2025	Trial Data Released
1 September 2025	Training Data Released
1 December 2025	Test Data Released (internal deadline; not for public release)
1 January 2026	Evaluation Results Released
10 January 2026	System Submission Deadline / Evaluation Start
31 January 2026	Evaluation Results Released / Evaluation End
1 February 2026	Evaluation End
February 2026	System Paper Submission Deadline
March 2026	Notification of Acceptance
April 2026	Camera-Ready Papers Due
Summer 2026	SemEval-2026 Workshop at [Conference Location TBD]
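Because every deadline is stated as 11:59 PM AoE, it can help to convert a date to UTC before planning a submission. The sketch below relies only on the fact that AoE corresponds to the UTC-12:00 offset; the helper name `aoe_deadline_utc` is made up for this example.

```python
from datetime import datetime, timedelta, timezone

# "Anywhere on Earth" is the UTC-12:00 offset.
AOE = timezone(timedelta(hours=-12), name="AoE")


def aoe_deadline_utc(year: int, month: int, day: int) -> datetime:
    """Return 11:59 PM AoE on the given date, expressed in UTC."""
    local = datetime(year, month, day, 23, 59, tzinfo=AOE)
    return local.astimezone(timezone.utc)


# System submission deadline from the table above: 10 January 2026.
deadline = aoe_deadline_utc(2026, 1, 10)
print(deadline.isoformat())  # 2026-01-11T11:59:00+00:00
```

In other words, an AoE deadline falls at 11:59 UTC on the following calendar day.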

Starter Kit

🧪 Task Format and Subtasks

Participants may choose to compete in one or more of the following three subtasks:

Subtask 1: Polarization Detection

Binary classification: Identify whether a post contains polarized content.

Labels: Polarized, Not Polarized

Subtask 2: Polarization Type Classification

Classify the target of polarization.

Political groups or ideologies
Religious groups or beliefs
Racial or ethnic communities
Gender identities
Sexual orientations
Other/domain-specific targets

Subtask 3: Manifestation Identification

Classify how polarization is expressed. Multiple labels possible.

Stereotyping
Vilification
Dehumanization
Deindividuation
Use of Extreme Language
Lack of Empathy
Invalidation
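Since a post can carry several manifestation labels at once, a common way to prepare Subtask 3 targets for a multi-label classifier is a multi-hot vector with one position per label. The sketch below assumes the seven labels listed above in a fixed order; the snake_case spellings and the `encode`/`decode` helpers are illustrative, not part of any official starter code.

```python
# Fixed label order; the snake_case spellings are illustrative assumptions.
MANIFESTATION_LABELS = [
    "stereotyping", "vilification", "dehumanization", "deindividuation",
    "extreme_language", "lack_of_empathy", "invalidation",
]
INDEX = {label: i for i, label in enumerate(MANIFESTATION_LABELS)}


def encode(labels):
    """Turn a collection of label names into a 0/1 vector of length 7."""
    vector = [0] * len(MANIFESTATION_LABELS)
    for label in labels:
        vector[INDEX[label]] = 1  # raises KeyError on an unknown label name
    return vector


def decode(vector):
    """Recover label names from a 0/1 vector, in the fixed order."""
    return [label for label, bit in zip(MANIFESTATION_LABELS, vector) if bit]


vec = encode({"vilification", "invalidation"})
print(vec)          # [0, 1, 0, 0, 0, 0, 1]
print(decode(vec))  # ['vilification', 'invalidation']
```

An all-zero vector then naturally represents a post with no manifestation labels.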

📁 Data Description

Dataset sources: News websites, Reddit, blogs, Bluesky, regional forums. Event types include elections, conflicts, gender rights, migration, and more.

Each language has 3,000–5,000 annotated instances. Tools used: Label Studio, Prolific, Potato, Mechanical Turk.
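After downloading the data, a quick sanity check is to confirm that each language falls in the stated 3,000–5,000 range. The sketch below assumes each instance comes with a language tag; the function name and the placeholder tags `xx` and `yy` are invented for the example.

```python
from collections import Counter


def languages_outside_range(lang_codes, low=3000, high=5000):
    """Return {language: count} for languages whose size is outside [low, high]."""
    counts = Counter(lang_codes)
    return {lang: n for lang, n in counts.items() if not low <= n <= high}


# Toy input: one language tag per instance (tags and sizes are made up).
sample = ["xx"] * 3200 + ["yy"] * 120
print(languages_outside_range(sample))  # {'yy': 120}
```

An empty result means every language in the loaded split matches the stated size range.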

🎯 Research Contributions

Advancing socially responsible AI
Supporting low-resource language NLP
Fostering explainable and inclusive NLP systems
Creating multilingual benchmarks for polarization detection

Why Participate?

Top-performing teams will:

Be featured in the official SemEval-2026 proceedings
Present their approach at the SemEval workshop
Gain visibility in the NLP research community

All teams that submit system descriptions will be included in the task overview paper.

Stay Connected

We’d love to hear from you! Whether you're a participant, researcher, or enthusiast, there are multiple ways to stay in the loop and engage with the POLAR shared task.

📧 Mailing List

Subscribe to our mailing list for key announcements, reminders, and dataset releases:
👉 Join the Mailing List

💬 Join the Community

GitHub Discussions
Ask questions, report issues, or suggest improvements.
🔗 Visit our GitHub Repository
Discord (Coming Soon)
We’ll launch a Discord server for real-time discussions and announcements. Stay tuned for the invite link!

📱 Social Media

Twitter/X: @polar_task2026
LinkedIn: POLAR Shared Task

📮 Contact the Organizers

Have a question or want to collaborate? Reach out to us directly:
Email: polarization-semeval-2026-organisers@googlegroups.com
We usually respond within 2–3 business days.
