Subtasks
Our task consists of three subtasks. Participants may choose to compete in one or more of the three subtasks. Each subtask is designed to address a specific aspect of polarization detection and analysis in multilingual social media content.
Dataset Information
Data sources include news websites, Reddit, blogs, Bluesky, and regional forums, covering events such as elections, conflicts, gender rights, migration, and more. Each language contains 3,000–5,000 annotated instances.
A few sample instances are provided here for review, and additional trial data can be found at: TRIAL DATA
Languages Covered
The task covers 22 languages across different cultural and geographical contexts:
Amharic, Arabic, Bengali, Burmese, Chinese, English, German, Hausa, Hindi, Italian, Khmer, Nepali, Odia, Persian, Polish, Punjabi, Russian, Spanish, Swahili, Telugu, Turkish, Urdu.
Subtask 1: Polarization Detection
Binary classification to determine whether a post contains polarized content (Polarized or Not Polarized). In the samples below, 1 marks Polarized and 0 marks Not Polarized.
id | text | polarization |
---|---|---|
2745 | Find yourself a west bank settler gf | 1 |
2738 | Fascist oligarchs now control the USA | 1 |
3184 | Someone end this lunatic before he starts ethnic cleansing | 1 |
1614 | The EU is increasing military aid to Ukraine to one billion euros the press release on the website of the Council of Europe. | 0 |
716 | House drafts bill to strike Iran proxies amid IsraelHamas | 0 |
309 | Contested races across county early voting | 0 |
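For readers who want to experiment with the sample rows, the following is a minimal sketch of reading them into records and counting the two classes. It assumes the pipe-delimited layout shown in the table above and that the text field itself contains no pipe character; the released data files may use a different format. The helper name `parse_rows` is hypothetical.

```python
from collections import Counter

# Sample rows copied from the Subtask 1 table (pipe-delimited as displayed).
SAMPLE = """\
2745 | Find yourself a west bank settler gf | 1 |
2738 | Fascist oligarchs now control the USA | 1 |
3184 | Someone end this lunatic before he starts ethnic cleansing | 1 |
309 | Contested races across county early voting | 0 |
"""


def parse_rows(block):
    """Turn 'id | text | label |' lines into dicts (hypothetical helper).

    Assumes the text field contains no '|' character.
    """
    rows = []
    for line in block.strip().splitlines():
        # Drop the trailing pipe, then split on the remaining separators.
        parts = [p.strip() for p in line.rstrip().rstrip("|").split("|")]
        rows.append(
            {"id": int(parts[0]), "text": parts[1], "polarization": int(parts[2])}
        )
    return rows


rows = parse_rows(SAMPLE)
# Count how many sample posts fall into each class (1 = Polarized, 0 = Not Polarized).
label_counts = Counter(r["polarization"] for r in rows)
```

Counting the classes this way is a quick check on label balance before training a binary classifier.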
Subtask 2: Polarization Type Classification
Multi-label classification to identify the target of polarization. A post may be assigned one or more of the following categories: Political, Racial/Ethnic, Religious, Gender/Sexual, or Other.
id | text | political | racial/ethnic | religious | gender/sexual | other |
---|---|---|---|---|---|---|
2745 | Find yourself a west bank settler gf | 1 | 0 | 0 | 1 | 0 |
2738 | Fascist oligarchs now control the USA | 1 | 0 | 0 | 0 | 0 |
3184 | Someone end this lunatic before he starts ethnic cleansing | 1 | 1 | 0 | 0 | 0 |
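Because each row in the table above carries one 0/1 flag per category, a post can belong to several categories at once. The sketch below shows one way to map such a flag vector back to category names. The column order is taken from the table header; the function name `active_labels` is hypothetical.

```python
# Column order as shown in the Subtask 2 table header.
TYPE_LABELS = ["political", "racial/ethnic", "religious", "gender/sexual", "other"]


def active_labels(flags, names=TYPE_LABELS):
    """Return the names whose flag is 1 (hypothetical helper)."""
    if len(flags) != len(names):
        raise ValueError("expected one flag per label column")
    return [name for name, flag in zip(names, flags) if flag == 1]


# Row 2745 from the table: political and gender/sexual are both set.
labels_2745 = active_labels([1, 0, 0, 1, 0])
# Row 2738 from the table: only political is set.
labels_2738 = active_labels([1, 0, 0, 0, 0])
```

The same helper works for any multi-label row as long as the list of names matches the column order.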
Subtask 3: Manifestation Identification
Multi-label classification to identify how polarization is expressed. A post may carry one or more of the following labels: Stereotype, Vilification, Dehumanization, Extreme Language, Lack of Empathy, or Invalidation.
NOTE: Italian and Russian languages are not included in this subtask.
id | text | stereotype | vilification | dehumanization | extreme_language | lack_of_empathy | invalidation |
---|---|---|---|---|---|---|---|
2745 | Find yourself a west bank settler gf | 1 | 1 | 0 | 0 | 0 | 0 |
2738 | Fascist oligarchs now control the USA | 0 | 1 | 0 | 1 | 0 | 0 |
3184 | Someone end this lunatic before he starts ethnic cleansing | 1 | 1 | 0 | 1 | 0 | 0 |
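Since Italian and Russian are excluded from Subtask 3, the set of languages differs by subtask. The following sketch encodes that rule using the language list given under Languages Covered. The function name `languages_for` and the integer subtask identifiers are assumptions made for illustration.

```python
# The 22 languages listed under "Languages Covered".
LANGUAGES = [
    "Amharic", "Arabic", "Bengali", "Burmese", "Chinese", "English",
    "German", "Hausa", "Hindi", "Italian", "Khmer", "Nepali", "Odia",
    "Persian", "Polish", "Punjabi", "Russian", "Spanish", "Swahili",
    "Telugu", "Turkish", "Urdu",
]

# Per the note in Subtask 3, these two languages are not included there.
EXCLUDED_FROM_SUBTASK_3 = {"Italian", "Russian"}


def languages_for(subtask):
    """Return the languages available for subtask 1, 2, or 3 (hypothetical helper)."""
    if subtask not in (1, 2, 3):
        raise ValueError("subtask must be 1, 2, or 3")
    if subtask == 3:
        return [lang for lang in LANGUAGES if lang not in EXCLUDED_FROM_SUBTASK_3]
    return list(LANGUAGES)


subtask3_languages = languages_for(3)
```

A lookup like this can guard a multilingual training loop against requesting Subtask 3 data for a language that has none.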