---
title: AIxcellent Vibes at GermEval 2025 Shared Task on Candy Speech Detection
emoji: 🍭
colorFrom: yellow
colorTo: pink
sdk: gradio
python_version: "3.12.9"
sdk_version: "5.35.0"
app_file: app.py
pinned: true
---

# AIxcellent Vibes at GermEval 2025 Shared Task on Candy Speech Detection 🍭

## Results
| Subtask | Submission | Model              | (strict) F1 Score | Notebook |
|---------|------------|--------------------|------------------:|----------|
|       1 |          1 | Qwen3-Embedding-8B |             0.875 | [Notebook](https://github.com/dslaborg/germeval2025/blob/main/subtask_1/submission_subtask1.ipynb) |
|       1 |          2 | XLM-RoBERTa-Large  |             0.891 | [Notebook](https://github.com/dslaborg/germeval2025/blob/main/subtask_1/submission_subtask1-2.ipynb) |
|       2 |          1 | GBERT-Large        |             0.623 | [Notebook](https://github.com/dslaborg/germeval2025/blob/main/subtask_2/submission_subtask2.ipynb) |
|       2 |          2 | XLM-RoBERTa-Large  |             0.631 | [Notebook](https://github.com/dslaborg/germeval2025/blob/main/subtask_2/submission_subtask2-2.ipynb) |


## Setup 

```bash
python_version="$(cat .python-version)"

# install the interpreter if it’s missing
pyenv install -s "${python_version}"

# select python version for current shell
pyenv shell "${python_version}"

# create venv if missing
if [[ ! -d venv ]]; then
  python -m venv venv
fi

# activate venv & install packages
source venv/bin/activate

pip install -U pip setuptools wheel
pip install -r requirements.txt
``` 



---


# :trophy: Model

The fine-tuned model is available on [Hugging Face](https://huggingface.co/cortex359/germeval2025).

## Model Details

- **Model Type:** Transformer-based encoder (XLM-RoBERTa-Large)
- **Developed by:** Christian Rene Thelen, Patrick Gustav Blaneck, Tobias Bornheim, Niklas Grieger, Stephan Bialonski (FH Aachen, RWTH Aachen, ORDIX AG, Utrecht University)
- **Paper:** [AIxcellent Vibes at GermEval 2025 Shared Task on Candy Speech Detection: Improving Model Performance by Span-Level Training](https://arxiv.org/abs/2509.07459v2)
- **Base Model:** [XLM-RoBERTa-Large](https://huggingface.co/FacebookAI/xlm-roberta-large) (Conneau et al., 2020)
- **Fine-tuning Objective:** Detection of *candy speech* (positive/supportive language) in German YouTube comments.

## Model Description

This model is a fine-tuned **XLM-RoBERTa-Large** adapted for the **GermEval 2025 Shared Task on Candy Speech Detection**.
It was trained to identify *candy speech* at both:

- **Binary level:** Classify whether a comment contains candy speech.
- **Span level:** Detect the exact spans and categories of candy speech within comments, using a BIO tagging scheme across **10 categories** (positive feedback, compliment, affection declaration, encouragement, gratitude, agreement, ambiguous, implicit, group membership, sympathy).

The span-level model also proved effective for binary detection by classifying a comment as candy speech if at least one positive span was detected.
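The rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the label strings (for example `B-compliment`), the helper names `decode_spans` and `is_candy_speech`, and the choice to treat an orphan `I-` tag as the start of a new span are all assumptions made for this example.

```python
# The ten category names come from the model description above; the label
# strings built from them (e.g. "B-compliment") are an assumed convention.
CATEGORIES = [
    "positive feedback", "compliment", "affection declaration",
    "encouragement", "gratitude", "agreement", "ambiguous",
    "implicit", "group membership", "sympathy",
]

# One B- and one I- tag per category, plus the outside tag "O".
LABELS = ["O"] + [f"{prefix}-{cat}" for cat in CATEGORIES for prefix in ("B", "I")]


def decode_spans(tags):
    """Turn a per-token BIO tag sequence into (start, end, category) spans.

    `end` is exclusive. An I- tag that does not continue an open span of
    the same category is treated as the start of a new span (assumption).
    """
    spans = []
    start, category = None, None
    for i, tag in enumerate(tags):
        prefix, _, cat = tag.partition("-")
        continues = prefix == "I" and cat == category
        # Close the open span when the current tag does not continue it.
        if start is not None and not continues:
            spans.append((start, i, category))
            start, category = None, None
        # Open a new span on B-, or on an I- tag with no open span.
        if prefix == "B" or (prefix == "I" and start is None):
            start, category = i, cat
    if start is not None:
        spans.append((start, len(tags), category))
    return spans


def is_candy_speech(tags):
    """Binary decision: positive if at least one span was detected."""
    return len(decode_spans(tags)) > 0


example = ["O", "B-compliment", "I-compliment", "O", "B-gratitude"]
print(decode_spans(example))
print(is_candy_speech(example))
```

With ten categories, `LABELS` contains 21 entries, which matches the size of the classification layer listed under Training Procedure.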

## Intended Uses

- **Research:** Analysis of positive/supportive communication in German social media.
- **Applications:** Social media analytics, conversational AI safety (mitigating sycophancy), computational social science.
- **Not for:** Deployments without fairness/robustness testing on out-of-domain data.

## Performance

- **Dataset:** 46k German YouTube comments, annotated with candy speech spans.
- **Data Split:** 37,057 comments (train) and 9,229 comments (test).
- **Shared Task Results:**

  - **Subtask 1 (binary detection):** Positive F1 = **0.891** (ranked 1st)
  - **Subtask 2 (span detection):** Strict F1 = **0.631** (ranked 1st)
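As a rough illustration of how a strict span-level F1 score can be computed, the sketch below counts a predicted span as correct only if its comment, boundaries, and category all match a gold span exactly. That reading of "strict" is an assumption, and the official shared task scorer may differ in detail; the function name `strict_f1` and the tuple layout are hypothetical.

```python
def strict_f1(gold, predicted):
    """Strict span-level F1 under an exact-match assumption.

    `gold` and `predicted` are iterables of
    (comment_id, start, end, category) tuples. A prediction only counts
    as a true positive if all four fields match a gold span.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


gold = [
    (1, 0, 5, "compliment"),
    (1, 10, 20, "gratitude"),
    (2, 3, 8, "agreement"),
]
predicted = [
    (1, 0, 5, "compliment"),
    (1, 10, 19, "gratitude"),  # boundary off by one: no credit under strict matching
    (2, 3, 8, "agreement"),
]
print(round(strict_f1(gold, predicted), 3))
```

In this toy example two of three predictions match exactly, so precision and recall are both 2/3.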

## Training Procedure

- **Architecture:** XLM-RoBERTa-Large + linear classification layer (BIO tagging, 21 labels including “O”).
- **Optimizer:** AdamW
- **Learning Rate:** Peak of 2e-5, with a 500-step warmup followed by linear decay.
- **Epochs:** 20 (with early stopping).
- **Batch Size:** 32
- **Regularization:** Dropout (0.1), weight decay (0.01), gradient clipping (L2 norm 1.0).
- **Postprocessing:** BIO tag correction and subword alignment.
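The postprocessing step can be pictured with the following sketch. It is not the authors' implementation: it assumes that subword alignment keeps the tag of each word's first subword, and that BIO correction rewrites an `I-` tag as `B-` whenever it does not continue a span of the same category. The `word_ids` list has the shape returned by `BatchEncoding.word_ids()` of a Hugging Face fast tokenizer (`None` for special tokens); the helper names are hypothetical.

```python
def first_subword_tags(word_ids, subword_tags):
    """Collapse subword-level tags to word level (first-subword strategy).

    `word_ids` maps each subword to its word index, with None for special
    tokens. Keeping only the first subword's tag is an assumed strategy.
    """
    word_tags = []
    seen = set()
    for word_id, tag in zip(word_ids, subword_tags):
        if word_id is None or word_id in seen:
            continue
        seen.add(word_id)
        word_tags.append(tag)
    return word_tags


def repair_bio(tags):
    """Rewrite orphan I- tags as B- tags so every span starts with B-."""
    repaired = []
    previous = "O"
    for tag in tags:
        if tag.startswith("I-"):
            category = tag[2:]
            # An I- tag is valid only directly after B-/I- of the same category.
            if previous not in (f"B-{category}", f"I-{category}"):
                tag = f"B-{category}"
        repaired.append(tag)
        previous = tag
    return repaired


word_ids = [None, 0, 0, 1, 2, 2, None]
subword_tags = ["O", "I-gratitude", "I-gratitude", "I-gratitude", "O", "B-compliment", "O"]

word_tags = first_subword_tags(word_ids, subword_tags)
print(word_tags)
print(repair_bio(word_tags))
```

Repairing after alignment, rather than before, means the correction operates on the same word-level sequence that is later converted into spans.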

## Limitations

- **Domain Specificity:** Trained only on German YouTube comments; performance may degrade on other platforms, genres, or languages.
- **Overlapping Spans:** The model assigns a single BIO tag per token and therefore cannot represent overlapping spans; these were rare (<2%) in the training data.
- **Biases:** May reflect biases present in the dataset (e.g., demographic skews in YouTube communities).
- **Generalization:** Needs evaluation before deployment in real-world moderation systems.

## Ethical Considerations

- **Positive speech detection** is less studied than toxic speech detection, and automatic labeling of “supportiveness” may reinforce cultural biases about what counts as “positive.”
- Model predictions should be complemented with **human-in-the-loop moderation** to avoid misuse.

## Citation

If you use this model, please cite:

```bibtex
@inproceedings{thelen-etal-2025-aixcellent,
    title = "{AI}xcellent Vibes at {G}erm{E}val 2025 Shared Task on Candy Speech Detection: Improving Model Performance by Span-Level Training",
    author = "Thelen, Christian Rene  and
      Blaneck, Patrick Gustav  and
      Bornheim, Tobias  and
      Grieger, Niklas  and
      Bialonski, Stephan",
    editor = "Wartena, Christian  and
      Heid, Ulrich",
    booktitle = "Proceedings of the 21st Conference on Natural Language Processing (KONVENS 2025): Workshops",
    month = sep,
    year = "2025",
    address = "Hannover, Germany",
    publisher = "HsH Applied Academics",
    url = "https://aclanthology.org/2025.konvens-2.33/",
    pages = "398--403"
}
```