Update README.md
README.md
---
license: apache-2.0
---
# Multilingual Intent Classifier: Language Switching

This model is a fine-tuned multilingual BERT (`bert-base-multilingual-cased`) for **intent classification** of language-switching requests.
It recognizes when a user wants to change the conversation language and assigns one of 7 labels (five supported languages plus two fallback classes):

- `english`
- `italian`
- `german`
- `spanish`
- `french`
- `other` (generic sentences not related to language switching)
- `not_allowed` (unsupported languages)

---

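The label set listed above can be turned into application behavior in many ways. The following is a minimal sketch of one option, not part of the model itself: the `LABEL_TO_LANG_CODE` mapping, the ISO 639-1 codes, the `resolve_language_switch` helper, and the notice text are all hypothetical names chosen for illustration.

```python
from typing import Optional, Tuple

# Language labels documented in this model card.
# The ISO 639-1 codes are an assumption made for this example.
LABEL_TO_LANG_CODE = {
    "english": "en",
    "italian": "it",
    "german": "de",
    "spanish": "es",
    "french": "fr",
}


def resolve_language_switch(label: str, current_lang: str) -> Tuple[str, Optional[str]]:
    """Return (language code to use next, optional notice for the user)."""
    if label in LABEL_TO_LANG_CODE:
        # A supported language was requested: switch to it.
        return LABEL_TO_LANG_CODE[label], None
    if label == "not_allowed":
        # An unsupported language was requested: stay put and tell the user.
        return current_lang, "Requested language is not supported."
    # "other" (or any unexpected label): not a switch request, keep the language.
    return current_lang, None
```

For example, `resolve_language_switch("german", "en")` switches to `"de"`, while `"other"` and `"not_allowed"` both leave the current language unchanged.
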
## Training Data

- ~6,000 training examples
- Short conversational sentences (e.g. "Can we switch to English?", "Vorrei parlare in italiano", "Nein, bitte auf Deutsch")
- Languages covered: English, Italian, German, Spanish, French
- `not_allowed` and `other` provide robustness for real-world inputs
- 2/3 conversation steps

---

## Usage with 🤗 Transformers

You can use the model directly with the `pipeline` API:

```python
from transformers import pipeline

# Replace with the actual model repo
model_name = "software-si/change-language-intent"

classifier = pipeline(
    task="text-classification",
    model=model_name,
    tokenizer=model_name,
    top_k=None,  # return scores for all labels (replaces the deprecated return_all_scores=True)
)

texts = [
    "Vorrei parlare in italiano",
    "Can we switch to English?",
    "Nein, bitte auf Deutsch",
]

# For a list of inputs, the pipeline returns one list of {label, score} dicts per text
results = classifier(texts)

for text, res in zip(texts, results):
    print(f"\nInput: {text}")
    for r in res:
        print(f"  {r['label']}: {r['score']:.4f}")
```
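
The per-label scores printed above usually need to be reduced to a single decision. The sketch below shows one possible way to do that, assuming each result is a list of `{"label", "score"}` dicts as returned by the text-classification pipeline. The `pick_intent` helper, the `0.5` threshold, and the fallback to `other` are illustrative assumptions, not part of the model, and the scores in `example` are hand-written rather than real model output.

```python
from typing import List


def pick_intent(scores: List[dict], threshold: float = 0.5) -> str:
    """Return the highest-scoring label, or "other" if its score is below threshold.

    `scores` is expected to look like one pipeline result:
    [{"label": "italian", "score": 0.91}, ...]
    """
    best = max(scores, key=lambda item: item["score"])
    # Falling back to "other" on low confidence is a design assumption.
    return best["label"] if best["score"] >= threshold else "other"


# Hand-written scores in the pipeline's output shape (not real model output).
example = [
    {"label": "english", "score": 0.03},
    {"label": "italian", "score": 0.91},
    {"label": "other", "score": 0.06},
]

print(pick_intent(example))
```

A threshold like this trades missed switches against accidental ones, so a suitable value should be tuned on held-out conversations rather than taken from this example.
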