LocalDoc/language_detection


vrashad committed on May 6, 2024

Commit 3e3c87e (verified) · Parent: dd4e2f3

Update README.md

Files changed (1)

README.md +31 -2

@@ -42,11 +42,11 @@ pip install transformers
 ```
 ```python
-from transformers import AutoModelForSequenceClassification, AutoTokenizer
+from transformers import AutoModelForSequenceClassification, XLMRobertaTokenizer
 import torch
 # Load tokenizer and model
-tokenizer = AutoTokenizer.from_pretrained("LocalDoc/language_detection")
+tokenizer = XLMRobertaTokenizer.from_pretrained("LocalDoc/language_detection")
 model = AutoModelForSequenceClassification.from_pretrained("LocalDoc/language_detection")
 # Prepare text
@@ -67,6 +67,35 @@ predicted_label = labels[predicted_class_index]
 print(f"Predicted Language: {predicted_label}")
 ```
+## Language Label Information
+The model outputs a label for each prediction, corresponding to one of the languages listed below. Each label is associated with a specific language code as detailed in the following table:
+| Label | Language Code | Language Name |
+|-------|---------------|---------------|
+| 0     | az            | Azerbaijani   |
+| 1     | ar            | Arabic        |
+| 2     | bg            | Bulgarian     |
+| 3     | de            | German        |
+| 4     | el            | Greek         |
+| 5     | en            | English       |
+| 6     | es            | Spanish       |
+| 7     | fr            | French        |
+| 8     | hi            | Hindi         |
+| 9     | it            | Italian       |
+| 10    | ja            | Japanese      |
+| 11    | nl            | Dutch         |
+| 12    | pl            | Polish        |
+| 13    | pt            | Portuguese    |
+| 14    | ru            | Russian       |
+| 15    | sw            | Swahili       |
+| 16    | th            | Thai          |
+| 17    | tr            | Turkish       |
+| 18    | ur            | Urdu          |
+| 19    | vi            | Vietnamese    |
+| 20    | zh            | Chinese       |
+This mapping is utilized to decode the model's predictions into understandable language names, facilitating the interpretation of results for further processing or analysis.
 Training Performance
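
The label table added in this commit maps each class index to a language code and name. The README snippet indexes a `labels` list with `predicted_class_index`, but the diff does not show how that list is defined, so the following is only a minimal sketch of one way to build the lookup from the table. The names `LABELS`, `decode_label`, and `decode_scores` are hypothetical and are not part of the model repository. The sketch uses plain Python lists in place of model output so that it runs without `transformers` or `torch`.

```python
from typing import Sequence, Tuple

# Index -> (language code, language name), copied from the table in the diff.
# LABELS is a hypothetical name, not something the repository defines.
LABELS = [
    ("az", "Azerbaijani"), ("ar", "Arabic"), ("bg", "Bulgarian"),
    ("de", "German"), ("el", "Greek"), ("en", "English"),
    ("es", "Spanish"), ("fr", "French"), ("hi", "Hindi"),
    ("it", "Italian"), ("ja", "Japanese"), ("nl", "Dutch"),
    ("pl", "Polish"), ("pt", "Portuguese"), ("ru", "Russian"),
    ("sw", "Swahili"), ("th", "Thai"), ("tr", "Turkish"),
    ("ur", "Urdu"), ("vi", "Vietnamese"), ("zh", "Chinese"),
]


def decode_label(index: int) -> Tuple[str, str]:
    """Return (code, name) for a class index, rejecting out-of-range values."""
    if not 0 <= index < len(LABELS):
        raise ValueError(f"label index {index} is outside 0..{len(LABELS) - 1}")
    return LABELS[index]


def decode_scores(scores: Sequence[float]) -> Tuple[str, str]:
    """Pick the highest-scoring class and decode it.

    `scores` stands in for one row of model logits converted to a list;
    it must contain exactly one score per label.
    """
    if len(scores) != len(LABELS):
        raise ValueError("expected one score per label")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return decode_label(best_index)


if __name__ == "__main__":
    # Toy scores with the maximum at index 5, which the table lists as English.
    toy_scores = [0.0] * len(LABELS)
    toy_scores[5] = 3.2
    code, name = decode_scores(toy_scores)
    print(f"Predicted Language: {name} ({code})")
```

With real model output, the index would come from the argmax over the logits, as the `predicted_class_index` variable in the README snippet suggests. The range check is a design choice for this sketch: it surfaces a mismatch between the table and the model's number of classes instead of failing with an opaque `IndexError`.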