radoslavralev committed
Commit 913ca02 · verified · 1 Parent(s): 990e9bb

Add new SentenceTransformer model
README.md CHANGED
@@ -13,7 +13,7 @@ tags:
 - reranking
 - generated_from_trainer
 - loss:ArcFaceInBatchLoss
-base_model: thenlper/gte-small
+base_model: sentence-transformers/all-MiniLM-L6-v2
 pipeline_tag: sentence-similarity
 library_name: sentence-transformers
 metrics:
@@ -36,41 +36,41 @@ model-index:
       type: test
     metrics:
     - type: cosine_accuracy@1
-      value: 0.548650317572336
+      value: 0.5474394601032155
       name: Cosine Accuracy@1
     - type: cosine_precision@1
-      value: 0.548650317572336
+      value: 0.5474394601032155
       name: Cosine Precision@1
     - type: cosine_recall@1
-      value: 0.529780177773297
+      value: 0.5284894589479743
       name: Cosine Recall@1
     - type: cosine_ndcg@10
-      value: 0.7467559051152127
+      value: 0.7464232866184599
       name: Cosine Ndcg@10
     - type: cosine_mrr@1
-      value: 0.548650317572336
+      value: 0.5474394601032155
       name: Cosine Mrr@1
     - type: cosine_map@100
-      value: 0.691192638604471
+      value: 0.6905199963377163
       name: Cosine Map@100
     - type: cosine_auc_precision_cache_hit_ratio
-      value: 0.31983377806645374
+      value: 0.31524254043885996
       name: Cosine Auc Precision Cache Hit Ratio
     - type: cosine_auc_similarity_distribution
-      value: 0.15293509382911363
+      value: 0.16089488030492544
       name: Cosine Auc Similarity Distribution
 ---
 
 # Redis fine-tuned BiEncoder model for semantic caching on LangCache
 
-This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [thenlper/gte-small](https://huggingface.co/thenlper/gte-small). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for sentence pair similarity.
+This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for sentence pair similarity.
 
 ## Model Details
 
 ### Model Description
 - **Model Type:** Sentence Transformer
-- **Base model:** [thenlper/gte-small](https://huggingface.co/thenlper/gte-small) <!-- at revision 17e1f347d17fe144873b1201da91788898c639cd -->
-- **Maximum Sequence Length:** 64 tokens
+- **Base model:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) <!-- at revision c9745ed1d9f207416be6d2e6f8de32d1f16199bf -->
+- **Maximum Sequence Length:** 128 tokens
 - **Output Dimensionality:** 384 dimensions
 - **Similarity Function:** Cosine Similarity
 <!-- - **Training Dataset:** Unknown -->
@@ -87,7 +87,7 @@ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [t
 
 ```
 SentenceTransformer(
-  (0): Transformer({'max_seq_length': 64, 'do_lower_case': False, 'architecture': 'BertModel'})
+  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False, 'architecture': 'BertModel'})
   (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
   (2): Normalize()
 )
@@ -122,9 +122,9 @@ print(embeddings.shape)
 # Get the similarity scores for the embeddings
 similarities = model.similarity(embeddings, embeddings)
 print(similarities)
-# tensor([[0.9999, 0.9036, 0.7702],
-#         [0.9036, 1.0000, 0.7837],
-#         [0.7702, 0.7837, 1.0000]])
+# tensor([[1.0000, 0.6650, 0.1040],
+#         [0.6650, 1.0000, 0.1401],
+#         [0.1040, 0.1401, 0.9999]])
 ```
 
 <!--
@@ -158,18 +158,24 @@ You can finetune this model on your own dataset.
 #### Custom Information Retrieval
 
 * Dataset: `test`
-* Evaluated with <code>ir_evaluator.CustomInformationRetrievalEvaluator</code>
+* Evaluated with <code>ir_evaluator.CustomInformationRetrievalEvaluator</code> with these parameters:
+  ```json
+  {
+      "query_prompt": "query:",
+      "corpus_prompt": "query:"
+  }
+  ```
 
 | Metric                               | Value      |
 |:-------------------------------------|:-----------|
-| cosine_accuracy@1                    | 0.5487     |
-| cosine_precision@1                   | 0.5487     |
-| cosine_recall@1                      | 0.5298     |
-| **cosine_ndcg@10**                   | **0.7468** |
-| cosine_mrr@1                         | 0.5487     |
-| cosine_map@100                       | 0.6912     |
-| cosine_auc_precision_cache_hit_ratio | 0.3198     |
-| cosine_auc_similarity_distribution   | 0.1529     |
+| cosine_accuracy@1                    | 0.5474     |
+| cosine_precision@1                   | 0.5474     |
+| cosine_recall@1                      | 0.5285     |
+| **cosine_ndcg@10**                   | **0.7464** |
+| cosine_mrr@1                         | 0.5474     |
+| cosine_map@100                       | 0.6905     |
+| cosine_auc_precision_cache_hit_ratio | 0.3152     |
+| cosine_auc_similarity_distribution   | 0.1609     |
 
 <!--
 ## Bias, Risks and Limitations
@@ -189,13 +195,13 @@ You can finetune this model on your own dataset.
 #### Non-Default Hyperparameters
 
 - `eval_strategy`: steps
-- `per_device_train_batch_size`: 512
-- `per_device_eval_batch_size`: 512
+- `per_device_train_batch_size`: 64
+- `per_device_eval_batch_size`: 64
 - `weight_decay`: 0.001
 - `adam_beta2`: 0.98
 - `adam_epsilon`: 1e-06
 - `max_steps`: 100000
-- `warmup_ratio`: 0.05
+- `warmup_ratio`: 0.15
 - `bf16`: True
 - `load_best_model_at_end`: True
 - `ddp_find_unused_parameters`: False
@@ -211,8 +217,8 @@ You can finetune this model on your own dataset.
 - `do_predict`: False
 - `eval_strategy`: steps
 - `prediction_loss_only`: True
-- `per_device_train_batch_size`: 512
-- `per_device_eval_batch_size`: 512
+- `per_device_train_batch_size`: 64
+- `per_device_eval_batch_size`: 64
 - `per_gpu_train_batch_size`: None
 - `per_gpu_eval_batch_size`: None
 - `gradient_accumulation_steps`: 1
@@ -228,7 +234,7 @@ You can finetune this model on your own dataset.
 - `max_steps`: 100000
 - `lr_scheduler_type`: linear
 - `lr_scheduler_kwargs`: {}
-- `warmup_ratio`: 0.05
+- `warmup_ratio`: 0.15
 - `warmup_steps`: 0
 - `log_level`: passive
 - `log_level_replica`: warning
@@ -332,7 +338,7 @@ You can finetune this model on your own dataset.
 ### Training Logs
 | Epoch | Step | test_cosine_ndcg@10 |
 |:-----:|:----:|:-------------------:|
-| 0     | 0    | 0.7468              |
+| 0     | 0    | 0.7464              |
 
 
 ### Framework Versions
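The README changes above swap the base encoder from thenlper/gte-small to sentence-transformers/all-MiniLM-L6-v2 and double the sequence limit to 128 tokens, while the output stays a 384-dimensional normalized vector. A minimal usage sketch of the pattern the card describes follows; the repo id is a hypothetical placeholder, since the diff does not show this model's actual Hub id.

```python
# Minimal sketch of the card's usage pattern (sentence-transformers API).
# "your-org/langcache-biencoder" is a hypothetical placeholder repo id.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("your-org/langcache-biencoder")

sentences = [
    "How do I reset my password?",
    "What are the steps to change my password?",
    "What is the capital of France?",
]

# Embeddings are 384-dimensional and L2-normalized by the Normalize() module,
# so cosine similarity and dot product coincide.
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 384)

# Pairwise cosine similarities; for semantic caching, a high score between
# two queries indicates a likely cache hit.
similarities = model.similarity(embeddings, embeddings)
print(similarities)
```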
config.json CHANGED
@@ -4,7 +4,7 @@
   ],
   "attention_probs_dropout_prob": 0.1,
   "classifier_dropout": null,
-  "dtype": "bfloat16",
+  "dtype": "float32",
   "gradient_checkpointing": false,
   "hidden_act": "gelu",
   "hidden_dropout_prob": 0.1,
config_sentence_transformers.json CHANGED
@@ -1,10 +1,10 @@
 {
-  "model_type": "SentenceTransformer",
   "__version__": {
     "sentence_transformers": "5.1.1",
     "transformers": "4.57.0",
     "pytorch": "2.8.0+cu128"
   },
+  "model_type": "SentenceTransformer",
   "prompts": {
     "query": "",
     "document": ""
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:09112f218af03ce978d0cf802724301336d86633f183d28bbdc548c9ab7a6e01
-size 45437864
+oid sha256:9256db3f3e9170f5e60d958aa67da5f2a6a71e45a24165c8dd916f78af687726
+size 90864192
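The new file is almost exactly twice the old one, which is what the `dtype` change in config.json predicts: bfloat16 stores 2 bytes per parameter and float32 stores 4, over an encoder of roughly 22.7M parameters. A back-of-envelope check:

```python
# Check that the size change is explained by bf16 -> fp32 storage alone.
old_size = 45_437_864  # bytes: ~22.7M params * 2 bytes (bfloat16) + header
new_size = 90_864_192  # bytes: ~22.7M params * 4 bytes (float32)  + header

print(new_size / old_size)  # ~2.00 -> bytes per weight doubled
print(new_size // 4)        # ~22.7M -> parameter count implied by the fp32 file
```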
sentence_bert_config.json CHANGED
@@ -1,4 +1,4 @@
 {
-  "max_seq_length": 64,
+  "max_seq_length": 128,
   "do_lower_case": false
 }
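Doubling `max_seq_length` means inputs are now truncated at 128 tokens rather than 64. The limit is also exposed as a mutable attribute on the loaded model; a small sketch, reusing `model` from the first example:

```python
# Sketch: inspect or override the truncation limit at runtime.
print(model.max_seq_length)  # 128 after this change; longer inputs are truncated
model.max_seq_length = 64    # can be lowered again to trade context for speed
```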