Add `text-embeddings-inference` tag & snippet (#40)
- Add `text-embeddings-inference` tag & snippet (a96370dbfcd3f5d1bd2019a619869da998bc0cd9)
- Fix typos in `README.md` (1becb5be0162de5536342bdd63ca3da088e5a928)
- embeddings models -> embedding models (efb1033715c788d7c26c7597eecd94a1af868ca8)
Co-authored-by: Alvaro Bartolome <alvarobartt@users.noreply.huggingface.co>
    	
README.md CHANGED

@@ -7,6 +7,7 @@ tags:
 - feature-extraction
 - sentence-similarity
 - transformers
+- text-embeddings-inference
 datasets:
 - s2orc
 - flax-sentence-embeddings/stackexchange_xml
@@ -92,6 +93,32 @@ print("Sentence embeddings:")
 print(sentence_embeddings)
 ```
 
+## Usage (Text Embeddings Inference (TEI))
+
+[Text Embeddings Inference (TEI)](https://github.com/huggingface/text-embeddings-inference) is a blazing fast inference solution for text embedding models.
+
+- CPU:
+```bash
+docker run -p 8080:80 -v hf_cache:/data --pull always ghcr.io/huggingface/text-embeddings-inference:cpu-latest --model-id sentence-transformers/all-mpnet-base-v2 --pooling mean --dtype float16
+```
+
+- NVIDIA GPU:
+```bash
+docker run --gpus all -p 8080:80 -v hf_cache:/data --pull always ghcr.io/huggingface/text-embeddings-inference:cuda-latest --model-id sentence-transformers/all-mpnet-base-v2 --pooling mean --dtype float16
+```
+
+Send a request to `/v1/embeddings` to generate embeddings via the [OpenAI Embeddings API](https://platform.openai.com/docs/api-reference/embeddings/create):
+```bash
+curl http://localhost:8080/v1/embeddings \
+  -H 'Content-Type: application/json' \
+  -d '{
+    "model": "sentence-transformers/all-mpnet-base-v2",
+    "input": ["This is an example sentence", "Each sentence is converted"]
+  }'
+```
+
+Or check the [Text Embeddings Inference API specification](https://huggingface.github.io/text-embeddings-inference/) instead.
+
 ------
 
 ## Background
@@ -100,14 +127,14 @@ The project aims to train sentence embedding models on very large sentence level
 contrastive learning objective. We used the pretrained [`microsoft/mpnet-base`](https://huggingface.co/microsoft/mpnet-base) model and fine-tuned in on a 
 1B sentence pairs dataset. We use a contrastive learning objective: given a sentence from the pair, the model should predict which out of a set of randomly sampled other sentences, was actually paired with it in our dataset.
 
-We developped this model during the 
+We developed this model during the 
 [Community week using JAX/Flax for NLP & CV](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/7104), 
-organized by Hugging Face. We developped this model as part of the project:
+organized by Hugging Face. We developed this model as part of the project:
 [Train the Best Sentence Embedding Model Ever with 1B Training Pairs](https://discuss.huggingface.co/t/train-the-best-sentence-embedding-model-ever-with-1b-training-pairs/7354). We benefited from efficient hardware infrastructure to run the project: 7 TPUs v3-8, as well as intervention from Googles Flax, JAX, and Cloud team member about efficient deep learning frameworks.
 
 ## Intended uses
 
-Our model is intented to be used as a sentence and short paragraph encoder. Given an input text, it ouptuts a vector which captures 
+Our model is intented to be used as a sentence and short paragraph encoder. Given an input text, it outputs a vector which captures 
 the semantic information. The sentence vector may be used for information retrieval, clustering or sentence similarity tasks.
 
 By default, input text longer than 384 word pieces is truncated.
@@ -126,7 +153,7 @@ We then apply the cross entropy loss by comparing with true pairs.
 
 #### Hyper parameters
 
-We trained ou model on a TPU v3-8. We train the model during 100k steps using a batch size of 1024 (128 per TPU core).
+We trained our model on a TPU v3-8. We train the model during 100k steps using a batch size of 1024 (128 per TPU core).
 We use a learning rate warm up of 500. The sequence length was limited to 128 tokens. We used the AdamW optimizer with
 a 2e-5 learning rate. The full training script is accessible in this current repository: `train_script.py`.
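The snippet added in this diff exposes TEI's OpenAI-compatible `/v1/embeddings` route via `curl`. For readers consuming the endpoint from Python, a minimal sketch of the same request follows; it assumes a TEI container from one of the `docker run` lines above is already listening on `localhost:8080`, and relies only on `requests` and the OpenAI embeddings response shape (`data[i].embedding`):

```python
import requests

# Assumes a TEI container (CPU or GPU variant from the diff above) is
# already serving sentence-transformers/all-mpnet-base-v2 on port 8080.
response = requests.post(
    "http://localhost:8080/v1/embeddings",
    json={
        "model": "sentence-transformers/all-mpnet-base-v2",
        "input": ["This is an example sentence", "Each sentence is converted"],
    },
    timeout=30,
)
response.raise_for_status()

# OpenAI-style payload: one {"embedding": [...], "index": i} entry per input.
embeddings = [item["embedding"] for item in response.json()["data"]]
print(len(embeddings), len(embeddings[0]))  # expect 2 vectors of 768 floats
```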
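The Background section of the card describes the training objective only in prose: given one sentence of a pair, the model must pick the true partner out of the other sentences, scored by cosine similarity and cross entropy against the true pairs. A minimal PyTorch sketch of that in-batch contrastive loss is below; the function name and the `scale` temperature of 20.0 are illustrative assumptions, and the authoritative implementation is `train_script.py` in the repository.

```python
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(anchors: torch.Tensor, positives: torch.Tensor,
                              scale: float = 20.0) -> torch.Tensor:
    """Each anchor must identify its true pair among all positives in the
    batch (1024 candidates at the stated batch size). `scale` is an
    illustrative temperature, not a value taken from the model card."""
    anchors = F.normalize(anchors, dim=-1)
    positives = F.normalize(positives, dim=-1)
    scores = anchors @ positives.T * scale   # (batch, batch) cosine similarities
    labels = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, labels)   # true pair sits on the diagonal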
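```

The Intended uses paragraph names information retrieval, clustering, and sentence similarity as downstream tasks. As an illustration of the last one, the vectors from the card's existing `sentence-transformers` snippet can be compared with cosine similarity using the library's `util.cos_sim` helper (keeping in mind the card's caveat that inputs longer than 384 word pieces are truncated):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")
embeddings = model.encode(["This is an example sentence",
                           "Each sentence is converted"])

# Cosine similarity in [-1, 1]; higher means more semantically similar.
print(util.cos_sim(embeddings[0], embeddings[1]))
```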
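Finally, the Hyper parameters paragraph amounts to a concrete optimizer configuration: AdamW at a 2e-5 learning rate, 500 warm-up steps, 100k total steps. A sketch under stated assumptions follows; loading `microsoft/mpnet-base` is a stand-in for the encoder actually being fine-tuned, and the linear decay after warm-up is an assumed schedule shape, since the card specifies only the warm-up (see `train_script.py` for the real setup).

```python
import torch
from transformers import AutoModel, get_linear_schedule_with_warmup

# Stand-in for the encoder under fine-tuning (the card's stated base model).
model = AutoModel.from_pretrained("microsoft/mpnet-base")

# Values from the card, except the post-warm-up linear decay, which is assumed.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=500,
    num_training_steps=100_000,  # 100k steps at batch size 1024
)
```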

 
		