How to export mBART-50 to ONNX and set the target language?
Hi all,
I have just exported an mBART-50 model to ONNX format using the following script:
CONVERSION TO ONNX ------------------
import torch
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast
model_path = "./mbart_large_50_model"
model = MBartForConditionalGeneration.from_pretrained(model_path)
tokenizer = MBart50TokenizerFast.from_pretrained(model_path)
input_text = "This is just simple text"
inputs = tokenizer(input_text, return_tensors="pt")
onnx_path = "./mbart_large_50.onnx"
# export to ONNX format
torch.onnx.export(
    model,  # PyTorch model
    (inputs["input_ids"],),  # input data
    onnx_path,  # path to the ONNX file
    input_names=["input_ids"],  # input tensor names
    output_names=["logits"],  # output tensor names
    dynamic_axes={"input_ids": {0: "batch", 1: "sequence"}},
    opset_version=14,
)
-----------------END OF CONVERSION-----------------
- The export produced a file, but I am not sure whether I missed something.
- With the original PyTorch model (not the exported ONNX one), there is a way to specify the source and target language for translation:
....
model = MBartForConditionalGeneration.from_pretrained(model_path)
tokenizer = MBart50TokenizerFast.from_pretrained(model_path)
# specify source language
tokenizer.src_lang = "en_XX"
...
generated_tokens = model.generate(**inputs, forced_bos_token_id=tokenizer.lang_code_to_id["hr_HR"])
- Note this argument in particular:
forced_bos_token_id=tokenizer.lang_code_to_id["hr_HR"]
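As far as I understand it (this is my reading, so please correct me if it is wrong), forced_bos_token_id only acts on the first decoding step: generate() starts the decoder from its start token, forces the first generated token to be the target language code, and then continues decoding normally. A toy sketch of that idea in plain Python, where `toy_scores` is a made-up stand-in for the real decoder and all token ids are invented for illustration:

```python
from typing import Callable, List, Optional


def greedy_decode(
    next_token_scores: Callable[[List[int]], List[float]],
    decoder_start_token_id: int,
    eos_token_id: int,
    forced_bos_token_id: Optional[int] = None,
    max_new_tokens: int = 20,
) -> List[int]:
    """Toy greedy loop; next_token_scores is a hypothetical stand-in for the decoder."""
    tokens = [decoder_start_token_id]
    for step in range(max_new_tokens):
        scores = list(next_token_scores(tokens))
        if step == 0 and forced_bos_token_id is not None:
            # First step only: rule out every token except the forced one.
            scores = [float("-inf")] * len(scores)
            scores[forced_bos_token_id] = 0.0
        # Greedy choice: index of the highest score.
        next_id = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(next_id)
        if next_id == eos_token_id:
            break
    return tokens


def toy_scores(tokens: List[int]) -> List[float]:
    """Invented 8-token 'model': prefers last_token + 1, then eos (id 2) after id 6."""
    scores = [0.0] * 8
    last = tokens[-1]
    scores[2 if last >= 6 else last + 1] = 1.0
    return scores


# Invented ids: 2 = start/eos token, 5 = pretend target language code.
forced = greedy_decode(toy_scores, 2, 2, forced_bos_token_id=5)
unforced = greedy_decode(toy_scores, 2, 2)
```

In this toy setup only the first generated token differs between the two calls; everything after it is chosen by the model as usual.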
- Is there a way to do the same with the exported ONNX model, and if so, how?
TRIED SO FAR ------------------
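import numpy as np
from transformers import MBart50TokenizerFast
tokenizer_path = "./mbart_large_50_model"  # assumed: same directory as model_path in the export script
tokenizer = MBart50TokenizerFast.from_pretrained(tokenizer_path)
tokenizer.src_lang = "en_XX"
input_text = "Hello, my name is BART"
inputs = tokenizer(input_text, return_tensors="np")
target_lang_id = tokenizer.lang_code_to_id["hr_HR"]
input_ids = np.array(inputs["input_ids"], dtype=np.int64)
# prepend the target language id to the encoder input
start_token = np.array([[target_lang_id]], dtype=np.int64)
input_ids = np.concatenate((start_token, input_ids), axis=1)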
- As you can see, I am trying to put target_lang_id as the first entry of the input,
but this way only the first word is translated, not the rest.
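What I suspect I need instead is to leave input_ids untouched, pass the language code on the decoder side, and re-run the graph once per generated token. A rough, untested-against-the-real-model sketch of that loop, assuming the model were re-exported so that it also accepts decoder_input_ids (my export above does not), and assuming `run_logits` is a hypothetical wrapper around the ONNX Runtime session call that returns logits of shape [batch, target_len, vocab]. The eos id of 2 is what I believe mBART uses; it should be read from the tokenizer or config rather than hard-coded:

```python
import numpy as np


def translate_greedy(run_logits, input_ids, target_lang_id,
                     eos_token_id=2, max_new_tokens=64):
    """Greedy decoding over a hypothetical run_logits(input_ids, decoder_input_ids).

    run_logits is assumed to return an array of shape [batch, target_len, vocab].
    """
    # Decoder prefix: start token (assumed to be eos) followed by the target language code.
    decoder_input_ids = np.array([[eos_token_id, target_lang_id]], dtype=np.int64)
    for _ in range(max_new_tokens):
        logits = run_logits(input_ids, decoder_input_ids)
        # Pick the most likely token at the last decoder position.
        next_id = int(np.argmax(logits[0, -1]))
        decoder_input_ids = np.concatenate(
            [decoder_input_ids, np.array([[next_id]], dtype=np.int64)], axis=1
        )
        if next_id == eos_token_id:
            break
    return decoder_input_ids[0]


def fake_run_logits(input_ids, decoder_input_ids):
    """Invented 10-token stand-in for the ONNX session: prefers last + 1, then eos after 9."""
    logits = np.zeros((1, decoder_input_ids.shape[1], 10), dtype=np.float32)
    last = int(decoder_input_ids[0, -1])
    logits[0, -1, 2 if last >= 9 else last + 1] = 1.0
    return logits


# Invented ids for the demo: 7 plays the role of the target language code.
demo_ids = translate_greedy(fake_run_logits, np.array([[4, 5, 2]], dtype=np.int64), 7)
```

With a real session, `run_logits` would presumably wrap something like `session.run(None, {...})[0]`, and the resulting ids would go through tokenizer.decode with skip_special_tokens=True. This naive loop re-runs the whole model at every step, so it is slow, but it would at least show whether the target language is being honored.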
-Any help?

