All OmniEmbed examples appear to be broken
Using transformers version 4.56.2 and torch version 2.8.0 the examples for OmniEmbed from the model card seem to be broken. I am getting ´ValueError: Videos features and imagve tokens do not match: tokens: 0, features 16744´ for video retrieval and a similar error for the image document retrieval example.
Which version of transformers was originally used to demonstrate functionality? This may be related to known issues concerning the Qwen base-model, see https://github.com/QwenLM/Qwen3-VL/issues/556, but this is already known for a year so I am not sure.
Nevermind, found the issue.  texts = processor.apply_chat_template(message, tokenize=False, add_generation_prompt=True) now returns a string (maybe it returned a list in earlier versions?), so slicing it with [0] always makes texts equal to <<|endoftext|>" as only the first '<' from the original string is appended to the EOS string.