Model tags: Zero-Shot Image Classification · Transformers · Safetensors · siglip2 · vision

Not getting great semantic search results with this model

#5
by stephenmarsh - opened

Has anyone else had difficulty getting this model to yield decent results for semantic text searches of images? I put in the first 30,000 or so images from The Met's collection. So far my best result is "chair" giving me some chairs but also vases and tables.

A search for "horse" yields no horses, only seemingly random objects.

[Screenshot: search results for "horse" showing unrelated objects]

I've spent many hours sanity-checking, debugging, and regenerating the embeddings, but I can't get better results out of it.
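For context, my pipeline is roughly the sketch below (the checkpoint name and image paths are illustrative, not my exact setup). One detail I've seen flagged for SigLIP-family models: the text tower is trained with fixed-length padding, so tokenizing with `padding="max_length"` and L2-normalizing both embedding sets before taking dot products can matter a lot for retrieval quality.

```python
# Rough sketch of a SigLIP-style text-to-image retrieval check.
# Assumptions: checkpoint name and image paths below are placeholders.
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

ckpt = "google/siglip2-base-patch16-224"  # swap in the checkpoint you actually use
model = AutoModel.from_pretrained(ckpt)
processor = AutoProcessor.from_pretrained(ckpt)

# Embed the image collection (placeholder paths).
images = [Image.open(p).convert("RGB") for p in ["met_001.jpg", "met_002.jpg"]]
img_inputs = processor(images=images, return_tensors="pt")
with torch.no_grad():
    img_emb = model.get_image_features(**img_inputs)
img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)  # L2-normalize

# Embed the query. padding="max_length" matters: SigLIP text encoders
# are trained on fixed-length padded sequences.
txt_inputs = processor(text=["a photo of a horse"],
                       padding="max_length", return_tensors="pt")
with torch.no_grad():
    txt_emb = model.get_text_features(**txt_inputs)
txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)

# Cosine similarity of normalized vectors; rank images by this score.
scores = txt_emb @ img_emb.T
```

If the embeddings weren't normalized (or the text side was padded differently at index time vs. query time), I'd expect exactly the kind of near-random rankings I'm seeing.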

Is anyone else experiencing something similar, or does anyone have advice/guidance/wisdom on the situation?

Thanks in advance
