		Typo in IFM-TTE-7B results for ViDoRe-V2 under Visual Doc
@Hrant
Thanks for pointing this out! I've pinged the author who added this model here: https://huggingface.co/spaces/TIGER-Lab/MMEB-Leaderboard/discussions/69
Hi @haoyubu @ziyjiang,
IFM-TTE-7B demonstrates outstanding overall performance on the visdoc task. I noticed that its scores on several datasets are significantly higher than those of other models:
- ViDoRe_esg_reports_human_labeled_v2: +21%
- ViDoRe_esg_reports_v2_multilingual: +22%
- VisRAG_PlotQA: +19%
- ViDoSeek-page: +27%
- MMLongBench-page: +12%
I found that these datasets all contain many additional corpus-ids that do not appear in the qrels. However, according to the official evaluation script, these additional corpus-ids should also be added to the candidate set as negative samples. I want to confirm: did IFM-TTE-7B use only the corpus-ids from the qrels as the candidate set, without the additional corpus-ids from the corpus?
Hi @kekekeke, thanks for raising this! From the VLM2Vec/MMEB side, I can confirm that these additional corpus_ids are included in the candidate set during evaluation (https://github.com/TIGER-AI-Lab/VLM2Vec/blob/main/src/data/eval_dataset/vidore_dataset.py#L59). I think IFM-TTE-7B follows the same approach, but I'll leave the final confirmation to the authors of that paper.

 
						
