fixing hardcoded cuda() for cpu inference
#21
opened by alexgambashidze
Fixed the hardcoded cuda() calls so the model can run inference on CPU.
alexgambashidze changed pull request title from "Update modeling_deepseekocr.py" to "fixing hardcoded cuda() for cpu inference"
And it works on CPU. Is there a way to make it work on MPS?
These file changes still need to be merged, but for anyone looking for a quick workaround, you can replace the following lines in modeling_deepseekocr.py to get the model running on CPU:
Line 505:
 # inputs_embeds[idx].masked_scatter_(images_seq_mask[idx].unsqueeze(-1).cuda(), images_in_this_batch)
 inputs_embeds[idx].masked_scatter_(images_seq_mask[idx].unsqueeze(-1), images_in_this_batch)
Line 917:
output_ids = self.generate(
    # input_ids.unsqueeze(0).cuda(),
    # images=[(images_crop.cuda(), images_ori.cuda())],
    # images_seq_mask = images_seq_mask.unsqueeze(0).cuda(),
    input_ids.unsqueeze(0),
    images=[(images_crop, images_ori)],
    images_seq_mask = images_seq_mask.unsqueeze(0),
    ...
)
Line 960:
# outputs = tokenizer.decode(output_ids[0, input_ids.unsqueeze(0).cuda().shape[1]:])
outputs = tokenizer.decode(output_ids[0, input_ids.unsqueeze(0).shape[1]:])
Line 971:
# outputs = tokenizer.decode(output_ids[0, input_ids.unsqueeze(0).cuda().shape[1]:])
outputs = tokenizer.decode(output_ids[0, input_ids.unsqueeze(0).shape[1]:])
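On the MPS question: the same edits can be made device-agnostic instead of simply dropping .cuda(). A minimal, untested sketch (pick_device is a hypothetical helper, not part of the original file; it assumes a PyTorch build with MPS support) would resolve the device once and use .to(device) everywhere the file currently hardcodes .cuda():

import torch

# Sketch: choose the best available backend instead of hardcoding .cuda().
def pick_device() -> torch.device:
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()

# Inside modeling_deepseekocr.py the same idea would turn
#   input_ids.unsqueeze(0).cuda()  and  images_crop.cuda()
# into
#   input_ids.unsqueeze(0).to(device)  and  images_crop.to(device)
x = torch.randn(2, 3).to(device)  # any tensor then follows the chosen device
print(x.device)

Resolving the device once keeps the rest of the edits mechanical: every .cuda() in the file becomes .to(device).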
Did you make these corrections?
