Commit e6127a4
Parent: b8ee0a2
typo fix

utils/models.py CHANGED (+2 -2)
@@ -170,8 +170,8 @@ def run_inference(model_name, context, question, result_queue):
     # max_length=2048, # Keep original max_length for now
     # add_generation_prompt=True,
     # ).to(device)
-
-    result =
+    outputs = pipe(text_input, max_new_tokens=512)
+    result = outputs[0]['generated_text'][-1]['content']
     # # Ensure input does not exceed model max length after adding generation prompt
     # # This check might be redundant if tokenizer handles it, but good for safety
     # # if actual_input.shape[1] > tokenizer.model_max_length:
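For context: when a transformers text-generation pipeline receives chat-style message input, 'generated_text' in its output holds the full conversation as a list of role/content dicts, so the added line reads the last message's 'content' as the model reply. Below is a minimal sketch of how the changed lines might sit inside run_inference; the model name, prompt wording, and queue hand-off are illustrative assumptions, not taken from this repo.

# Minimal sketch, assuming a transformers chat pipeline; the pipeline setup,
# prompt wording, and queue hand-off are illustrative assumptions.
from multiprocessing import Queue

from transformers import pipeline


def run_inference(model_name, context, question, result_queue):
    # Hypothetical setup; the actual Space may build its pipeline elsewhere.
    pipe = pipeline("text-generation", model=model_name)
    text_input = [
        {"role": "user", "content": f"{context}\n\nQuestion: {question}"},
    ]
    # With chat-format input, 'generated_text' is the whole message list;
    # the final entry is the assistant reply, hence [-1]['content'].
    outputs = pipe(text_input, max_new_tokens=512)
    result = outputs[0]["generated_text"][-1]["content"]
    result_queue.put(result)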