Spaces: microsoft/llmlingua-2

Fix error when force_tokens includes multi-word sequence to preserve

by cornzz - opened Oct 16, 2024

base: refs/heads/main ← from: refs/pr/2


cornzz, Oct 16, 2024 (edited Oct 16, 2024)

Currently, an error occurs when `force_tokens` includes a sequence that contains spaces and that sequence appears in the prompt.
The cause is the statement `word, label = line.split(label_sep)`, where `line` is, for example, `The answer is 1` (here "The answer is" is the sequence in `force_tokens` and `1` is the corresponding label).
`line.split(label_sep)` returns `['The', 'answer', 'is', '1']`, which is too many values to unpack into `word, label`, so Python raises an error.

The fix is to split only at the first occurrence of `label_sep` from the right, so that everything before it stays together as `word`.
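
A minimal sketch of the failure and the fix described above. It assumes `label_sep` is a single space (inferred from the split result quoted above) and that "split at the first occurrence from the right" corresponds to Python's `str.rsplit` with `maxsplit=1`. The helper name `split_word_label` is hypothetical and is not taken from the repository.

```python
# Assumption: the separator is a single space, as the quoted split output suggests.
label_sep = " "

# Example line: the multi-word sequence "The answer is" followed by its label "1".
line = "The answer is 1"

# Original behaviour: splitting at every separator yields four fields,
# so unpacking into two names raises ValueError.
try:
    word, label = line.split(label_sep)
    error = None
except ValueError as exc:
    error = str(exc)


def split_word_label(line: str, label_sep: str) -> tuple[str, str]:
    """Hypothetical helper: split once, starting from the right.

    Everything before the last separator is kept together as the word,
    and the final field is the label.
    """
    word, label = line.rsplit(label_sep, 1)
    return word, label


word, label = split_word_label(line, label_sep)
print(repr(word), repr(label))
```

Splitting from the right relies on the label itself never containing `label_sep`, which holds for single-character labels such as `1` in the example.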

Fix error when force_tokens includes multi-word sequence to preserve (c07e7a4b)

qianhuiwu changed pull request status to merged Nov 8, 2024

qianhuiwu

Microsoft org Nov 8, 2024

Thanks. Merged the fix for the string split.
