Suggested improvement of eos logic in generate.py #180

Open
vvatter opened this issue Jun 11, 2024 · 0 comments

In the generate() function of generate.py, there is some curious XOR logic for updating the boolean is_finished vector:

        if eos_id is not None:
            is_finished = is_finished ^ (next_token == eos_id).cpu()

Even after it emits an eos token, Mistral tends to keep generating. When running large batches, the shortest response can hit eos and then generate a second eos, which flips its entry back to is_finished == False before the longest response has finished. This often keeps happening until max_tokens is reached. It seems to me that this should be an OR.
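To make the toggling concrete, here is a minimal pure-Python sketch for a single sequence in the batch. The per-step flags in `hit_eos` and the helper `run` are made up for illustration and are not taken from generate.py; they only mimic folding `(next_token == eos_id)` into `is_finished` one step at a time.

```python
# Hypothetical flags for ONE sequence: True at the steps where the sampled
# token equals eos_id. Here the sequence emits eos at steps 2 and 4.
hit_eos = [False, False, True, False, True, False]


def run(update):
    """Fold the per-step eos flags into is_finished using `update`."""
    is_finished = False
    history = []
    for hit in hit_eos:
        is_finished = update(is_finished, hit)
        history.append(is_finished)
    return history


# XOR mirrors the current update: a second eos flips the flag back to False.
xor_history = run(lambda done, hit: done ^ hit)
# OR mirrors the suggested update: once True, the flag stays True.
or_history = run(lambda done, hit: done | hit)

print(xor_history)  # [False, False, True, True, False, False]
print(or_history)   # [False, False, True, True, True, True]
```

With XOR, the second eos at step 4 marks the sequence as unfinished again, so a check that all sequences are finished would not fire at that point.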

Additionally, the current approach allows tokens that follow an EOS to be included in the outputs. Since the tokenizer decodes EOS as an empty string, this might contribute to confusing output sequences, and could potentially relate to the issues discussed in #149.

To address both issues, I suggest the following modifications to ensure that is_finished remains True after encountering an eos token and to not return tokens after this point.

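        if eos_id is not None:
            # OR keeps a sequence finished once it has emitted eos
            is_finished = is_finished | (next_token == eos_id).cpu()
            # zero out the sampled token for finished sequences ...
            next_token = next_token * (~is_finished).to(next_token.device)
            # ... and write eos_id in its place
            next_token = next_token + eos_id * is_finished.to(next_token.device)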
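The effect of this change on a whole batch can be sketched in plain Python, using lists in place of tensors. Everything here is hypothetical: `EOS_ID`, the sampled tokens in `steps`, and the function `generate_sketch` are invented for illustration, and the early exit once every sequence is finished is an assumption about how the surrounding loop uses is_finished.

```python
EOS_ID = 2  # assumed eos id, for illustration only

# Hypothetical sampled tokens: one inner list per decoding step,
# one entry per sequence in a batch of two.
steps = [[5, 7], [2, 8], [9, 2], [2, 4]]


def generate_sketch(steps, eos_id):
    batch_size = len(steps[0])
    is_finished = [False] * batch_size
    outputs = [[] for _ in range(batch_size)]
    for next_token in steps:
        # Sticky OR: a finished sequence stays finished.
        is_finished = [
            done or tok == eos_id for done, tok in zip(is_finished, next_token)
        ]
        # Overwrite whatever a finished sequence sampled with eos_id.
        next_token = [
            eos_id if done else tok for done, tok in zip(is_finished, next_token)
        ]
        for out, tok in zip(outputs, next_token):
            out.append(tok)
        # Assumed early exit once every sequence has finished.
        if all(is_finished):
            break
    return outputs


outputs = generate_sketch(steps, EOS_ID)
print(outputs)  # [[5, 2, 2], [7, 8, 2]]
```

In this run the first sequence samples 9 after its eos, but the overwrite replaces it with eos_id, and the loop stops at the third step because both flags are True.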