Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tokenizer skips the special tokens while decoding #162

Open
anandsarth opened this issue May 27, 2024 · 0 comments
Open

Tokenizer skips the special tokens while decoding #162

anandsarth opened this issue May 27, 2024 · 0 comments

Comments

@anandsarth
Copy link

anandsarth commented May 27, 2024

Now the mistral-7B-v3 has the tool support there are special tokens like [TOOL_CALL] [TOOL_RESULT]. But when I decode the output from the results the special tokens are not present and there is no argument while decode like in huggingface for skip_speical_tokens=False Therefore I am not about to know if the output is tool call or standard response. How can I decode the response from the output tokens

output_tokens = [5,1501, 7567,1629,2032,1113] # here token 5 is the special token about the tool call

#after decoding I get 
result = tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens)

# result [{"name": "get_current_weather", "arguments": {"location": "San Francisco, CA", "format": "celsius"}}]
#therefore I can't tell it is tool call
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant