Tokenizer: what function removes spaces between '<' and '>'?

Here is a detokenized sequence:

print(tokenizer.decode(generated_ids[0][input_ids.shape[-1]:], skip_special_tokens=False, clean_up_tokenization_spaces=False))

<tool_call>
{"arguments": {"symbol": "AAPL"}, "name": "get_stock_fundamentals"}
</tool_call><|im_end|>

As you can see, </tool_call> is rendered without spaces between the pieces. The original sequence of token ids is
700 6462 28730 2845 28767, which corresponds to
"</" "tool" "_" "call" ">"
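
For reference, this is roughly how I inspected the pieces (a minimal sketch; the model id below is just a placeholder for the tokenizer I'm using):

```python
from transformers import AutoTokenizer

# placeholder model id, not necessarily the exact model in question
tokenizer = AutoTokenizer.from_pretrained("some-org/some-model")

ids = [700, 6462, 28730, 2845, 28767]          # the token ids from above
tokens = tokenizer.convert_ids_to_tokens(ids)  # e.g. ['</', 'tool', '_', 'call', '>']
print(tokens)
print(tokenizer.convert_tokens_to_string(tokens))   # joins the pieces back into text
print(tokenizer.decode(ids, skip_special_tokens=False))
```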

Which Transformers function implements the removal of these spaces? How does it know that 'tool', '_' and 'call' are parts of one keyword?

Would appreciate your guidance.
