Bug in Offset generation for Rupee symbol

Hi, when I use the rupee symbol (₹) in a sentence, the tokenizer splits it into 3 tokens, but instead of the offsets (0, 1), (1, 2), (2, 3) it gives (0, 1), (0, 1), (0, 1). This causes a mismatch between the actual words and the generated labels. For example:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('distilroberta-base', add_prefix_space=True)

sent = "total amount that need to be paid is ₹ 500"
words = sent.split()

# Tokenize the pre-split words and request character offsets
output = tokenizer(words, is_split_into_words=True, return_offsets_mapping=True)
tokens = output.tokens()
offsets = output['offset_mapping']

for token, offset in zip(tokens, offsets):
    print(token, "----->", offset)

I am getting the following output:
<s> -----> (0, 0)
Ġtotal -----> (0, 5)
Ġamount -----> (0, 6)
Ġthat -----> (0, 4)
Ġneed -----> (0, 4)
Ġto -----> (0, 2)
Ġbe -----> (0, 2)
Ġpaid -----> (0, 4)
Ġis -----> (0, 2)
Ġâ -----> (0, 1)  # problem
Ĥ -----> (0, 1)  # problem
¹ -----> (0, 1)  # problem
Ġ500 -----> (0, 3)
</s> -----> (0, 0)

As you can see above, the rupee symbol is split into 3 different tokens, but the offset is still (0, 1) for all three of them.
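
For reference, here is a quick check that does not involve the tokenizer at all. It is only a sketch of one possible explanation, and it rests on two assumptions that I have not confirmed: that the tokenizer splits on UTF-8 bytes (byte-level BPE), and that the offsets count characters rather than bytes. Under those assumptions, the three tokens would be the three bytes of a single character, so all of them would point back to the same character span.

```python
# Assumption (not confirmed): offsets count characters within each word,
# while the byte-level tokenizer splits the symbol into its UTF-8 bytes.
symbol = "₹"

char_count = len(symbol)             # number of Unicode code points
utf8_bytes = symbol.encode("utf-8")  # bytes a byte-level tokenizer would see
byte_count = len(utf8_bytes)

print(char_count)        # 1
print(byte_count)        # 3
print(utf8_bytes.hex())  # e282b9
```

If this is what is happening, the symbol occupies only character positions 0 to 1 within its word, which would match the repeated (0, 1) in the output above.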