Difference in token count between Python and Rust tokenizers

Hi, I'm working on a project that uses an inference endpoint to embed long texts (close to my model's maximum token length). I am using intfloat/multilingual-e5-base, and the token count computed in my Python script, which calls the endpoint, does not match the count computed by the endpoint's tokenizer. Here is an example:

This text:
passage: Livre Blanc - How Will Robo-Advisors Reshape Asset Management.pdf \n Does the name Robo Advisor fi t the technology? 12 What is a Robo Advisor? 13 The Onboarding block 14 The Asset Allocation block 14 2 implementation modes 15 3 management modes 15 The main Benefi ts and Drawbacks 16 The Recipe for tomorrow's robo advisory services 17 The Interface: "Towards the end of Mobile Apps?" by Anthony Monteiro 19 Big Data: "When Big Data meets Robo Advisory" by Cecile Graff 20 Artifi cial Intelligence: "The intelligent Investor and Robo Advisors" by Gregory Rogival 21 PART. 2 MARKET INSIGHTS AND BENCHMARKING WHO DOES WHAT? 22 The cost issue and the need to scale up 24 The numerous development strategies 25 7 Partnerships/Acquisitions 25 Advertising and marketing 25 Raise capital 26 Other strategies 26 The main actors worldwide 27 North America 27 Europe 27 Asia 27 Rest of the world 28 What are the important trends to focus on? 29 Takeaway 29 PART. 3 QUALITATIVE INSIGHTS FROM A ROUND OF 13 ROBO ADVISORY SPECIALISTS INTERVIEWS 30 PART. 4 OUR NUMBERS BEYOND THE INTERNET QUANTITATIVE DATA FROM OUR INITIO SURVEY 56 PART. 5 OUR SPECIAL RESULTING INSIGHTS THE 10 COMMANDMENTS FOR A SUCCESSFUL ROBO ADVISOR 74 CONCLUSION 86 ANNEXES 88 Annex 1: Our commercial offer What Initio can do for you 90 Annex 2: The Conference 92 Annex 3: The full transcription of the 13 interviews 94 8 INTRODUCTION INTRODUCTION A 9 mong the trending concepts around digital transformation, robo advisors are and remain with blockchain, big data and artifi cial intelligence one of the hot and unmissable topics for 2018 2019. Either by easing access and execution for investors, by providing them with a low cost way to combine passive and active portfolio management or by being a new smart recommendations wise side kick to asset managers, robo advisors are bringing game changing digital services to the asset management industry Although they are not yet the next gen financial

In Python, using the tokenizers library, its size is 497 tokens. However, the inference endpoint returns this error: Input validation error: inputs must have less than 512 tokens. Given: 525

Is there a way to fix this? Or should I simply lower my maximum to 400 tokens to account for this difference?

PS: my Python code only uses Tokenizer.from_pretrained("intfloat/multilingual-e5-base") and len(tokenizer.encode(text).tokens).
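To make the fallback option concrete, here is a minimal sketch of the safety-margin workaround I have in mind. The names truncate_to_budget, SAFETY_MARGIN and whitespace_count are hypothetical. The 32-token margin is an assumption based on the 28-token gap in the example above (525 vs 497), and the whitespace counter is only a stand-in so the snippet runs without downloading the model. The sketch also assumes that the token count never decreases when words are appended, which is what lets it use a binary search.

```python
from typing import Callable

ENDPOINT_LIMIT = 512  # limit reported in the endpoint's validation error
SAFETY_MARGIN = 32    # assumption: headroom for the local/remote mismatch (28 observed)


def whitespace_count(text: str) -> int:
    """Stand-in token counter for demonstration only (splits on whitespace)."""
    return len(text.split())


def truncate_to_budget(
    text: str,
    count_tokens: Callable[[str], int],
    limit: int = ENDPOINT_LIMIT,
    margin: int = SAFETY_MARGIN,
) -> str:
    """Return the longest space-delimited prefix of `text` whose local
    token count is at most `limit - margin`."""
    budget = limit - margin
    if count_tokens(text) <= budget:
        return text

    words = text.split(" ")
    lo, hi = 0, len(words)
    # Binary search for the largest number of words that fits the budget.
    # Assumes count_tokens is non-decreasing as words are appended.
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if count_tokens(" ".join(words[:mid])) <= budget:
            lo = mid
        else:
            hi = mid - 1
    return " ".join(words[:lo])


# With the real tokenizer, the counter would wrap the call from my script:
#   from tokenizers import Tokenizer
#   tokenizer = Tokenizer.from_pretrained("intfloat/multilingual-e5-base")
#   count = lambda s: len(tokenizer.encode(s).ids)
#   safe_text = truncate_to_budget(text, count)
```

This only hides the mismatch rather than explaining it, so I would still prefer to understand why the two counts differ.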