I think your expectations are a little off. You are expecting one long location, but I think you should expect three. If I go to the HF website and test your sentence (translated to English), I get three locations.
Go to this link:
You will see the English version of your test sentence in the testing window, so click "Compute".
You will see that it returns each region in your sentence as a separate location entity.
I am guessing that the tagging in the PhoBERT training data is teaching RoBERTa to recognize locations this way. So, if you want it to recognize the whole thing as one entity, I think you will have to create your own data tagged as you expect and train from the RoBERTa model, just like the team that created PhoBERT did. Obviously this would take a lot of work and compute time.
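To make "tagged as you expect" a bit more concrete, here is a rough sketch of how the same sentence could be labelled either as three locations or as one long location, using the common BIO scheme. The sentence, the `LOC` label name, and the `bio_tags` helper are all made up for illustration; the tag set in your actual training data may differ.

```python
from typing import List, Tuple


def bio_tags(tokens: List[str], spans: List[Tuple[int, int, str]]) -> List[str]:
    """Build BIO tags from (start, end_exclusive, label) spans over token indices."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the same entity
    return tags


# Hypothetical example sentence, already split into tokens.
tokens = ["She", "lives", "in", "Ward", "5", ",", "District", "3", ",",
          "Ho", "Chi", "Minh", "City"]

# What the model seems to have learned: each region is its own entity.
separate = bio_tags(tokens, [(3, 5, "LOC"), (6, 8, "LOC"), (9, 13, "LOC")])

# What you would need in your own training data: one entity for the whole address.
single = bio_tags(tokens, [(3, 13, "LOC")])

for token, sep, one in zip(tokens, separate, single):
    print(f"{token:10} {sep:6} {one}")
```

The only difference between the two labelings is where the `B-` tags fall, so the model ends up predicting whatever boundaries your annotations consistently use.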