If I have legal text data and I want to summarize, so the new word which not present in pre-train_tokenizer (embading), obviously the the cousin semilarty will be zero.
Example
cousin(“judiciare”, “judgement”).
If I have legal text data and I want to summarize, so the new word which not present in pre-train_tokenizer (embading), obviously the the cousin semilarty will be zero.
Example
cousin(“judiciare”, “judgement”).