I would like to create embeddings for medium-sized paragraphs.
In all the examples I have found for Hugging Face feature-extraction models, only single sentences are passed to the tokenizers.
What is the best strategy to achieve this?
What are the recommended tokenization parameters?
Would you recommend any models in particular?
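In case it helps, here is the rough approach I am considering: split a paragraph into overlapping token windows that fit the model's maximum sequence length, embed each window, then average the window embeddings. The window/stride values and the stub embedder below are just placeholders I made up for illustration, not taken from any library:

```python
import numpy as np

def chunk_tokens(tokens, max_len=512, stride=128):
    """Split a token list into overlapping windows so that no
    window exceeds the model's maximum sequence length."""
    if len(tokens) <= max_len:
        return [tokens]
    windows = []
    step = max_len - stride  # consecutive windows overlap by `stride` tokens
    for start in range(0, len(tokens), step):
        windows.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break
    return windows

def embed_stub(tokens, dim=8):
    # Placeholder for a real encoder call; each token is mapped to a
    # fixed pseudo-random vector, then the vectors are mean-pooled.
    vecs = [np.random.default_rng(abs(hash(t)) % (2**32)).standard_normal(dim)
            for t in tokens]
    return np.mean(vecs, axis=0)

def embed_paragraph(text, max_len=512, stride=128):
    tokens = text.split()  # stand-in for a real subword tokenizer
    window_vecs = [embed_stub(w) for w in chunk_tokens(tokens, max_len, stride)]
    return np.mean(window_vecs, axis=0)  # average the per-window embeddings
```

Is this kind of windowing a reasonable strategy for paragraphs, or do some models handle longer inputs well enough that truncation alone is fine?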