AutoModel never runs with multiprocessing

Hi,

I’m trying to speed up prediction with an AutoModelForSequenceClassification model.

Here is the code I use

tokenizer = AutoTokenizer.from_pretrained("nlptown/bert-base-multilingual-uncased-sentiment")
model = AutoModelForSequenceClassification.from_pretrained("nlptown/bert-base-multilingual-uncased-sentiment")

def infer(sentence):
    payload = str(sentence.lower())
    seq = tokenizer(payload, return_tensors="pt")
    pred = np.argmax(np.array(model(**seq).logits.tolist()[0]))
    return pred

the infer function works fine when run alone, or in a Python map(infer, list_of_sentences)

However, this doesn’t run (hang forever), even with N = 1

import multiprocessing as mp

with mp.Pool(N) as pool:
    preds = list(pool.map(infer, sample))

What prevents Hugging Face Transformers models from being run in parallel from a multiprocessing pool.map function? (transformers 4.6.0)