Hi,
I'm trying to use MarianMT models for back-translation as data augmentation. However, it's too slow even when using multiple GPUs, and I can't use a batch size larger than 16 with max_length set to 300. Indeed, it takes one day to complete half an epoch.
The following is the code I'm using:
import numpy as np
import torch
import torch.nn as nn
from transformers import MarianMTModel, MarianTokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

target_langs = ['fr', 'wa', 'frp', 'oc', 'ca', 'rm', 'lld', 'fur', 'lij', 'lmo',
                'es', 'pt', 'gl', 'lad', 'an', 'mwl', 'it', 'co', 'nap', 'scn',
                'vec', 'sc', 'ro', 'la']
def translate(texts, model, tokenizer, language="fr"):
    with torch.no_grad():
        template = lambda text: f"{text}" if language == "en" else f">>{language}<< {text}"
        src_texts = [template(text) for text in texts]
        encoded = tokenizer.prepare_seq2seq_batch(src_texts, truncation=True,
                                                  max_length=300,
                                                  return_tensors="pt").to(device)
        # generate() is not exposed by nn.DataParallel, so it is called on .module
        translated = model.module.generate(**encoded)
        translated_texts = tokenizer.batch_decode(translated, skip_special_tokens=True)
        return translated_texts
def back_translate(texts, source_lang="en", target_lang="fr"):
    # Translate from source to target language
    fr_texts = translate(texts, target_model, target_tokenizer,
                         language=target_lang)
    # Translate from target language back to source language
    back_translated_texts = translate(fr_texts, en_model, en_tokenizer,
                                      language=source_lang)
    return back_translated_texts
target_model_name = 'Helsinki-NLP/opus-mt-en-ROMANCE'
target_tokenizer = MarianTokenizer.from_pretrained(target_model_name)
target_model = MarianMTModel.from_pretrained(target_model_name)

en_model_name = 'Helsinki-NLP/opus-mt-ROMANCE-en'
en_tokenizer = MarianTokenizer.from_pretrained(en_model_name)
en_model = MarianMTModel.from_pretrained(en_model_name)
target_model = nn.DataParallel(target_model)
target_model = target_model.to(device)  # same performance if I add .half()
target_model.eval()

en_model = nn.DataParallel(en_model)
en_model = en_model.to(device)  # same performance if I add .half()
en_model.eval()
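# fp16 variant mentioned in the comments above (sketch only -- I tried this and
# got the same throughput; I assume these GeForce cards don't speed up
# half-precision math, so .half() mostly just saves memory):
#
#   target_model = nn.DataParallel(MarianMTModel.from_pretrained(target_model_name).half()).to(device)
#   en_model = nn.DataParallel(MarianMTModel.from_pretrained(en_model_name).half()).to(device)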
## x1 and x2 are batches of strings.
bk_x1 = back_translate(x1, source_lang="en", target_lang=np.random.choice(target_langs))
bk_x2 = back_translate(x2, source_lang="en", target_lang=np.random.choice(target_langs))
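For context, this is applied over the whole dataset in mini-batches of 16 (the largest size that fits in memory); roughly like the sketch below, where augment, dataset_texts and chunk_size are just illustrative names, not part of the code above:

def augment(dataset_texts, chunk_size=16):
    # Back-translate the corpus chunk by chunk; 16 sentences at max_length=300
    # is the most that fits in GPU memory at once.
    augmented = []
    for start in range(0, len(dataset_texts), chunk_size):
        lang = np.random.choice(target_langs)  # random pivot language per chunk
        augmented.extend(back_translate(dataset_texts[start:start + chunk_size],
                                        source_lang="en", target_lang=lang))
    return augmented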
GPU utilization is low because of the small batch size of 16, but if I increase the batch size I get a CUDA out-of-memory error. I can also see that only one GPU is actually doing any work, so it might be that the Marian model cannot be parallelized correctly with nn.DataParallel for generation. If so, what would be the solution? (I sketch one idea I had right after the nvidia-smi output below.) Here is the nvidia-smi output:
+-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:1B:00.0 Off |                  N/A |
| 42%   78C    P2   199W / 250W |   9777MiB / 11178MiB |     91%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 00000000:1C:00.0 Off |                  N/A |
| 29%   36C    P8    10W / 250W |      2MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 108...  Off  | 00000000:1D:00.0 Off |                  N/A |
| 31%   36C    P8     9W / 250W |      2MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 108...  Off  | 00000000:1E:00.0 Off |                  N/A |
| 35%   41C    P8     9W / 250W |      2MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   4  GeForce GTX 108...  Off  | 00000000:3D:00.0 Off |                  N/A |
| 29%   34C    P8     9W / 250W |      2MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   5  GeForce GTX 108...  Off  | 00000000:3F:00.0 Off |                  N/A |
| 30%   31C    P8     8W / 250W |      2MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   6  GeForce GTX 108...  Off  | 00000000:40:00.0 Off |                  N/A |
| 31%   38C    P8     9W / 250W |      2MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   7  GeForce GTX 108...  Off  | 00000000:41:00.0 Off |                  N/A |
| 30%   37C    P8     9W / 250W |      2MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     58780      C   python                          10407MiB |
|    1   N/A  N/A     58780      C   python                              0MiB |
|    2   N/A  N/A     58780      C   python                              0MiB |
|    3   N/A  N/A     58780      C   python                              0MiB |
|    4   N/A  N/A     58780      C   python                              0MiB |
|    5   N/A  N/A     58780      C   python                              0MiB |
|    6   N/A  N/A     58780      C   python                              0MiB |
|    7   N/A  N/A     58780      C   python                              0MiB |
+-----------------------------------------------------------------------------+
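To show what I mean by splitting manually: since (as far as I understand) nn.DataParallel only parallelizes forward() and generate() has to be called on .module, generation seems to run on a single GPU. One idea I had is to load a plain copy of the model on each GPU and split the batch by hand, roughly like the sketch below. sharded_translate, the replica list, and the round-robin split are just illustrative, not something from the transformers API, and I'm not sure a plain Python loop like this actually overlaps work across GPUs or whether each replica would need its own thread/process.

# One un-wrapped model copy per GPU, each translating its own shard of the batch.
n_gpus = torch.cuda.device_count()
replicas = [MarianMTModel.from_pretrained(target_model_name).to(f"cuda:{i}").eval()
            for i in range(n_gpus)]

def sharded_translate(texts, tokenizer, language="fr"):
    shards = [texts[i::n_gpus] for i in range(n_gpus)]  # round-robin split
    outputs = []  # note: output order follows the shards, not the input order
    for gpu_id, (model, shard) in enumerate(zip(replicas, shards)):
        if not shard:
            continue
        src_texts = [f">>{language}<< {text}" for text in shard]
        encoded = tokenizer.prepare_seq2seq_batch(
            src_texts, truncation=True, max_length=300,
            return_tensors="pt").to(f"cuda:{gpu_id}")
        with torch.no_grad():
            generated = model.generate(**encoded)
        outputs.extend(tokenizer.batch_decode(generated, skip_special_tokens=True))
    return outputs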
FYI, I'm using:
PyTorch 1.7.0
transformers 4.0.1
CUDA 10.1