Language Model Skips entire Sentence

maxwellu · September 8, 2023, 11:12am

I am experimenting with the Hugging Face transformers library to translate strings from English to German. In particular, I am running the following code.

import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# NLLB on GPU (falls back to CPU if CUDA is unavailable)
MODEL_NAME = "facebook/nllb-200-distilled-600M"
device = "cuda:0" if torch.cuda.is_available() else "cpu"

def print_gpu_memory():
    # mem_get_info() returns (free, total) in bytes and needs a CUDA device.
    if torch.cuda.is_available():
        print(torch.cuda.mem_get_info())

print_gpu_memory()
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME).to(device)
model.eval()
print_gpu_memory()

def translationPipeline(text):
    input_ids = tokenizer.encode(text, return_tensors="pt").to(device)
    with torch.no_grad():
        outputs = model.generate(
            input_ids,
            # Force German as the first generated token (the target language).
            forced_bos_token_id=tokenizer.convert_tokens_to_ids("deu_Latn"),
            max_length=10000,
        )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]

# c = df_text.iloc[1,:].body.apply(translationPipeline)
text = "Tech firms’ shares fell after reports that China had barred employees at government-backed agencies and state companies from using iPhones, widening a ban applying to some government staff. Shares in Apple sank by more than 3% on Thursday; its market value has dropped by almost $200bn in the past two days."
print(tokenizer.encode(text, return_tensors="pt").shape)
c = translationPipeline(text)
print(c)

print_gpu_memory()

This returns the following string:

Die Aktien von Techfirmen gingen nach Berichten zurück, dass China Mitarbeitern von staatlich unterstützten Agenturen und staatlichen Unternehmen das Verwenden von iPhones verboten hatte und das Verbot für einige Regierungsmitarbeiter erweitert hatte.

So the first input sentence is translated really well, but the second input sentence is ignored completely. What is happening there?
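One way to narrow this down is to check whether the second sentence comes back when each sentence is translated on its own. The sketch below is only a diagnostic, and it assumes (without having confirmed it) that the model handles single sentences more reliably than multi-sentence input. The helper names `split_sentences` and `translate_by_sentence` are made up for illustration, and the regex splitter is deliberately naive: it will break on abbreviations such as "e.g.".

```python
import re

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def translate_by_sentence(text, translate_fn):
    """Translate each sentence separately with translate_fn and join the results."""
    return " ".join(translate_fn(s) for s in split_sentences(text))

# Quick look at how the splitter treats a two-sentence input.
sample = "Shares fell on Thursday. Apple sank by more than 3%; its value dropped."
print(split_sentences(sample))
```

With the model loaded as in the code above, `translate_by_sentence(text, translationPipeline)` would translate the two sentences independently. If the second sentence then appears in the output, the skipping is tied to feeding several sentences in one input rather than to the content of that sentence.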
