notebook link: Abstractive-Summarization-T5-Keras/AbstractiveSummarizationT5.ipynb at main · flogothetis/Abstractive-Summarization-T5-Keras · GitHub
I just have one problem: I want to load the saved model and run it with Hugging Face's generate() function, since generate() lets me set num_return_sequences and the max length of the generated output.
Below is the code I use to run the saved model with Hugging Face's generate(), but it returns a garbage reply rather than a true summary. For example, to summarize the text below:
```python
getSummary("With your permission we and our partners may use precise geolocation "
           "data and identification through device scanning. You may click to consent to our "
           "and our partners' processing as described above. Alternatively you may access more "
           "detailed information and change your preferences before consenting or to refuse consenting.")
```
getSummary returns something sensible, like:
We may use geolocation data through device scanning
But when I use Hugging Face's generate() directly, with the code below:

```python
from transformers import T5Tokenizer, TFT5ForConditionalGeneration

text = '''With your permission we and our partners may use precise geolocation data and identification through device scanning. You may click to consent to our and our partners' processing as described above. Alternatively you may access more detailed information and change your preferences before consenting or to refuse consenting.'''

tokenizer = T5Tokenizer.from_pretrained('t5-small')
model0 = TFT5ForConditionalGeneration.from_pretrained(to_directory)
inputs = tokenizer([text], return_tensors="tf")
generated = model0.generate(**inputs, decoder_start_token_id=tokenizer.pad_token_id, do_sample=True)
print("Sampling output: ", tokenizer.decode(generated[0]))
```
the output is gibberish:

`Sampling output: kurz Upholster Month citoyenjohnpointviousgren suppression awful Tommy Partners animaux Certain temptationanischadenCenterani FUN awful partager Lexington Ãœb`
It generates output, but the output is meaningless. If I call generate() without decoder_start_token_id=tokenizer.pad_token_id:

```python
generated = model.generate(**inputs, do_sample=True)
```

it returns this error:

`decoder_start_token_id or bos_token_id has to be defined for encoder-decoder generation`
Is there any way to get model.generate() to return a proper summary?
By the way, the start and end tokens are as below:

```python
print(end_token, tokenizer.eos_token_id)    # 1
print(start_token, tokenizer.pad_token_id)  # 1
```
Any solutions? Much obliged. I have tried several approaches, but none of them worked. Does anyone have an idea?