Hello Huggingfacers
I am trying to run inference with a Hugging Face MarianMT model in 16-bit precision by calling `half()`. It reduces memory usage considerably, which is what I am looking for, but I don't know what to expect in terms of translation accuracy after this change.
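To get a feel for what `half()` trades away, here is a small stdlib-only sketch (it uses Python's `struct` rather than the model itself, so it is an illustration of fp16 storage, not of MarianMT): fp16 uses 2 bytes per value instead of 4, but keeps only ~10 mantissa bits, so each stored value picks up a small relative rounding error.

```python
import struct

def to_fp16_and_back(x):
    """Round-trip a float through IEEE 754 half precision ('e' format)."""
    return struct.unpack('e', struct.pack('e', x))[0]

# fp16 takes 2 bytes per value vs 4 for fp32: that's the memory halving.
print(struct.calcsize('e'), struct.calcsize('f'))  # 2 4

# fp16 keeps ~10 mantissa bits (~3 decimal digits), so values
# acquire a small relative error on conversion:
x = 0.1234567
y = to_fp16_and_back(x)
print(abs(x - y) / x)  # on the order of 1e-4
```

In practice this per-weight rounding usually shifts translation quality only slightly, but the exact impact is model-dependent, which is why measuring BLEU before and after is the safe way to check.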
I am not aware of a built-in method to compute the BLEU score of a given model; I would probably need a translation dataset and evaluate against it to answer the question myself.
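For reference, BLEU itself is simple enough to sketch in plain Python; this is a minimal, unsmoothed sentence-level version (for real evaluation a library such as sacrebleu is the usual choice, this is only to show what the metric computes):

```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    """Minimal sentence-level BLEU, no smoothing: geometric mean of
    1..max_n n-gram precisions times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        overlap = sum((cand_ngrams & ref_ngrams).values())  # clipped matches
        if overlap == 0:
            return 0.0  # unsmoothed: any empty precision zeroes the score
        precisions.append(overlap / max(sum(cand_ngrams.values()), 1))
    # Brevity penalty: punish candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

print(bleu("the cat sat on the mat", "the cat sat on the mat"))  # 1.0
print(bleu("a b c", "x y z"))  # 0.0
```

To compare fp32 vs fp16 on a real model, one would translate a held-out test set with each variant and score the outputs against the references with corpus-level BLEU.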
So, does anyone have insights on this?