Marian MT half precision inference

Hello Huggingfacers :vulcan_salute:

I am trying to use 16-bit precision (`half()`) for inference with a MarianMT model provided by Hugging Face. It seems to reduce memory usage quite a lot, which is what I am looking for, but I don't know what to expect in terms of translation accuracy after this change.

I am not aware of a built-in way to compute the BLEU score for a given model. I would probably need a translation dataset with reference translations, and that would be the way to answer my question myself :slight_smile:

So, do any of you have insights on this?