I need the probability distribution over generated words to compute the loss in my custom loss function.
In particular, my setup differs from the usual training loss in that the model generates sentences the same way it does at test time.
I therefore tried to compute the loss using model.generate(), but this method does not keep the computation graph needed to compute gradients. Can this be solved easily by passing a special argument to the method? Is there an equally simple alternative? Or do I have to implement my own generation function that preserves the computation graph?
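For reference, the last option (writing your own generation function) could look roughly like the sketch below. It assumes a causal language model that maps token ids to logits of shape (batch, seq, vocab). TinyLM and greedy_generate_with_grad are hypothetical names made up for illustration, not part of transformers, and the loss at the end is only a placeholder.

```python
import torch
import torch.nn as nn


class TinyLM(nn.Module):
    """Hypothetical stand-in for a causal LM: token ids -> next-token logits."""

    def __init__(self, vocab_size=11, hidden=8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, input_ids):
        # (batch, seq) -> (batch, seq, vocab)
        return self.head(self.embed(input_ids))


def greedy_generate_with_grad(model, input_ids, max_new_tokens=5):
    """Greedy decoding without torch.no_grad(), so the per-step
    log-probabilities stay attached to the autograd graph."""
    step_log_probs = []
    for _ in range(max_new_tokens):
        logits = model(input_ids)[:, -1, :]            # logits for the last position
        log_probs = torch.log_softmax(logits, dim=-1)  # differentiable distribution
        next_token = log_probs.argmax(dim=-1, keepdim=True)  # discrete choice, no gradient
        step_log_probs.append(log_probs)
        input_ids = torch.cat([input_ids, next_token], dim=-1)
    # Generated ids and a (batch, steps, vocab) tensor of log-probabilities.
    return input_ids, torch.stack(step_log_probs, dim=1)


torch.manual_seed(0)
model = TinyLM()
prompt = torch.tensor([[1, 2, 3]])
tokens, log_probs = greedy_generate_with_grad(model, prompt)

# Placeholder loss: negative log-probability of the greedily chosen tokens.
loss = -log_probs.max(dim=-1).values.mean()
loss.backward()
```

Note that gradients flow only through the probabilities, not through the discrete token choice made by argmax, so whether this is enough depends on the loss you have in mind.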
I can’t answer many of your questions, but I did find this code snippet useful for getting a computational graph with generate():
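# Requires the third-party "undecorated" package (pip install undecorated).
from undecorated import undecorated
from types import MethodType
# Strip the decorators from generate() to get the plain underlying function.
generate_with_grad = undecorated(model.generate)
# Bind that function back onto the model as a new method.
model.generate_with_grad = MethodType(generate_with_grad, model)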
The generate() function has a no_grad decorator that prevents the computational graph from being built. This code just removes that decorator and leaves the rest of generate() unchanged.
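To see the mechanism in isolation, here is a small sketch in plain PyTorch. The score function is a made-up stand-in for generate(), and the example assumes that torch.no_grad() wraps functions with functools.wraps (which I believe current PyTorch does), so that inspect.unwrap from the standard library can recover the original function. This is similar in spirit to what the snippet above does, not necessarily identical to how undecorated works internally.

```python
import inspect

import torch

weight = torch.ones(3, requires_grad=True)


@torch.no_grad()
def score(x):
    # Stand-in for generate(): an ordinary differentiable computation.
    return (x * weight).sum()


x = torch.tensor([1.0, 2.0, 3.0])

# Calling the decorated function runs under no_grad, so no graph is recorded.
blocked = score(x)

# Unwrapping bypasses the decorator; the function body itself is unchanged.
score_with_grad = inspect.unwrap(score)
restored = score_with_grad(x)

print(blocked.requires_grad, restored.requires_grad)

# The unwrapped result can be backpropagated through as usual.
restored.backward()
```

The two calls compute the same value; only the unwrapped one carries a graph that backward() can use.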
Thanks for sharing your simple solution!
I’ll give this method a try!
I got the scores with the computational graph attached!