Has anyone been able to get DeepSpeed working for inference with GPT-Neo on a fine-tuned model?
As per this GitHub issue:
https://github.com/Xirider/finetune-gpt2xl/issues/15
…I am finding that inference with DeepSpeed works well on the base (un-finetuned) model, “EleutherAI/gpt-neo-2.7B”.
… But after I fine-tune the model, inference with DeepSpeed fails with this error:
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/ops/transformer/inference/transformer_inference.py", line 374, in forward
    output = DeepSpeedSelfAttentionFunction.apply(
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/ops/transformer/inference/transformer_inference.py", line 312, in forward
    output, key_layer, value_layer, context_layer = selfAttention_fp()
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/ops/transformer/inference/transformer_inference.py", line 270, in selfAttention_fp
    qkv_out = qkv_func(input,
IndexError: Dimension out of range (expected to be in range of [-2, 1], but got 2)