dmesg: read kernel buffer failed: Operation not permitted when running Gaudi-enabled Habana model inference on a Kubernetes cluster

Hi there,

I am trying to run inference with the bloom-560m and GPT-J-6B models on a Kubernetes cluster, after attaching a dl1-large resource to it and using the Habana container image “”.
After running:

  1. pip install optimum[habana]
  2. cd optimum-habana/examples/text-generation
  3. pip install -r requirements.txt
  4. python …/ --use_deepspeed --world_size 2
    --model_name_or_path EleutherAI/gpt-j-6b
    --max_new_tokens 100
    --prompt "Tell me a poem about stone and water"

I am running into this error:

I tried the dmesg solutions from here,
but they didn't work.
What could be the reason for this?

This command runs without issue when executed directly on a DL1 instance. I'm not sure what happens when it is executed with Kubernetes. Are you sure that 2 devices are reachable in your cluster?
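
One rough way to check device reachability from inside the pod is to count the accelerator device nodes the container can see. The sketch below is only an assumption-laden illustration: the glob patterns (`/dev/accel/accel*` and `/dev/hl*`) are my guess at how the Habana driver names its nodes, which varies with driver version, and `find_device_nodes` and `enough_devices` are hypothetical helper names, not part of any Habana or Optimum API. In Kubernetes, devices usually also have to be requested explicitly for the pod before they show up in the container.

```python
import glob

# Assumed device-node naming; adjust to whatever your driver version creates.
DEFAULT_PATTERNS = ("/dev/accel/accel[0-9]*", "/dev/hl[0-9]*")


def find_device_nodes(patterns=DEFAULT_PATTERNS):
    """Return the device nodes matched by the pattern with the most hits.

    Taking the largest single match list (rather than a union) avoids
    counting a device twice if it is exposed under both naming schemes.
    """
    best = []
    for pattern in patterns:
        matches = sorted(glob.glob(pattern))
        if len(matches) > len(best):
            best = matches
    return best


def enough_devices(world_size, patterns=DEFAULT_PATTERNS):
    """True if at least `world_size` device nodes are visible."""
    return len(find_device_nodes(patterns)) >= world_size


if __name__ == "__main__":
    nodes = find_device_nodes()
    print(f"visible device nodes: {len(nodes)}")
    for node in nodes:
        print(f"  {node}")
    print("enough for --world_size 2:", enough_devices(2))
```

If this reports fewer than 2 nodes inside the container, a `--world_size 2` launch cannot find both devices, whatever the dmesg message says.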

Besides, since both models are small enough to fit on 1 device, I recommend running them on 1 device only, without DeepSpeed and with the --bf16 argument. Parallelism with DeepSpeed is useful for very big models that don't fit on a single device, but it won't bring any significant speedup for smaller models that do.
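
To make the "fits on 1 device" point concrete, here is a back-of-the-envelope estimate of weight memory. It is a rough sketch only: the parameter counts are approximate, the 32 GB figure is the HBM capacity of a first-generation Gaudi device (treated loosely as GiB here), and the estimate covers weights alone, not activations or the KV cache. `weight_memory_gib` is a hypothetical helper name.

```python
# Bytes used to store one parameter in each dtype.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2}

# Approximate parameter counts (assumptions, rounded).
MODELS = {"bloom-560m": 0.56e9, "gpt-j-6b": 6.05e9}

# First-generation Gaudi has 32 GB of HBM per device (approximated as GiB).
DEVICE_MEMORY_GIB = 32


def weight_memory_gib(num_params, dtype="bf16"):
    """Memory needed for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    for name, params in MODELS.items():
        for dtype in ("fp32", "bf16"):
            gib = weight_memory_gib(params, dtype)
            fits = gib < DEVICE_MEMORY_GIB
            print(f"{name:12s} {dtype}: {gib:5.1f} GiB weights, fits: {fits}")
```

Even GPT-J-6B in bf16 needs only around a third of one device for its weights, which is why splitting it across 2 devices with DeepSpeed mostly adds communication overhead.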