While trying to test the microsoft/phi-2 model, I read that torch does not support float16 on CPU. Example threads:
- microsoft/phi-2 · RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'
- Better error message when trying to run fp16 weights on CPU · Issue #96292 · pytorch/pytorch · GitHub
Why does the following code work on a GPT-2 model, where I'm seemingly setting the same configuration? Does torch_dtype mean something different from what those threads are discussing?
import torch
import transformers

# Plain GPT-2 checkpoint used for model/tokenizer in my test
tokenizer = transformers.AutoTokenizer.from_pretrained("gpt2")
model = transformers.AutoModelForCausalLM.from_pretrained("gpt2")

pipe = transformers.pipeline('text-generation', model=model, tokenizer=tokenizer,
                             device="cpu", torch_dtype=torch.float16)
res = pipe("Here is a recipe for vegan banana bread:\n",
           max_new_tokens=128,  # example value
           do_sample=False,
           use_cache=True)
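For reference, the error in those threads comes from running a LayerNorm on half-precision tensors on CPU. A minimal probe like the sketch below (independent of transformers; the variable names are my own) shows whether a given PyTorch build supports that op, since support has varied across versions:

```python
import torch

# Probe: does this PyTorch build support LayerNorm in float16 on CPU?
# Older builds raise:
#   RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'
ln = torch.nn.LayerNorm(8).to(torch.float16)
x = torch.randn(2, 8, dtype=torch.float16)
try:
    out = ln(x)
    status = f"ok: output dtype {out.dtype}"
except RuntimeError as e:
    status = f"unsupported: {e}"
print(status)
```

If this prints "unsupported", the pipeline call above would be expected to fail the same way whenever the model actually executes fp16 LayerNorms on CPU.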