Hi Community - I’ve been playing around with converting HF models to CoreML for native, on-device use.
I’ve been able to convert GPT2 and basic BERT models but am having issues with BigBird-Pegasus.
I'm hitting a host of issues, from TracerWarnings to PyTorch deprecation warnings. I've gone through the original paper, but it offers too little implementation detail for me to get a solid sense of the input shape and conversion requirements. I'm not sure what to script, because the sparse attention is new to me, and I'm hesitant to fork the HF implementation just to knock down the obvious errors, given their sheer volume.
Given the model's dynamic input, I'm not positive what the traced input should be, so I've played with various random tokens and tokenized inputs. I can't even get past the trace.
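For reference, the kind of fixed-shape dummy input I've been feeding the tracer looks like this (just a sketch of my current assumptions: 4096 is the pubmed checkpoint's max length, and I believe the block-sparse path asserts that the sequence length is a multiple of `config.block_size`, which I think is 64 for this model):

```python
import torch

block_size = 64   # assumption: BigBirdPegasus default config.block_size
seq_len = 4096    # must be a multiple of block_size for block_sparse attention, as I read it
assert seq_len % block_size == 0

# Fixed-shape dummy inputs for tracing: token ids plus an explicit attention mask,
# so the trace isn't handed a mask derived from data-dependent Python branches.
input_ids = torch.randint(0, 10000, (1, seq_len), dtype=torch.long)
attention_mask = torch.ones(1, seq_len, dtype=torch.long)
```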
Any help would be great. My current script and the full error are below (yes, I am aware of the Torch version warning).
Conversion Script

```python
import os

import coremltools
import torch
from torchsummary import summary
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

transformers_cache_dir = os.environ.get("TRANSFORMERS_CACHE") or "/Users/m/transformers_cache/"

# Load pre-trained model.
torch_model = AutoModelForSeq2SeqLM.from_pretrained(
    "google/bigbird-pegasus-large-pubmed", torchscript=True, cache_dir=transformers_cache_dir
)

# Load tokenizer.
tokenizer = AutoTokenizer.from_pretrained(
    "google/bigbird-pegasus-large-pubmed", cache_dir=transformers_cache_dir
)

# Set model to evaluation mode.
torch_model.eval()

# Create dummy input.
random_tokens = torch.randint(10000, (1, 4096))

traced_model = torch.jit.trace(torch_model, random_tokens)  # <---- Errors out here.
```
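To convince myself I understood what the TracerWarnings mean, I put together a minimal repro of the underlying behavior (nothing BigBird-specific, just a toy module with a data-dependent Python branch like the model's `if padding_len > 0:`):

```python
import torch

class Toy(torch.nn.Module):
    def forward(self, x):
        # A Python-level branch: torch.jit.trace only records the branch taken
        # for the example input and freezes that choice into the graph.
        if x.shape[1] % 2 == 0:
            return x * 2
        return x * 3

traced = torch.jit.trace(Toy(), torch.ones(1, 4))  # even length: the "* 2" path is recorded
out = traced(torch.ones(1, 3))                     # odd length, but the trace still multiplies by 2
```

I assume this is why the trace both floods warnings and then fails the sanity check: with `block_sparse` attention the forward pass is full of branches like these, plus the `torch.tensor(rand_attn, ...)` call that bakes the random attention pattern in as a constant.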
Error
WARNING:root:Torch version 1.10.1 has not been tested with coremltools. You may run into unexpected errors. Torch 1.9.1 is the most recent version that has been tested.
{'input_ids': tensor([[ 8783, 47694, 15934,  ..., 110, 105, 1]]), 'attention_mask': tensor([[1, 1, 1,  ..., 1, 1, 1]])}
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:1855: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can’t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if self.attention_type == "block_sparse" and input_shape[1] <= max_tokens_to_attend:
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:2019: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can’t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if padding_len > 0:
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:1979: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can’t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert (
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:2004: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
blocked_encoder_mask = attention_mask.view(batch_size, seq_length // block_size, block_size)
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:277: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can’t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert from_seq_length % from_block_size == 0, "Query sided sequence length must be multiple of block size"
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:278: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can’t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert to_seq_length % to_block_size == 0, "Key/Value sided sequence length must be multiple of block size"
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:374: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
if from_seq_len // from_block_size != to_seq_len // to_block_size:
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:374: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can’t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if from_seq_len // from_block_size != to_seq_len // to_block_size:
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:383: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can’t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if from_seq_len in [1024, 3072, 4096]: # old plans used in paper
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:857: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
if (2 * num_rand_blocks + 5) < (from_seq_length // from_block_size):
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:857: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can’t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if (2 * num_rand_blocks + 5) < (from_seq_length // from_block_size):
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:971: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
from_seq_length // from_block_size == to_seq_length // to_block_size
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:970: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can’t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert (
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:974: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can’t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert from_seq_length in plan_from_length, "Error from sequence length not in plan!"
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:977: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
num_blocks = from_seq_length // from_block_size
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:979: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can’t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
plan_block_length = np.array(plan_from_length) // from_block_size
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:981: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can’t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
max_plan_idx = plan_from_length.index(from_seq_length)
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:407: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
rand_attn = torch.tensor(rand_attn, device=query_layer.device, dtype=torch.long)
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:834: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
num_windows = from_seq_length // from_block_size - 2
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:835: TracerWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won’t change the number of iterations executed (and might lead to errors or silently give incorrect results).
rand_mask = torch.stack([p1[i1.flatten()] for p1, i1 in zip(to_blocked_mask, rand_attn)])
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:415: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
blocked_query_matrix = query_layer.view(bsz, n_heads, from_seq_len // from_block_size, from_block_size, -1)
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:416: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
blocked_key_matrix = key_layer.view(bsz, n_heads, to_seq_len // to_block_size, to_block_size, -1)
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:417: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
blocked_value_matrix = value_layer.view(bsz, n_heads, to_seq_len // to_block_size, to_block_size, -1)
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:781: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can’t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if params.shape[:2] != indices.shape[:2]:
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:790: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
torch.arange(indices.shape[0] * indices.shape[1] * num_indices_to_gather, device=indices.device)
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:422: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
bsz, n_heads, to_seq_len // to_block_size - 2, n_rand_blocks * to_block_size, -1
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:426: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
bsz, n_heads, to_seq_len // to_block_size - 2, n_rand_blocks * to_block_size, -1
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:1950: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can’t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if padding_len > 0:
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:2082: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can’t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if input_shape[-1] > 1:
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:1290: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can’t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len):
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:1296: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can’t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if attention_mask.size() != (bsz, 1, tgt_len, src_len):
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py:1327: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can’t record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim):
Traceback (most recent call last):
File “/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/jit/_trace.py”, line 443, in run_mod_and_filter_tensor_outputs
outs = wrap_retval(mod(*_clone_inputs(inputs)))
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py(408): bigbird_block_sparse_attention
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py(284): forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1102): _call_impl
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py(1190): forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1102): _call_impl
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py(1382): forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1102): _call_impl
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py(1928): forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1102): _call_impl
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py(2391): forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1102): _call_impl
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py(2520): forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1102): _call_impl
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/jit/_trace.py(958): trace_module
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/jit/_trace.py(741): trace
/Users/m/Open Source/bigbird-pegasus-large-pubmed/torch_to_coreml.py(122): <module>
RuntimeError: set_storage_offset is not allowed on a Tensor created from .data or .detach().
If your intent is to change the metadata of a Tensor (such as sizes / strides / storage / storage_offset)
without autograd tracking the change, remove the .data / .detach() call and wrap the change in a `with torch.no_grad():` block.
For example, change:
    x.data.set_(y)
to:
    with torch.no_grad():
        x.set_(y)

The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/m/Open Source/bigbird-pegasus-large-pubmed/torch_to_coreml.py", line 122, in <module>
traced_model = torch.jit.trace(torch_model, inputs['input_ids'])
File “/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/jit/_trace.py”, line 741, in trace
return trace_module(
File “/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/jit/_trace.py”, line 983, in trace_module
_check_trace(
File “/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/autograd/grad_mode.py”, line 28, in decorate_context
return func(*args, **kwargs)
File “/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/jit/_trace.py”, line 516, in _check_trace
traced_outs = run_mod_and_filter_tensor_outputs(traced_func, inputs, “trace”)
File “/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/jit/_trace.py”, line 449, in run_mod_and_filter_tensor_outputs
raise TracingCheckError(
torch.jit._trace.TracingCheckError: Tracing failed sanity checks!
encountered an exception while running the trace with test inputs.
Exception:
The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py(408): bigbird_block_sparse_attention
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py(284): forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1102): _call_impl
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py(1190): forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1102): _call_impl
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py(1382): forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1102): _call_impl
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py(1928): forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1102): _call_impl
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py(2391): forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1102): _call_impl
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py(2520): forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/nn/modules/module.py(1102): _call_impl
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/jit/_trace.py(958): trace_module
/Users/m/venvs/pytorch-coremltools-transformers/lib/python3.9/site-packages/torch/jit/_trace.py(741): trace
/Users/m/Open Source/bigbird-pegasus-large-pubmed/torch_to_coreml.py(122): <module>
RuntimeError: set_storage_offset is not allowed on a Tensor created from .data or .detach().
If your intent is to change the metadata of a Tensor (such as sizes / strides / storage / storage_offset)
without autograd tracking the change, remove the .data / .detach() call and wrap the change in a `with torch.no_grad():` block.
For example, change:
    x.data.set_(y)
to:
    with torch.no_grad():
        x.set_(y)