Disable XLA for T5 fine tuning using Tensorflow on M1 Mac

Hi everyone

How can I turn off XLA for TensorFlow on an M1 Mac? There doesn’t seem to be Metal support for it yet (I’m on tensorflow-macos 2.9.2, as it is the only version that works at all, with transformers 4.26.0.dev0).

  File "/Users/.../miniconda3/envs/ksd_new/lib/python3.10/site-packages/transformers/keras_callbacks.py", line 228, in on_epoch_end
    predictions = self.model.generate(generation_inputs, attention_mask=attention_mask)
  File "/Users/.../miniconda3/envs/ksd_new/lib/python3.10/site-packages/transformers/generation_tf_utils.py", line 600, in generate
    return self._generate(
  File "/Users/.../miniconda3/envs/ksd_new/lib/python3.10/site-packages/transformers/generation_tf_utils.py", line 1756, in _generate
    return self.greedy_search(
  File "/Users/.../miniconda3/envs/ksd_new/lib/python3.10/site-packages/transformers/generation_tf_utils.py", line 2405, in greedy_search
    generated, _, cur_len, _ = tf.while_loop(
  File "/Users/.../miniconda3/envs/ksd_new/lib/python3.10/site-packages/transformers/generation_tf_utils.py", line 2331, in greedy_search_body_fn
    model_outputs = self(
  File "/Users/.../miniconda3/envs/ksd_new/lib/python3.10/site-packages/transformers/modeling_tf_utils.py", line 420, in run_call_with_unpacked_inputs
    return func(self, **unpacked_inputs)
  File "/Users/.../miniconda3/envs/ksd_new/lib/python3.10/site-packages/transformers/models/t5/modeling_tf_t5.py", line 1414, in call
    decoder_outputs = self.decoder(
  File "/Users/.../miniconda3/envs/ksd_new/lib/python3.10/site-packages/transformers/modeling_tf_utils.py", line 420, in run_call_with_unpacked_inputs
    return func(self, **unpacked_inputs)
  File "/Users/.../miniconda3/envs/ksd_new/lib/python3.10/site-packages/transformers/models/t5/modeling_tf_t5.py", line 789, in call
    layer_outputs = layer_module(
  File "/Users/.../miniconda3/envs/ksd_new/lib/python3.10/site-packages/transformers/models/t5/modeling_tf_t5.py", line 568, in call
    self_attention_outputs = self.layer[0](
  File "/Users/.../miniconda3/envs/ksd_new/lib/python3.10/site-packages/transformers/models/t5/modeling_tf_t5.py", line 457, in call
    attention_output = self.SelfAttention(
  File "/Users/.../miniconda3/envs/ksd_new/lib/python3.10/site-packages/transformers/models/t5/modeling_tf_t5.py", line 396, in call
    position_bias = dynamic_slice(
  File "/Users/.../miniconda3/envs/ksd_new/lib/python3.10/site-packages/tensorflow/compiler/tf2xla/ops/gen_xla_ops.py", line 1040, in xla_dynamic_slice
    return xla_dynamic_slice_eager_fallback(
  File "/Users/.../miniconda3/envs/ksd_new/lib/python3.10/site-packages/tensorflow/compiler/tf2xla/ops/gen_xla_ops.py", line 1092, in xla_dynamic_slice_eager_fallback
    _result = _execute.execute(b"XlaDynamicSlice", 1, inputs=_inputs_flat,
tensorflow.python.framework.errors_impl.UnimplementedError: Exception encountered when calling layer "SelfAttention" (type TFT5Attention).

Could not find compiler for platform METAL: NOT_FOUND: could not find registered compiler for platform METAL -- check target linkage [Op:XlaDynamicSlice]

Call arguments received by layer "SelfAttention" (type TFT5Attention):
  • hidden_states=tf.Tensor(shape=(2, 1, 512), dtype=float32)
  • mask=tf.Tensor(shape=(2, 1, 1, 2), dtype=float32)
  • key_value_states=None
  • position_bias=None
  • past_key_value=('tf.Tensor(shape=(2, 6, 1, 64), dtype=float32)', 'tf.Tensor(shape=(2, 6, 1, 64), dtype=float32)')
  • layer_head_mask=None
  • query_length=None
  • use_cache=True
  • training=False
  • output_attentions=False

I’m already using the transformers.KerasMetricCallback with use_xla_generation=False, but the error above still occurs.

Thank you for your help.

Did you get a solution for this? I am also facing the same issue on M1 mac.

Unfortunately not yet.

Hey @gargantua42 @kirtinikam! dynamic_slice, the op that is causing the exception, is needed for XLA generation.

My suggested workaround would be to install the last version of transformers from before XLA generation was introduced (v4.20). If you need features from newer versions, my suggestion would be to fork transformers and replace the contents of TFT5Attention with v4.20’s TFT5Attention (I can’t guarantee that it will work well, but it should).
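For anyone patching TFT5Attention by hand: the op that fails on METAL is XLA’s dynamic_slice, and semantically it is just a fixed-size slice whose start indices are clamped so the slice fits inside the input. Below is a minimal NumPy sketch of what the op computes, following XLA’s documented DynamicSlice semantics; it is an illustration of the behavior, not the transformers implementation:

```python
import numpy as np

def dynamic_slice(operand, start_indices, slice_sizes):
    # XLA's DynamicSlice takes a fixed slice size per dimension and clamps
    # each start index into [0, dim_size - slice_size] so the slice always
    # fits inside the operand.
    starts = [min(max(int(s), 0), dim - size)
              for s, dim, size in zip(start_indices, operand.shape, slice_sizes)]
    return operand[tuple(slice(s, s + n) for s, n in zip(starts, slice_sizes))]
```

Since this is just clamped slicing, the position-bias lookup can be expressed with ordinary tensor indexing, which is presumably why swapping in the pre-XLA v4.20 attention layer sidesteps the METAL compiler error.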

I hope this helps :slight_smile:

Hi @joaogante

Thanks for this idea! I solved it by downgrading the transformers library to version 4.20.1, since XLA generation was introduced in version 4.21.0 (Release v4.21.0: TF XLA text generation - Custom Pipelines - OwlViT, NLLB, MobileViT, Nezha, GroupViT, MVP, CodeGen, UL2 · huggingface/transformers · GitHub) - as you pointed out. It seems to be working fine for now.
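Since the version boundary is clean (XLA generation landed in transformers v4.21.0), a tiny helper can encode the check before deciding which environment to build. The function name below is made up for illustration; only the version boundary comes from the release notes:

```python
def t5_generate_uses_xla(transformers_version: str) -> bool:
    # TF text generation was rewritten on top of XLA ops in transformers
    # v4.21.0, so any earlier version (e.g. 4.20.1) avoids XlaDynamicSlice
    # and works on tensorflow-macos without a METAL XLA compiler.
    major, minor = (int(part) for part in transformers_version.split(".")[:2])
    return (major, minor) >= (4, 21)
```

For example, t5_generate_uses_xla("4.20.1") is False while t5_generate_uses_xla("4.26.0") is True, matching the versions discussed in this thread.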

It would be great, though, if we could enable or disable XLA via a flag or similar - for those of us who use an M1/M2 Mac with no XLA support yet…

@gargantua42 added to my todo list :slight_smile:

Hello there!
Thanks @gargantua42 for posting this and @joaogante for the response. :slightly_smiling_face:

I’m running into what I think is a related issue while trying to reproduce one of the translation examples from the transformers repository.

Packages

transformers==4.31.0.dev0
tensorflow-macos==2.9
tensorflow-metal==0.5

I managed to set up the training correctly and it does train; this is the KerasMetricCallback:

            metric_callback = KerasMetricCallback(
                metric_fn=compute_metrics,
                eval_dataset=tf_eval_dataset,
                predict_with_generate=True,
                use_xla_generation=False,
                generate_kwargs=gen_kwargs,
            )

After training, it shows this error:

Could not find compiler for platform METAL: NOT_FOUND: could not find registered compiler for platform METAL -- check target linkage [Op:XlaDynamicSlice]

Call arguments received by layer "SelfAttention" (type TFT5Attention):
  • hidden_states=tf.Tensor(shape=(24, 1, 512), dtype=float32)
  • mask=tf.Tensor(shape=(24, 1, 1, 2), dtype=float32)
  • key_value_states=None
  • position_bias=None
  • past_key_value=('tf.Tensor(shape=(24, 8, 1, 64), dtype=float32)', 'tf.Tensor(shape=(24, 8, 1, 64), dtype=float32)')
  • layer_head_mask=None
  • query_length=None
  • use_cache=True
  • training=False
  • output_attentions=False

I tried rolling back to transformers==4.20, but other errors arose in the example, related to tokenization, so before starting to change the whole example I just stopped by to ask :slight_smile:

Some time has passed, so I’m assuming something new might have arrived that I’m missing? I did some searching in the releases page but couldn’t find anything about disabling XLA :frowning: :slight_smile:
Maybe @joaogante can enlighten me? Thanks in advance!

I’m using the example for a PyCon Spain keynote, and installing from source and changing things might make the example harder to reproduce - is replacing TFT5Attention with v4.20’s TFT5Attention still the best solution available?
Thanks for your help! I really appreciate it!
Please also let me know if at some point the example should include a comment regarding Mac reproducibility…

Thanks again for posting this !