Modeling BART JAX vs PyTorch/TensorFlow implementation

smallsuper · September 11, 2023, 4:15am

I am comparing the implementations of BART in JAX and in PyTorch/TensorFlow. I noticed that the JAX implementation does not apply causal masking, whereas the PyTorch and TensorFlow implementations do. Is there a reason for this?
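
To be clear about what I mean by causal masking, here is a minimal sketch in JAX. It is not taken from the Transformers source; `causal_mask` and `masked_attention_weights` are hypothetical names I made up for illustration, and the toy `scores` array stands in for real query-key attention scores.

```python
import jax
import jax.numpy as jnp


def causal_mask(seq_len: int) -> jnp.ndarray:
    """Boolean (seq_len, seq_len) mask: True where query i may attend to key j (j <= i)."""
    return jnp.tril(jnp.ones((seq_len, seq_len), dtype=bool))


def masked_attention_weights(scores: jnp.ndarray) -> jnp.ndarray:
    """Softmax over keys after blocking future positions (hypothetical helper)."""
    mask = causal_mask(scores.shape[-1])
    # Replace blocked scores with the most negative finite value so that
    # they receive zero weight after the softmax.
    masked = jnp.where(mask, scores, jnp.finfo(scores.dtype).min)
    return jax.nn.softmax(masked, axis=-1)


# Toy input: uniform scores for a sequence of length 4.
scores = jnp.zeros((4, 4))
weights = masked_attention_weights(scores)
print(weights)
```

With uniform scores, each row should spread its weight evenly over the current and earlier positions and put nothing on later ones. This lower-triangular pattern is what I expected to find in the JAX decoder.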
