Train Dutch FlaxBigBird

FlaxBigBird for Dutch language

Currently, there are only a few long-range sequence models on the hub for languages other than English. The goal of this project is to create a strong Dutch FlaxBigBird model.

Model

A randomly initialized FlaxBigBird model.
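Since the model starts from random weights, the only things fixed up front are the architecture hyperparameters. A minimal sketch of writing such a configuration to disk, using only the standard library: the helper names `make_big_bird_config` and `save_config` are hypothetical, and the values are assumptions that mirror what I believe are the `BigBirdConfig` defaults in transformers (the vocabulary size in particular would change once a Dutch tokenizer is trained).

```python
import json
import tempfile
from pathlib import Path


def make_big_bird_config(vocab_size=50358, max_position_embeddings=4096,
                         block_size=64, num_random_blocks=3):
    """Build a BigBird-style config dict (hypothetical helper, assumed values)."""
    # Block-sparse attention works on whole blocks, so keep the maximum
    # sequence length a multiple of the block size.
    if max_position_embeddings % block_size != 0:
        raise ValueError("max_position_embeddings must be a multiple of block_size")
    return {
        "model_type": "big_bird",
        "vocab_size": vocab_size,  # assumption: replace with the Dutch tokenizer's size
        "hidden_size": 768,
        "num_hidden_layers": 12,
        "num_attention_heads": 12,
        "intermediate_size": 3072,
        "max_position_embeddings": max_position_embeddings,
        "attention_type": "block_sparse",
        "block_size": block_size,
        "num_random_blocks": num_random_blocks,
    }


def save_config(config, output_dir):
    """Write the config as config.json inside output_dir and return its path."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    path = output_dir / "config.json"
    path.write_text(json.dumps(config, indent=2, sort_keys=True))
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        config_path = save_config(make_big_bird_config(), tmp)
        print(config_path.read_text())
```

A directory containing a `config.json` like this is the kind of input a pre-training script would load before initializing the weights randomly; the exact fields the scripts expect should be checked against the library documentation.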

Datasets

CC-100
mC4

Available training scripts

Scripts to pre-train Flax models are available in the huggingface/transformers repository on GitHub, under examples/flax/language-modeling (master branch).

Scripts for fine-tuning and evaluating FlaxBigBird were just released in the huggingface/transformers repository on GitHub, under examples/research_projects/jax-projects/big_bird (master branch). Thanks to @vasudevgupta.

(Optional) Desired project outcome

The desired project outcome is a strong Dutch FlaxBigBird model that can be used for downstream tasks requiring long sequences (e.g. long text classification).

(Optional) Challenges

Agreed, it would be great to have more long-range sequence models for Dutch! We don’t have Longformer implemented in JAX/Flax - maybe FlaxBigBird could be an option?

Yes, even better! Thanks for the suggestion, I will change it.

Let’s try to get some participants for this project - I think it’s a great idea 🙂

I could help a bit here and there. I have experience collecting and preprocessing text, and I am comfortable in the shell and Python. I know transformer architectures from a user’s perspective. Unfortunately, I missed the Flax/JAX intros.

Wuhu, let’s create it!

I’ll join! Sounds like a fun project.
