Pretrain GPT-J-6B from scratch on Arabic

The idea behind this project is to pretrain a GPT-J-6B model from scratch on an Arabic corpus, using JAX and TPUs of course.

Some resources:


@yalouini

Interesting! How do I participate?


I think that your interest has been noted. We will see how projects are created.


Here are more details:

Arabic GPT-J-6B

Train a GPT-J-6B model on an Arabic dataset (Arabic Wikipedia, for example) using JAX on TPUs.

2. Language

The model will be trained in Arabic.

3. Model

GPT-J-6B (6 billion parameters).
4. Datasets

Arabic Wikipedia.
(TODO: more to come)
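Before training a tokenizer on Arabic Wikipedia, a light text normalization pass is commonly applied. This is a minimal sketch of one such scheme (the function name and the exact normalization choices, e.g. stripping diacritics and unifying alef variants, are assumptions, not a decision the project has made):

```python
import re

# Arabic diacritics (tashkeel, U+064B–U+0652, plus superscript alef U+0670)
# and the tatweel/kashida elongation character (U+0640).
DIACRITICS = re.compile(r"[\u064B-\u0652\u0670]")
TATWEEL = re.compile(r"\u0640")

def normalize_arabic(text: str) -> str:
    """Light normalization often applied before tokenizer training:
    strip diacritics, drop tatweel, unify alef variants, and map
    teh marbuta to heh. Whether to apply each step is a project choice."""
    text = DIACRITICS.sub("", text)
    text = TATWEEL.sub("", text)
    text = re.sub(r"[\u0622\u0623\u0625]", "\u0627", text)  # alef variants -> bare alef
    text = re.sub(r"\u0629", "\u0647", text)                # teh marbuta -> heh
    return re.sub(r"\s+", " ", text).strip()

print(normalize_arabic("اَلْعَرَبِيَّة"))
```

Some pipelines keep diacritics to preserve information; this is just one common configuration.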

5. Training scripts

TODO: Update

6. Challenges

No prior experience doing this kind of work, so this will be mostly a discovery.

7. Desired project outcome

An Arabic GPT-J model that is available on HuggingFace.

8. Reads

I like this project! Will note it down 🙂 Maybe we’ll find some more people later on. Note that this project requires you to use a non-`transformers` model definition — GitHub - kingoflolz/mesh-transformer-jax: Model parallel transformers in JAX and Haiku — and model parallelism in JAX (given the size of the model). Nevertheless I think it’s a feasible project!
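To make the model-parallelism point concrete: the idea behind mesh-transformer-jax is that weight matrices too large for one device are sharded across devices, with each device computing its slice of the output. A minimal sketch of that idea with `jax.pmap` (the shapes and the column-wise sharding scheme here are illustrative assumptions; with a single local device this degenerates to the unsharded matmul):

```python
import jax
import jax.numpy as jnp

# Shard a weight matrix column-wise across devices; each device computes
# its slice of x @ w, and the slices are concatenated afterwards.
n_dev = jax.local_device_count()
DIM, OUT = 4, 8 * n_dev

x = jnp.ones((DIM,))
w = jnp.arange(DIM * OUT, dtype=jnp.float32).reshape(DIM, OUT)
# (n_dev, DIM, OUT // n_dev): one column block per device.
w_shards = w.reshape(DIM, n_dev, OUT // n_dev).transpose(1, 0, 2)

@jax.pmap
def shard_matmul(w_shard):
    return x @ w_shard  # each device computes its slice of the output

y = jnp.concatenate(shard_matmul(w_shards))  # same result as x @ w
```

mesh-transformer-jax does this (plus collectives for attention/MLP layers) across the 8 cores of a TPU v3-8, which is what makes a 6B-parameter model fit.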

Noting it down 🙂


Thanks @patrickvonplaten for the additional details and links!

Some additional links:

Why has the work on this stopped?
I have seen a repo on Hugging Face with the name arabic-GPT-J-6b and it was empty.

I hope the idea is still a work in progress.

Hello @MohamedRashad.

For many reasons, among them personal ones, but hopefully I will give it more time later this year.

Note that the GitHub repo isn’t empty; see for instance this branch: GitHub - yassineAlouini/arabic-gpt-j-6b at fine-tune.

Hopefully I will have time later, but for now it is on hold for sure.