PreTrain Electra/T5 for Korean from scratch

This project will pretrain models (Electra, T5, …) from scratch on a Korean corpus called 모두의말뭉치 ("Everyone's Corpus").

2. Language

The model will be trained in Korean.

3. Model

  • Electra
  • T5
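To make the Electra side concrete, here is a minimal sketch of how discriminator labels for replaced token detection could be built. The token ids and the `generator_samples` list are made-up stand-ins for real tokenizer and generator output, and the helper names are hypothetical, not taken from any library.

```python
def corrupt(original_ids, positions, generator_samples):
    """Write generator samples into the masked positions of a copy of the input."""
    corrupted = list(original_ids)
    for pos, sample in zip(positions, generator_samples):
        corrupted[pos] = sample
    return corrupted


def discriminator_labels(original_ids, corrupted_ids):
    """Label 1 where the token differs from the original, 0 where it matches."""
    return [int(o != c) for o, c in zip(original_ids, corrupted_ids)]


# Hypothetical token ids; a real run would take these from the tokenizer.
original = [11, 12, 13, 14, 15]
positions = [1, 3]            # positions that were masked for the generator
generator_samples = [12, 99]  # stand-in output: one correct guess, one wrong

corrupted = corrupt(original, positions, generator_samples)
labels = discriminator_labels(original, corrupted)
print(corrupted)  # [11, 12, 13, 99, 15]
print(labels)     # [0, 0, 0, 1, 0]
```

Position 1 was masked, but the generator happened to sample the original token, so it counts as "original" rather than "replaced".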

4. Datasets

모두의말뭉치 ("Everyone's Corpus")

5. Training scripts

We can use the existing example to train the model.

Additionally, we will write our training code based on the transformers examples.
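For T5, the training code has to turn raw text into span-corruption input/target pairs. Below is a rough sketch of that step, assuming the spans to drop are already chosen. `span_corrupt` is a hypothetical helper, not part of transformers, and the `<extra_id_N>` sentinel names follow the convention of the Hugging Face T5 tokenizer.

```python
def span_corrupt(tokens, spans, sentinel_fmt="<extra_id_{}>"):
    """Replace each span with a sentinel; the target lists the dropped spans.

    spans: sorted, non-overlapping (start, end) index pairs, end exclusive.
    """
    inputs, targets = [], []
    cursor = 0
    for k, (start, end) in enumerate(spans):
        sentinel = sentinel_fmt.format(k)
        inputs.extend(tokens[cursor:start])  # keep text before the span
        inputs.append(sentinel)              # the span collapses to one sentinel
        targets.append(sentinel)
        targets.extend(tokens[start:end])    # target recovers the dropped span
        cursor = end
    inputs.extend(tokens[cursor:])
    targets.append(sentinel_fmt.format(len(spans)))  # closing sentinel
    return inputs, targets


tokens = "Thank you for inviting me to your party last week".split()
inputs, targets = span_corrupt(tokens, [(2, 4), (8, 9)])
print(" ".join(inputs))
# Thank you <extra_id_0> me to your party <extra_id_1> week
print(" ".join(targets))
# <extra_id_0> for inviting <extra_id_1> last <extra_id_2>
```

In actual pretraining the spans would be sampled at random over token ids rather than fixed by hand over words.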

7. Desired project outcome

Pretrained weights. Afterwards, we will fine-tune them on both text classification and text summarization.

I’m interested in pretrained Korean models, too! I wanna join in.

Awesome, finalizing you guys!

Good! I’d like to join this. :)