Indonesian NLP - Introductions

Hello there! This is the introduction thread for Indonesian NLP enthusiasts.

Welcome, and please feel free to introduce yourself with any of the following:

  • Your name, GitHub, Hugging Face, and/or Twitter handle
  • Your interest in Indonesian NLP
  • Some projects you are working on or interested in starting
  • Any other languages that you speak, any personal interests, or anything else

Hi there Mas @cahya, thanks for initiating this thread!

I’m Wilson Wongso (@w11wo on GitHub and HF), a second-year Computer Science undergraduate student from Jakarta, Indonesia. I’m still relatively new to NLP and Hugging Face and am still learning, but I’m having fun so far!

I recently trained some small language models with GPT-2 and RoBERTa as a side project during the semester break, and I am interested in creating language models for native Indonesian languages like Javanese, Sundanese, Medanese, etc.

Looking forward to connecting with the community here!


Hi, my name is Cahya. I work as a system and software engineer in Vienna, Austria. My interest in ML / NLP started in early 2017 with a simple text classification project in TensorFlow.

Currently I like to experiment with Conversational AI, Open Domain Question Answering, and Text Summarization. I built some Indonesian language models, which are hosted here, and helped add some existing Indonesian NLP datasets to the HF datasets collection.

I hope we can connect and work together on interesting Indonesian NLP projects. One of the projects I would like to try is creating an mBART model covering a collection of the existing languages in Indonesia (at least the 15 most used ones), such as Javanese, Sundanese, or Minangkabau. This could later be used for machine translation among these languages or for other seq2seq tasks.

My handles:


Hello guys, my name is Warto, from IAIN Purwokerto, Indonesia. My interests are NLP and text mining. I am a doctoral student at Dian Nuswantoro University, Semarang. My research topic is information extraction.
I just finished annotating Indonesian news articles on the topic of COVID-19.


Hi, my name is Akmal. My Hugging Face and GitHub username is Wikidepia.

I have zero background in NLP / machine learning :sweat_smile:
Currently I am interested in creating Indonesian Transformer models like T5 and GPT-2. Thanks to TFRC :smiley: Also translating English datasets like PAWS.

Nice to meet you all!