I'm Thomas Dehaene, and I work as an ML Engineer at a company called ML6.
In my day-to-day job, I work a lot with Dutch NLP-related topics of all sorts (summarization, entity extraction, text generation, general text data analysis, etc.). In general, I'm super interested in all things multilingual NLP!
Some recent work I helped with (together with the team) involved:
A small NLP analysis on some political documents (link)
I am Jordy Van Landeghem, AI/NLP researcher at Contract.fit, and industrial Ph.D. student at the Catholic University Leuven (Belgium).
My research focuses on Document Understanding tasks (text classification, NER, structured prediction) and on calibrating predictions (predicted probability ≈ correctness), so that we can better rely on fine-tuned models out of the box.
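To make the calibration goal concrete, here is a minimal sketch of expected calibration error (ECE), a standard way to measure how far predicted probability is from empirical correctness. The binning scheme and toy numbers are illustrative, not taken from Jordy's work:

```python
# Expected Calibration Error (ECE): bin predictions by confidence and
# compare each bin's average confidence to its empirical accuracy.
def expected_calibration_error(confidences, correct, n_bins=10):
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # clamp conf == 1.0
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(ok for _, ok in b) / len(b)
        ece += (len(b) / total) * abs(avg_conf - accuracy)
    return ece

# Toy example: a model that says 80% confidence and is right 80% of the time.
confs = [0.8] * 10
correct = [1] * 8 + [0] * 2
print(expected_calibration_error(confs, correct))  # ≈ 0.0 for a calibrated model
```

A well-calibrated fine-tuned model keeps this gap small, which is exactly what lets you trust its probabilities out of the box.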
Always up for a discussion on how to advance the state-of-the-art. You can find me on:
It would be cool if this thread could lead to some collaborations.
I still believe there is a "missing" model in the model hub, namely a Belgian Transformer-based model, which takes into account the implicit language distribution of Dutch, French, and German (to a lesser degree), while still performing well on the lingua franca English.
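A common recipe for such a mixed-language model, used in multilingual pretraining setups like XLM-R, is to sample training data with exponentially smoothed language proportions, so that smaller languages such as German are upsampled without drowning out Dutch and French. A sketch with invented token counts (the corpus sizes below are hypothetical):

```python
# Exponentially smoothed sampling: p_i ∝ q_i^alpha, where q_i is each
# language's share of the corpus. alpha < 1 upsamples low-resource languages.
def sampling_probs(token_counts, alpha=0.7):
    total = sum(token_counts.values())
    shares = {lang: n / total for lang, n in token_counts.items()}
    weights = {lang: q ** alpha for lang, q in shares.items()}
    z = sum(weights.values())
    return {lang: w / z for lang, w in weights.items()}

# Hypothetical Belgian corpus mix (counts invented for illustration).
counts = {"nl": 60_000_000, "fr": 30_000_000, "de": 5_000_000, "en": 5_000_000}
print(sampling_probs(counts, alpha=0.7))
```

With alpha = 0.7, German and English get sampled more often than their raw share, which matches the "to a lesser degree" balance described above.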
I am Pieter Delobelle, a PhD researcher at the DTAI lab at KU Leuven (Belgium). My research is mostly focused on fairness and bias in NLP, for example how to quantify and mitigate stereotypes in language models.
One example of this is the bias analysis (pdf, page 7) of RobBERT, a Dutch state-of-the-art RoBERTa-based LM that we released last year at EMNLP. Internally, we are using this model for processing resumes and vacancies. We are also using RobBERT to monitor tweets in Belgium, so a Belgian model would be very useful!
If you have any questions, e.g. about RobBERT or fairness in NLP, you can always find me on:
I am Jens-Joris Decorte, currently working as an NLP Research Engineer at TechWolf as part of a Baekeland PhD in collaboration with the Text-to-Knowledge (T2K) research group of Ghent University.
I recently kicked off the project, and my research is initially focused on industrial applications of language models: feasible training processes on niche domain data, interpretability, and making them more structured (e.g. by combining them with knowledge bases). Currently, I'm working on a method to efficiently fine-tune language models on a niche corpus. I'm looking forward to any kind of discussion on NLP, especially on interpretability in NLP.
Thanks for starting this, @thomasdehaene! This topic almost feels like a virtual Belgium NLP Meetup, so I don't want to be left out.
I'm Yves Peirsman, NLPer with a PhD from the University of Leuven and a keen interest in all things language and technology. My company NLP Town helps organizations implement NLP solutions through consultancy, software development, or a combination of both. We've also developed our own labelling tool that helps annotators label text data more effectively.
In the last five years, we've worked with many companies, big and small, in a wide range of sectors: medical, legal, financial, HR, education, etc. As we're based in Belgium, Dutch is one of the languages we work on most, together with many other Western European languages.
Finally, I'm also the organizer of the Belgium NLP Meetups, which I hope will resume after summer.
I'm always open to discussing anything NLP-related, so feel free to contact me through one of these channels:
I'm Niels. I studied business and information systems engineering (Handelsingenieur Beleidsinformatica) at KU Leuven, but I got to know Natural Language Processing during my master's and wanted to dive more deeply into technical programming. I'm currently working as an Applied AI Researcher at Howest, where I work on several VLAIO-supported AI projects.
You might have seen the TAPAS algorithm appearing on the social media of HuggingFace - I was the contributor of that model last year. I challenged myself: would it be possible to re-implement an algorithm myself? So that's what I did. I started reading the original TAPAS repo from Google AI (which was written in TF1), and then slowly but surely started implementing it in PyTorch. After a while, I considered my implementation good enough and opened a pull request on the Transformers repo. After a few more weeks - while working closely together with some of the top developers at HuggingFace - my pull request got merged! Google sent me a small surprise package to thank me for the achievement. It's definitely something I recommend to anyone; it's a great learning experience (not only NLP-related, but also regarding writing quality code, formatting, etc.)! Also, this week, TAPAS is featured on the homepage of HuggingFace's model hub - really cool!
Also, after TAPAS I'm now part of the core contributors team of HuggingFace. I helped improve other models, such as Microsoft's LayoutLM (which you can use to classify scanned documents or extract information from them), and I'm currently working on adding several other models to the library. My main interests are getting to know state-of-the-art algorithms (mostly Transformers, as they are conquering everything right now) and making them available for anyone to use, which is of course totally in line with HuggingFace's mission.
Fun fact - I've spoken to Thomas and Jordy before at a job fair in Ghent and to Yves at an ethical AI conference in Brussels, and I know TechWolf of course - I was in the same high school as Andreas De Neve and we took a mathematics seminar together - send him my regards, Jens-Joris! I guess NLP is a small world. I'm happy to connect with all of you.
Hi all, great to see some familiar faces around here!
I'm Thomas Winters, a PhD student researching computational humor and creative artificial intelligence at the DTAI research group (KU Leuven, Belgium). I have mainly focused on symbolic approaches, but now increasingly on transformer models as well. More specifically, some Dutch NLP projects I worked on are:
More Dutch Twitterbots than I care to admit (although most are listed here).
Pieter and I created RobBERT, the state-of-the-art Dutch BERT model.
We used this RobBERT model to show that BERT models are drastically better at humor detection than previous types of language models. By generating "broken" jokes with the same structure & vocabulary as real jokes, we showed that while other language models like LSTMs and CNNs could not distinguish the two types at all (accuracy below random guessing), the RobBERT model still reached ~90% accuracy.
Helped build the technology for Improbotics Flanders, a show where we play improv theatre with a GPT-2-powered robot.
Designed Gitta, a template-powered grammar induction algorithm for creating interpretable generative models.
Right now, I'm doing research on combining neural and symbolic methods for NLP.
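To give a flavor of what a template-powered generative model like the ones above looks like, here is a toy expansion engine in plain Python. The grammar, the Dutch vocabulary, and the `expand` function are all illustrative inventions, not Gitta's actual API:

```python
import random
import re

# Toy template grammar in the spirit of Gitta-induced grammars and
# template-driven Twitterbots: <nonterminals> expand recursively.
GRAMMAR = {
    "origin": ["De <animal> <action> in <place>."],
    "animal": ["kat", "hond", "eend"],
    "action": ["slaapt", "zingt", "danst"],
    "place": ["Gent", "Leuven", "Brussel"],
}

def expand(symbol, grammar, rng=random):
    template = rng.choice(grammar[symbol])
    # Replace every <nonterminal> with a recursive expansion of that rule.
    return re.sub(r"<(\w+)>", lambda m: expand(m.group(1), grammar, rng), template)

print(expand("origin", GRAMMAR))  # e.g. "De kat zingt in Gent."
```

Gitta's contribution is inducing such a grammar automatically from example texts; this sketch only shows the generation side, which is what makes the resulting models interpretable.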
I'm Karel D'Oosterlinck, a computer science engineering student at Ghent University, currently in my final year.
I had my first experience with NLP back in 2018, when I was working on a Twitter sentiment analysis platform for my bachelor's thesis. Looking back, I'm very glad I had this first experience with NLP.
Currently, I'm doing a master's thesis on using multilingual models for low-resource languages (at the T2K research group). Specifically, I'm exploring the power of multilingual models for Dutch coreference resolution and how multilingual models can leverage high-resource coreference data to fine-tune on a low-resource coreference task. I would also like to thank @thomaswint and @pdelobelle for building RobBERT; I've already had a lot of fun with this model.
I'm looking forward to specializing further in NLP after my master's thesis. Maybe, if there is enough enthusiasm, we could organize a monthly (or bimonthly) event to discuss (Dutch) NLP?
I'm Nithin, currently working as a Machine Learning Engineer at an Amsterdam-based startup called Amberscript. Last year, I graduated with a master's in AI from the University of Amsterdam.
My level of Dutch is elementary, but I'm working on improving Dutch ASR as part of my job. During my master's, I developed a meta-learning-based approach to few-shot word sense disambiguation, where the goal is to learn to disambiguate new words with just a handful of examples. Furthermore, I worked on continual learning and showed that meta-learning methods mitigate catastrophic forgetting and result in an efficient form of continual learning. I'm interested in diverse topics in NLP and looking to explore end-to-end ASR systems further.
I'm Sofie, and I've been working in NLP since my master's thesis back in 2006-2007 at UGent. Since then, I've done a PhD in BioNLP, worked for some bigger and smaller companies, and am now Lead ML Engineer of the open-source NLP library spaCy.
While Dutch is my native language, I've mainly been working on English use cases in the past, but I'm definitely interested in Dutch applications and in making sure spaCy supports Dutch well.
Frederik Durant is my name, fdurant my Hugging Face handle. I'm originally from Brussels, and Dutch is my mother tongue. Hence the interest.
I've been dabbling and later working in (and occasionally out of) NLP since 1990 - yes, that's not a typo. Having been around for so long, I've witnessed the evolution from rule-based to statistical to neural approaches in industry first-hand. It's amazing how the field has evolved from a highly specialized academic discipline to an installable library where you can get crazy stuff done in 10 lines of code.
My interest has always been in building applications based on (among other things) NLP, rather than in the core discipline itself. Consider me a friendly integrator.
Professionally, I currently work as a freelance ML engineer on chatbots in the banking sector, but I can't say much about that for confidentiality reasons.
Iām here to learn first, and also to contribute as soon as I can.
You guessed it, my name is Bram. I did a master's in computational linguistics and one in AI (both at KU Leuven), and received my PhD this year (Ghent University). I've mostly focused on the intersection of computational/psycho-linguistics, human/machine translation, and broader NLP. For the time being, I am a post-doc at Ghent University with a focus on human and machine translation.
Unsurprisingly, I use transformers in my work on MT. However, I have also been using and training models for very niche tasks, e.g. humor detection. Most recently, I employed RoBERTa to predict, for each token, how difficult it would be to translate. "Difficulty" here is operationalized as normalized translation duration, and is heavily inspired by psycholinguistic studies. I've also used Transformer models to train my own spaCy-backed Universal Dependencies models, which is no longer necessary as they are provided out of the box these days. Awesome! Oh, and I am a big Tolkien nerd, so I've done some experiments with training a GPT-J-like language model on all the collected works of Tolkien.
I've been using transformers since the pytorch_pretrained_bert days, and I've been trying my best to contribute and help when I can ever since! (In other NLP-related tools as well.)
I think I'm already in touch with most of you, but feel free to shoot me a message or connect on other platforms!
Good afternoon NLP bosses ("NLP bazen"), I am Bram as well, but from the UMC in Utrecht ;). We are mostly working on/interested in clinical language modeling, from NER/linking to the extraction of diagnoses from EHRs.
This might be of interest to the Dutch/Belgian NLP folks: we're hosting the Hugging Face webinar for ML Demo.cratization in Belgium tomorrow (30/6/2022). It is an online event that everyone can attend via Teams.
You can find more information on the website, and attendance links in the second paragraph.