French NLP - Introduction 🇫🇷

Bonjour à tous! :hugs: :fr: :croissant:

This is the introduction thread for French NLP enthusiasts. Let’s share experiences, models, datasets, research, or any topic related to NLP and la langue de Molière!
To get started, please introduce yourself with any of the following:

  • Your name, GitHub, Hugging Face, and/or Twitter handle
  • Some projects you are working on or interested in starting
  • Any potential directions you have in mind for the Hugging Face French community
  • Anything else you’d like to share!

E.g. I’m Thomas Vrancken (ThomasVrancken on GitHub, @VranckenThomas on Twitter - to make it original :stuck_out_tongue:). I hold an MSc in AI (Maastricht University), and I have experience in NLP research (Philips R&D) and data science consulting (YGroup). I’m now at the coolest ML consulting firm in Belgium - ML6!
I’m absolutely passionate about ML and NLP and pretty active in our NLP chapter. We do a lot of nice stuff: feel free to check out our bi-weekly tips-of-the-week (we’ve been doing them internally for a while and recently started publishing some, with more to come soon!). We have also trained non-English transformer models (currently Dutch and German, with FR on the backlog!).

Looking forward to meeting more FR-NLP enthusiasts!


Haha with all of us French people working at Hugging Face you’d think we would have started this thread earlier! “Les cordonniers sont les plus mal chaussés” :wink:

Welcome and thanks for starting this group!


Bonjour à vous !
Hi guys, thanks for the initiative!
Even though computer vision remains my favorite, NLP isn’t bad either :wink:
Welcome to the newcomers!
Cheers.

Hi guys,

I work in the R&D department of a manufacturing company.
I use NLP for different kinds of tasks: embeddings, automated translation.


Greetings from France 


Bonjour !

Je suis étudiant en France et je m’intéresse beaucoup aux applications des LLM dans le domaine de la santé ainsi que dans la gestion des bases de connaissances, les deux impliquant un certain niveau de confidentialité.

Mon objectif actuel est de rendre mes notes personnelles (cours, journal quotidien, notes de livres, etc.) disponibles à un LLM afin de pouvoir “discuter” avec elles : faire des recherches en langage naturel et non par mots-clés, faire des recherches de similitudes afin d’essayer de trouver des connexions là où je n’en vois pas spontanément, etc.
Apparemment, la technique la plus appropriée pour cela est le Retrieval-Augmented Generation (RAG)
(corrigez-moi si je me trompe :wink: )

J’ai testé plusieurs LLM Open Source et j’ai l’impression qu’aucun ne maîtrise vraiment la langue française. Certains ne parlent français que sous la contrainte et loin de moi l’envie de les faire souffrir :pleading_face:
Je vais donc me tourner vers Vigogne que je n’avais pas considéré pour l’instant.

Quelqu’un l’a-t-il déjà essayé ? Qu’en pensez-vous ?

Au plaisir de vous lire :smiley:


Hello !

I’m a student in France and I’m very interested in LLM applications in healthcare as well as in knowledge base management, both of which require a certain level of confidentiality.

My current goal is to make my personal notes (lectures, daily journal, book notes, etc.) available to an LLM so that I can “chat” with them: search in natural language rather than by keyword, run similarity searches to find connections I don’t spontaneously see, etc.
Apparently, the most appropriate technique for this is Retrieval-Augmented Generation (RAG).
(correct me if I’m wrong :wink: )
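
For anyone curious what the retrieval half of RAG looks like, here is a minimal sketch, not a real setup. It assumes a toy bag-of-words cosine similarity in place of a proper embedding model, and the names `NOTES`, `retrieve`, and `build_prompt` are hypothetical, made up for illustration. A real pipeline would swap `vectorize` for sentence embeddings and send the prompt to an LLM.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep runs of letters (accented letters included).
    return re.findall(r"[^\W\d_]+", text.lower())


def vectorize(text):
    # Toy stand-in for an embedding: a word-count vector.
    return Counter(tokenize(text))


def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, notes, k=2):
    # Rank notes by similarity to the query and keep the top k.
    q = vectorize(query)
    ranked = sorted(notes, key=lambda n: cosine(q, vectorize(n)), reverse=True)
    return ranked[:k]


def build_prompt(query, notes, k=2):
    # Put the retrieved notes in front of the question; this string is
    # what would be sent to the LLM (no model is called here).
    context = "\n".join(f"- {n}" for n in retrieve(query, notes, k))
    return f"Answer using only these notes:\n{context}\n\nQuestion: {query}"


# Hypothetical personal notes, invented for the example.
NOTES = [
    "Cours de biologie : la mitose divise une cellule en deux cellules filles.",
    "Journal : longue promenade au bord de la Seine ce matin.",
    "Notes de lecture : Molière critique les médecins dans Le Malade imaginaire.",
]

if __name__ == "__main__":
    print(build_prompt("Que dit Molière sur les médecins ?", NOTES, k=1))
```

Word overlap only finds notes that share exact words with the question, so the "connections I don't spontaneously see" part is where real embeddings would be needed.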

I’ve tested several open-source LLMs and I get the impression that none of them has really mastered French. Some only speak French under duress, and far be it from me to make them suffer :pleading_face:
So I’m going to turn to Vigogne, which I hadn’t considered until now.

Has anyone tried it? What do you think of it?

Looking forward to hearing from you :smiley:
