Wav2vec2 for dementia screening based on spontaneous speech

shreyasgite · June 28, 2021, 7:51pm

wav2vec2 for dementia screening

Dementia is hard to diagnose, and there is no known cure, perhaps because it is too late by the time the diagnosis is confirmed. It is understood that the development of dementia begins 10 to 15 years before the symptoms first appear. Language, especially spontaneous speech, is a promising indicator (biomarker) for diagnosing dementia and other cognitive disorders.

2. Language

The model will be trained on English audio.

3. Model

As far as I know, wav2vec2 is the best candidate.

4. Datasets

DementiaBank provides a dementia classification dataset. However, it is a binary classification dataset: dementia versus no dementia.
To predict dementia 10 to 15 years before the onset of symptoms, we would need longitudinal data on individuals who develop dementia. The only such dataset I know of is the Framingham Heart Study, and it is text only.
I have been building a list of public figures diagnosed with dementia and scraping videos from YouTube into different categories: after symptoms, two years before symptoms, five years before symptoms, etc. Over the next week, I will build a Streamlit app to extract 8 to 10 second audio clips of the person of interest from each video.
Maybe we can use data from DementiaBank or other sources for the no-dementia class?
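The clip-extraction step described above could be sketched roughly as follows. This is only an illustration, not the planned Streamlit app: it assumes the audio track has already been downloaded and converted to an uncompressed PCM WAV file, and the function name `split_wav` and its parameters are hypothetical. It cuts the recording into consecutive fixed-length clips and does not try to isolate the person of interest from other speakers.

```python
import io
import wave


def split_wav(src, clip_seconds=8.0):
    """Split a PCM WAV source into consecutive fixed-length clips.

    `src` is a path or a binary file-like object. Returns a list of
    bytes objects, each one a complete WAV file. A trailing remainder
    shorter than `clip_seconds` is dropped.
    """
    clips = []
    with wave.open(src, "rb") as reader:
        channels = reader.getnchannels()
        sample_width = reader.getsampwidth()
        frame_rate = reader.getframerate()
        frames_per_clip = int(clip_seconds * frame_rate)
        if frames_per_clip <= 0:
            raise ValueError("clip_seconds must be positive")
        while True:
            frames = reader.readframes(frames_per_clip)
            frames_read = len(frames) // (sample_width * channels)
            if frames_read < frames_per_clip:
                break  # end of file, or a remainder that is too short
            buffer = io.BytesIO()
            with wave.open(buffer, "wb") as writer:
                writer.setnchannels(channels)
                writer.setsampwidth(sample_width)
                writer.setframerate(frame_rate)
                writer.writeframes(frames)
            clips.append(buffer.getvalue())
    return clips
```

Each returned clip could then be saved under a folder named after its category (for example "five years before symptoms") so that the label travels with the file.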

Possible links to publicly available datasets include:

DementiaBank
Google sheet with list of public figures with dementia and YouTube urls

5. Training scripts

TODO

6. Challenges

Is the dataset too small or too noisy?
No longitudinal dataset for the no-dementia class

7. Desired project outcome

A proof-of-concept Streamlit app showing that this works?

8. Reads

The following links can be helpful to better understand the project and
what has previously been done.

IBM efforts using Framingham data for dementia prediction

patrickvonplaten · June 29, 2021, 3:45pm

Really like this project! Hope more people will be interested

shreyasgite · June 29, 2021, 6:28pm

I am hoping for the same

cahya · June 29, 2021, 8:29pm

Actually I am interested in this project, but unfortunately I am already in other projects. Maybe next time

gkuwanto · June 30, 2021, 7:07am

Hi! Are you still looking for people to work with? I’m very interested in doing this project

shreyasgite · June 30, 2021, 8:06am

Yup, we are still open for people

gkuwanto · June 30, 2021, 10:06am

Cool can I join?

patrickvonplaten · June 30, 2021, 12:19pm

Awesome, two people is enough - let’s create the project!

Think this is a really cool & interesting project and we should be able to demo it well! Here I think it’ll actually make most sense to fine-tune a pretrained Wav2Vec2 model, maybe this one: facebook/wav2vec2-base-100k-voxpopuli?

There is currently no official fine-tuning script for Wav2Vec2, but it should be relatively easy to adapt the pretraining script: https://github.com/huggingface/transformers/pull/12271 or the official one in PyTorch.
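As a rough illustration of what fine-tuning for a two-class (dementia / no dementia) task might look like, here is a minimal PyTorch sketch. It is not the script referred to above, and several things are assumptions: it uses the `Wav2Vec2ForSequenceClassification` class from `transformers`, which is not mentioned in this thread, it uses PyTorch rather than Flax, and the helper names `build_classifier` and `train_step` are hypothetical. Without a checkpoint name it builds a tiny randomly initialised model so the code can be tried without downloading anything; real inputs would be 16 kHz waveforms normalised by the matching feature extractor, not random noise.

```python
import torch
from transformers import Wav2Vec2Config, Wav2Vec2ForSequenceClassification


def build_classifier(pretrained_name=None, num_labels=2):
    """Return a Wav2Vec2 encoder with a sequence-classification head.

    If `pretrained_name` is given (e.g. a Hub checkpoint), its weights
    are loaded; otherwise a tiny random model is built for smoke tests.
    """
    if pretrained_name is not None:
        return Wav2Vec2ForSequenceClassification.from_pretrained(
            pretrained_name, num_labels=num_labels
        )
    tiny_config = Wav2Vec2Config(
        hidden_size=32,
        num_hidden_layers=2,
        num_attention_heads=2,
        intermediate_size=64,
        conv_dim=(16, 16),
        conv_stride=(5, 2),
        conv_kernel=(10, 3),
        num_conv_pos_embeddings=16,
        num_conv_pos_embedding_groups=2,
        classifier_proj_size=16,
        num_labels=num_labels,
    )
    return Wav2Vec2ForSequenceClassification(tiny_config)


def train_step(model, optimizer, waveforms, labels):
    """Run one gradient step and return the cross-entropy loss."""
    model.train()
    optimizer.zero_grad()
    outputs = model(input_values=waveforms, labels=labels)
    outputs.loss.backward()
    optimizer.step()
    return outputs.loss.item()


# Smoke test on random data: two one-second "clips" at 16 kHz.
model = build_classifier()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
waveforms = torch.randn(2, 16000)
labels = torch.tensor([0, 1])  # 0 = no dementia, 1 = dementia (assumed coding)
loss = train_step(model, optimizer, waveforms, labels)
```

To try the checkpoint suggested above, one would call `build_classifier("facebook/wav2vec2-base-100k-voxpopuli")` instead, which requires network access and adds a freshly initialised classification head on top of the pretrained encoder.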

Let’s create the project - very excited about this one

mattbui · July 1, 2021, 2:55am

@shreyasgite cool project, let me know if I can give a hand. Is there a channel on discord for this project?

patrickvonplaten · July 1, 2021, 10:05am

Great, added you to the official project @mattbui

birgermoell · July 1, 2021, 10:43am

I would love to be a part of this project. I actually used wav2vec2 embeddings when participating in an Alzheimer’s dementia challenge. The wav2vec2 embeddings didn’t work so well for the task, but this might be because there wasn’t enough data. I’m more than happy to try this out again.

birgermoell · July 1, 2021, 10:55am

I made a discord channel

shreyasgite · July 1, 2021, 12:04pm

This is awesome @birgermoell and @mattbui

shreyasgite · July 1, 2021, 12:18pm

Thanks Patrick for setting the vector :)

asharma85 · July 8, 2021, 2:13am

Hey @shreyasgite I would love to join and contribute to the project.
If it is too late, can I follow the group’s progress? I really want to get going with a small project in the area.

How is the group coordinating?
Thanks

shreyasgite · July 8, 2021, 3:01pm

@asharma85 You can join the channel on Discord

asharma85 · July 8, 2021, 3:27pm

It keeps showing me this when I try to join on discord

johnpaulbin · July 8, 2021, 6:47pm

Also @shreyasgite, make sure you right-click on your server icon → invite and copy the link provided. You cannot access a Discord server via the link in your browser.

shreyasgite · July 8, 2021, 7:34pm

@asharma85 Flax-HuggingFace-Community-Week

junxtjx · August 5, 2021, 8:55pm

Hi @shreyasgite, if this project is still on-going I am also quite interested in it. The invite link has expired though.
