"Seeking Advice: Storing High-Volume Sign Language Dataset for ChatDEAF"

Hello Hugging Face Community,

I’m currently working on a project called ChatDEAF, an AI-based system designed to understand and support sign language communication, especially for deaf mothers and their hearing children.

One of our major challenges is storing large volumes of sign language data, where each “word” carries several data points:

handshape

position

facial expression

movement

speed & rhythm

This makes sign language datasets roughly 5–10x larger than comparable text-based ones.
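To make the per-word structure concrete, here is a rough sketch of what one record might look like. Everything in it is a placeholder made up for illustration (the `SignRecord` name, the field names, the handshape code, the `video_uri` path), not our actual schema. The idea is simply that the small annotations can be stored as one JSON line per sign, while the heavy raw video is stored separately and only referenced.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List


@dataclass
class SignRecord:
    """One signed "word". All field names and value formats are illustrative."""
    gloss: str                  # label for the sign, e.g. "MOTHER"
    handshape: str              # hypothetical handshape code
    position: str               # where the sign is made relative to the body
    facial_expression: str      # non-manual marker accompanying the sign
    # Per-frame (x, y, z) of a single tracked point; real data would track many points.
    movement: List[List[float]] = field(default_factory=list)
    fps: float = 30.0           # frame rate, so speed and rhythm can be derived
    video_uri: str = ""         # pointer to the raw clip, stored separately


def to_jsonl_line(record: SignRecord) -> str:
    """Serialize one record as a single JSON line."""
    return json.dumps(asdict(record), ensure_ascii=False)


def duration_seconds(record: SignRecord) -> float:
    """Length of the sign in seconds: number of frames divided by frame rate."""
    return len(record.movement) / record.fps


example = SignRecord(
    gloss="MOTHER",
    handshape="5",
    position="chin",
    facial_expression="neutral",
    movement=[[0.0, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.2, 0.0]],
    fps=30.0,
    video_uri="clips/mother_0001.mp4",  # hypothetical path
)
print(to_jsonl_line(example))
print(duration_seconds(example))
```

Keeping the annotations apart from the video like this would let the two be backed up and tiered differently, which is part of what we are asking about below.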

We’re concerned that ordinary SSDs or standard cloud storage may not be sufficient in the long run.

What would you recommend for:

Efficient video + gesture data storage

Long-term backup

Integration with AI model training

Should we explore AWS S3, Azure Blob, Backblaze, or custom solutions?

We’d love to hear your thoughts — especially from those experienced with video + motion dataset storage.

Thank you for your support!

— Yasin Şimşek
ChatDEAF Founder & Deaf Data Architect
