How many GB of RAM do I need to train DBRX?

I heard it requires 320 GB of RAM. However, if I were to quantize the model (if that is possible), would I be able to get it below 100 GB?

In addition, how would I store the model so that I can keep my progress? Would I need to download it to my laptop somehow?

Methods and tools for efficient training on a single GPU

Quantisation will help as well. However, first exhaust the optimisations in the link above: FP16, gradient accumulation, gradient checkpointing and PEFT are all very good at cutting down on VRAM usage.
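To make one of these concrete, here is a minimal sketch of gradient accumulation. It uses a toy one-parameter model in plain Python, not DBRX or any training library, and every name in it is made up for illustration. It only shows why the trick saves memory without changing the result: several small micro-batches, each scaled by the number of accumulation steps, give the same gradient as one large batch.

```python
# Toy illustration of gradient accumulation (hypothetical names, not a real training loop).

def grad_mse(w, xs, ys):
    """Gradient of mean squared error w.r.t. w for the 1-D model y_hat = w * x."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n

# A "large" batch of 8 samples drawn from y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0 * x for x in xs]
w = 0.5

# Gradient computed on the whole batch at once (needs all 8 samples in memory).
full_grad = grad_mse(w, xs, ys)

# Same gradient built up from micro-batches of 2 samples each.
micro_batch_size = 2
accum_steps = len(xs) // micro_batch_size
accum_grad = 0.0
for i in range(accum_steps):
    lo = i * micro_batch_size
    hi = lo + micro_batch_size
    # Scale each micro-batch gradient so the sum equals the full-batch mean.
    accum_grad += grad_mse(w, xs[lo:hi], ys[lo:hi]) / accum_steps

print(full_grad, accum_grad)
```

In a real setup only one micro-batch of activations has to fit in VRAM at a time, and the optimizer step is taken once after the last micro-batch.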

The 4-bit version requires about 70 GB of RAM.
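As a rough sanity check on that number, here is a back-of-the-envelope sketch. It assumes DBRX's published total of 132B parameters and counts only the weights, so activations, gradients, optimizer state and quantisation overhead all come on top of it. The function name and the precision table are just for illustration.

```python
# Weight-only memory estimate; ignores activations, gradients and optimizer state.
DBRX_TOTAL_PARAMS = 132e9  # assumption: DBRX's published total parameter count

# Approximate storage cost per parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16/bf16": 2.0, "int8": 1.0, "4-bit": 0.5}

def weight_memory_gb(num_params, bytes_per_param):
    """Return the memory needed to hold the weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

estimates = {
    name: weight_memory_gb(DBRX_TOTAL_PARAMS, nbytes)
    for name, nbytes in BYTES_PER_PARAM.items()
}

for name, gb in estimates.items():
    print(f"{name:>10}: {gb:6.0f} GB")
```

The 4-bit weights alone come out just under the 70 GB mentioned above, and the 16-bit weights sit below the 320 GB from the question, which leaves some headroom for everything else. Full fine-tuning needs considerably more than the weights, which is why PEFT matters here.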