Can we have some high-memory CPU instance choices?

I’m trying to test Mixtral 8x7B cheaply, but it doesn’t fit on the small CPU-only instances offered on Inference Endpoints or Spaces.

Sparse Mixture-of-Experts models seem like a good fit for high-memory CPU instances, since all ~47B parameters have to sit in RAM but only ~13B are active per token. Options like these would work well:

  • AWS x2gd.2xlarge (8 vCPUs, 128 GB RAM, ~$0.67/hr)
  • GCP e2-highmem-16 (16 vCPUs, 128 GB RAM, ~$0.72/hr)

Can we have access to some heftier CPU-only machines?
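
For concreteness, here is roughly the kind of test I’d like to run on such an instance. This is a minimal sketch assuming the public mistralai/Mixtral-8x7B-Instruct-v0.1 checkpoint and recent transformers/torch/accelerate releases; the ~93 GB figure is just a back-of-the-envelope estimate from the ~46.7B parameter count in bfloat16.

```python
# Minimal sketch of a CPU-only Mixtral test (assumes transformers + accelerate installed).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)

# ~46.7B parameters -> roughly 93 GB in bfloat16, so it should fit in 128 GB RAM,
# but has no chance on the current small CPU-only instances.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="cpu",
    low_cpu_mem_usage=True,
)

inputs = tokenizer("Sparse Mixture-of-Experts models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```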