TAPAS inference-time memory utilization

I’m playing around with a fairly small table of about 10 rows and 12 columns, and I’m noticing that memory usage jumps to 1.2-1.4 GB. When I increase the table to 50 rows, it jumps to 10-14 GB. I’m not using any GPUs; everything runs on CPU. I’m using the base WTQ model, and I don’t want to drop down to a smaller model to solve this problem. I can live with slower response times if that means using less RAM.
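
For anyone who wants to reproduce these numbers, here is a minimal sketch of how peak memory could be measured around a single call, using only the Python standard library (Unix only). The helper names `peak_rss_mb` and `measure`, and the `stand_in_workload` function, are hypothetical and for illustration only; the actual TAPAS inference call would replace the stand-in.

```python
import resource
import sys


def peak_rss_mb() -> float:
    """Return the peak resident set size of this process in MB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    if sys.platform == "darwin":
        return peak / (1024 * 1024)
    return peak / 1024


def measure(fn, *args, **kwargs):
    """Run fn and return (result, peak RSS in MB observed after the call)."""
    result = fn(*args, **kwargs)
    return result, peak_rss_mb()


def stand_in_workload(n_rows: int, n_cols: int):
    # Hypothetical placeholder: swap in the real TAPAS inference call here.
    return [[str(r * c) for c in range(n_cols)] for r in range(n_rows)]


if __name__ == "__main__":
    for rows in (10, 50):
        table, peak = measure(stand_in_workload, rows, 12)
        print(f"{rows} rows x 12 cols: peak RSS so far {peak:.1f} MB")
```

Note that peak RSS is a high-water mark for the whole process, so it never decreases between calls; running each table size in a fresh process gives cleaner per-size numbers.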

Any guidance on what I can optimize to make efficient use of memory?

Best
Casa
