I wanted to do a text classification task using TensorFlow, so I ran:
from transformers import TFBertForSequenceClassification

model = TFBertForSequenceClassification.from_pretrained("bert-base-uncased")
but when I immediately checked GPU memory usage with nvidia-smi (before the above operation only about 2 MB were in use), I saw:
| 0 NVIDIA GeForce ... On | 00000000:01:00.0 Off | N/A |
| 30% 53C P2 128W / 350W | 22999MiB / 24265MiB | 0% Default |
Is it normal that it takes roughly 20 GB? From what I found online, a plain BERT-base model should only need a few GB. This makes it very prohibitive to train with a large batch size (I wanted to use BATCH_SIZE=64). Any idea why this happens, or how to tell TensorFlow not to reserve so much memory?
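I've since come across tf.config.experimental.set_memory_growth, which supposedly makes TensorFlow allocate GPU memory on demand instead of reserving almost all of it up front. Would something like this sketch be the right fix (assuming it has to run before anything touches the GPU)?

import tensorflow as tf

# Enable on-demand allocation; this must run before any op initializes the GPU,
# otherwise the near-full reservation has already happened.
for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)

from transformers import TFBertForSequenceClassification
model = TFBertForSequenceClassification.from_pretrained("bert-base-uncased")

Or is it better to hard-cap the usage with tf.config.set_logical_device_configuration and a memory_limit?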