Online training options

Hi,

I’m not sure I correctly understand the options for training a model online with Hugging Face.

The most obvious option is AutoTrain. However, this is not an option for us, as instance segmentation is not supported.

I’ve had suggestions to use Spaces. However, according to the documentation, this does not seem to be officially supported. Maybe it is more of a hack?

I’ve also had suggestions to use the Inference API. That does not seem intuitive, since it is called “Inference”, but, after all, it supports GPUs.

The most obvious option I see in the documentation would be the Amazon SageMaker integration. It’s a bit of overkill, but it may be a reliable option.

What is your process in such cases?

Regards

If you want to do it for free, you can use a free CPU Space on Hugging Face, but I think the more common method is to use the free tier of Google Colab. The Hugging Face course also uses that method…

Thanks. We have an Enterprise Hugging Face plan and we are OK with spending money. Also, the training time will exceed Google Colab’s maximum runtime. So I guess the solution is to use the Amazon SageMaker integration.

Then there are a lot of options. Trainer works in most environments as long as you have Python 3.10 and PyTorch, so I think you can run it on most cloud services, not just SageMaker.
For example, you could create a dummy GUI in Hugging Face Spaces and launch the training from it. On general-purpose cloud services, you don’t even need a GUI.

Well, when it comes to stability, I’m not sure. :sweat_smile: Trainer has an option to save checkpoints frequently, so if a run crashes, the quickest fix may be to set things up so that training resumes from the last saved checkpoint.
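To make the resume idea concrete, here is a rough sketch of how it could look. It assumes the usual Trainer layout, where checkpoints land in `output_dir` as folders named `checkpoint-<step>` (how often they are written is controlled by `save_steps` and `save_total_limit` in `TrainingArguments`). The helper names `find_last_checkpoint` and `train_with_resume` are made up for this example; transformers also ships its own `get_last_checkpoint` in `transformers.trainer_utils`, which you would probably use in practice.

```python
import os
import re
from typing import Optional

# Trainer writes checkpoints as "<output_dir>/checkpoint-<global_step>".
_CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)$")


def find_last_checkpoint(output_dir: str) -> Optional[str]:
    """Return the checkpoint folder with the highest step, or None if there is none.

    Hypothetical helper for illustration only.
    """
    if not os.path.isdir(output_dir):
        return None
    best_step, best_path = -1, None
    for name in os.listdir(output_dir):
        match = _CHECKPOINT_RE.match(name)
        path = os.path.join(output_dir, name)
        if match and os.path.isdir(path):
            step = int(match.group(1))
            if step > best_step:
                best_step, best_path = step, path
    return best_path


def train_with_resume(trainer, output_dir: str):
    """Start training, continuing from the newest checkpoint when one exists.

    `trainer` is expected to be an already configured transformers Trainer.
    Passing None for resume_from_checkpoint starts a fresh run.
    """
    checkpoint = find_last_checkpoint(output_dir)
    return trainer.train(resume_from_checkpoint=checkpoint)
```

If the training script always goes through something like `train_with_resume`, then restarting the job after a crash or a preempted instance just means running the same script again.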

Resume

For cost efficient training

Thanks a lot, we will try the SageMaker integration.