Connecting to hub account from Databricks

I’m running transformers in a Databricks notebook with a local dataset. The task is text classification with BERT and DistilBERT. I have no problem loading public checkpoints from the Hub and fine-tuning them. The problem comes when I want to push the model back to my account on the Hub. The cell

from huggingface_hub import notebook_login
notebook_login()

doesn’t allow me to log in. Figuring I needed to display HTML, I tried


but this gives me a Java NullPointerException. Has anybody succeeded in connecting to their account from a Databricks notebook?


Hi Alun!

Today I ran into the same challenge. Unfortunately, Hugging Face seems to lack proper documentation on how to log in to the Hugging Face Hub from within a Databricks notebook. I was unsuccessful with the currently documented huggingface-cli login, notebook_login() and HfApi.set_access_token(), but I did manage to log in and push models and datasets to the Hub through a hacky use of a method that is, or is about to be, deprecated.

Start by installing huggingface_hub:
%pip install huggingface_hub

Then install git-lfs:

curl -s | sudo bash
sudo apt-get install git-lfs

Then log in:

from huggingface_hub.commands.user import _login
from huggingface_hub import HfApi

api = HfApi()
_login(hf_api = api, username = "USER", password = "PASS")


git config --global credential.helper store
git config --global user.email "EMAIL"
git config --global user.name "FULLNAME"
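
If you would rather stay in a Python cell, the same git settings can be applied with subprocess. This is only a minimal sketch, assuming git is on the PATH of the driver node; the helper names git_config_commands and apply_git_config are made up for this example.

```python
import subprocess
from typing import List


def git_config_commands(email: str, full_name: str) -> List[List[str]]:
    """Build the git commands as argument lists, so no shell quoting is needed."""
    return [
        ["git", "config", "--global", "credential.helper", "store"],
        ["git", "config", "--global", "user.email", email],
        ["git", "config", "--global", "user.name", full_name],
    ]


def apply_git_config(email: str, full_name: str) -> None:
    """Run each command; check=True raises CalledProcessError if git fails."""
    for cmd in git_config_commands(email, full_name):
        subprocess.run(cmd, check=True)


# Example call with the same placeholders as above (uncomment to run):
# apply_git_config("EMAIL", "FULLNAME")
```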

This worked for me, though it took quite some trial and error to figure out. Pushing models still behaves a bit strangely because of occasional git errors.

I hope this helps!

It seems _login() has since been updated. You can still follow most of the login procedure above, but _login() now takes a token instead of a username/password combination:

from huggingface_hub.commands.user import _login
from huggingface_hub import HfApi

api = HfApi()
_login(hf_api = api, token = "TOKEN")

Tokens can be generated on the Access Tokens page of your Hugging Face profile.
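
As a possible alternative to the private _login helper, recent releases of huggingface_hub also export a public login() function that accepts a token. The sketch below assumes your installed version has it and that the token sits in an environment variable; the variable name HF_TOKEN and the helper names read_hub_token and login_with_token are choices made for this example, not part of the procedure above.

```python
import os


def read_hub_token(env_var: str = "HF_TOKEN") -> str:
    """Return the token stored in the given environment variable.

    Reading it from the environment keeps the token out of the notebook source.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} to a Hugging Face access token first.")
    return token


def login_with_token(token: str) -> None:
    """Log in through the public login() function instead of _login()."""
    # Imported lazily so read_hub_token() works even if the library is missing.
    # Assumption: the installed huggingface_hub release exports login().
    from huggingface_hub import login

    login(token=token)


# Example call (uncomment once the environment variable is set):
# login_with_token(read_hub_token())
```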

Thanks. This looks reasonable, but I haven’t been able to get it to work yet. I removed huggingface_hub from Databricks and reinstalled it from PyPI. Looking at the code on GitHub, I can see what you are doing, and token is a keyword argument (although hf_api isn’t).
_login(api, token="TOKEN")
However, I’m getting the same problem. I checked when the PyPI release was last updated, and it appears to be after the last change on GitHub. Just to be clear, I am using my actual token there, not the string “TOKEN”. My guess is that Databricks didn’t actually upgrade the library.
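
One way to test that guess is to ask the running interpreter which version it actually sees. A minimal sketch using only the standard library (Python 3.8 or later); the helper name installed_version is made up for this example.

```python
import sys
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Show which interpreter the notebook uses and which version it resolves.
print("interpreter:", sys.executable)
print("huggingface_hub:", installed_version("huggingface_hub"))
```

If the version printed here is older than the latest release on PyPI, the reinstall did not reach the environment the notebook is running in. Calling inspect.signature(_login) in the same cell would also show which parameters the installed _login() accepts.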