How do I set up the Python environment for Hugging Face?

Hi everyone, I am Raghav Hinduja, a Switzerland-based IT professional. I’m setting up my Python environment for working with Hugging Face, and I’d love to hear how others in the community usually approach this.

I think a standard Python environment should be fine.

For compatibility, I recommend Python 3.10 to 3.12 and a recent stable PyTorch release that supports your GPU. Beyond that, the exact versions rarely make a significant difference.


What “a Python environment for Hugging Face” usually means

Most people in the community treat it as three layers:

  1. A per-project virtual environment (to avoid dependency conflicts). The official docs repeatedly recommend this. (Hugging Face)
  2. A deep learning backend (usually PyTorch) installed correctly for your hardware (CPU vs GPU). (PyTorch)
  3. The Hugging Face libraries you actually need (e.g., transformers, huggingface_hub, datasets, accelerate) plus auth + cache configuration (often the real source of “it works on my machine” differences). (Hugging Face)

Below is a “works for most people” setup, with clear branch points.


The community-standard setup (recommended baseline)

Step 1) Create a project and a virtual environment

Two common approaches:

A. uv (fast; shown in the Transformers install docs) (Hugging Face)

mkdir hf-project
cd hf-project
uv venv .env
# macOS/Linux
source .env/bin/activate
# Windows: .env\Scripts\activate

B. Standard venv (built-in Python; shown in Hub docs) (Hugging Face)

python -m venv .env
# macOS/Linux
source .env/bin/activate
# Windows
.env\Scripts\activate
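Either way, it’s worth confirming the environment is actually active before installing anything. This stdlib-only check is my own sketch (not from the HF docs) and works on both platforms:

```python
import sys

def in_virtualenv() -> bool:
    """True when running inside a venv/uv environment.

    Inside a virtual environment, sys.prefix points at the env
    directory while sys.base_prefix still points at the base
    interpreter; outside one, the two are equal.
    """
    return sys.prefix != sys.base_prefix

print("virtualenv active:", in_virtualenv())
```

If it prints `False` after you ran the activate script, you are probably installing into your system Python by mistake.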

Step 2) Install the backend first (PyTorch), matching your hardware

This is where many setups go wrong, especially with GPUs. The PyTorch install page explicitly tells you to pick OS + package manager + compute platform and use the generated command. (PyTorch)

  • CPU-only: choose “CPU” on the selector. (PyTorch)
  • NVIDIA GPU: choose the right CUDA build for your machine on the selector. (PyTorch)

Quick verification after installing torch:

python -c "import torch; print(torch.__version__); print('cuda?', torch.cuda.is_available())"

PyTorch’s own docs recommend torch.cuda.is_available() to verify GPU access. (PyTorch)
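In scripts, a common follow-on is to pick the device string once and reuse it everywhere. A minimal sketch (the guarded import is my addition so the snippet also runs where torch isn’t installed yet):

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, else "cpu"."""
    try:
        import torch
        if torch.cuda.is_available():
            return "cuda"
    except ImportError:
        pass  # torch not installed yet; fall back to CPU
    return "cpu"

print("using device:", pick_device())
```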


Step 3) Install the Hugging Face packages you need

A common “starter set”:

  • transformers (models + pipelines) (Hugging Face)
  • huggingface_hub (authentication, downloads, uploads, cache tooling) (Hugging Face)
  • datasets (dataset loading + caching) (Hugging Face)
  • accelerate (training launcher + config; useful even for single-GPU training) (Hugging Face)

Install:

# pip
pip install -U transformers huggingface_hub datasets accelerate
# or uv
uv pip install -U transformers huggingface_hub datasets accelerate
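After installing, it helps to record which versions actually landed in the environment (useful later when filing issues or pinning a requirements file). A small stdlib-only helper, my own sketch:

```python
from importlib import metadata

STARTER_SET = ("transformers", "huggingface_hub", "datasets", "accelerate")

def installed_versions(packages=STARTER_SET):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```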

Step 4) Authenticate (only needed for private/gated repos or uploads)

If you need access to gated/private models or want to push to the Hub:

  • Get a User Access Token (read or write scope depending on what you do). (Hugging Face)
  • Login:
hf auth login

Auth is described in the Hub Quickstart and token docs. (Hugging Face)

For servers/CI, people often set HF_TOKEN (environment variable) instead of interactive login; the env-var reference documents that HF_TOKEN overrides the stored token. (Hugging Face)
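The precedence the env-var reference describes (HF_TOKEN beats the token stored by `hf auth login`) can be made concrete with a sketch. The helper name and the file-lookup logic here are my illustration, not the library’s actual resolution code:

```python
import os
from typing import Optional

def resolve_token() -> Optional[str]:
    """Illustrative token lookup: env var first, then the stored token file."""
    env_token = os.environ.get("HF_TOKEN")
    if env_token:
        return env_token  # HF_TOKEN overrides any stored token
    # `hf auth login` writes the token under $HF_HOME (default ~/.cache/huggingface)
    hf_home = os.environ.get("HF_HOME", os.path.expanduser("~/.cache/huggingface"))
    token_file = os.path.join(hf_home, "token")
    if os.path.isfile(token_file):
        with open(token_file) as f:
            return f.read().strip()
    return None  # anonymous access; fine for public repos
```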


Step 5) Set up caching early (prevents disk and permission headaches)

By default, downloaded files go under ~/.cache/huggingface/…. datasets uses:

  • the Hub cache for downloaded source files, and
  • its own cache for processed Arrow data. (Hugging Face)

Recommended practice: set one root cache directory via HF_HOME (especially on servers, containers, shared machines). The Hub env-var docs define HF_HUB_CACHE as $HF_HOME/hub by default. (Hugging Face)

Example (macOS/Linux):

export HF_HOME="/path/to/writable/cache_root"
# optional fine-grained controls:
export HF_HUB_CACHE="/path/to/writable/hub_cache"
export HF_DATASETS_CACHE="/path/to/writable/datasets_cache"

Notes:

  • The datasets community has repeatedly discussed confusion around HF_DATASETS_CACHE vs the Hub cache; the key idea is that they affect different layers. (Hugging Face)
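The documented defaults can be made concrete with a small sketch of how the effective directories fall out of HF_HOME (the helper is illustrative; each library does its own resolution internally):

```python
import os

def effective_caches() -> dict:
    """Compute cache locations per the documented defaults:
    HF_HOME defaults to ~/.cache/huggingface, HF_HUB_CACHE to
    $HF_HOME/hub, and HF_DATASETS_CACHE to $HF_HOME/datasets.
    Explicit env vars override each level independently."""
    hf_home = os.environ.get("HF_HOME", os.path.expanduser("~/.cache/huggingface"))
    return {
        "HF_HOME": hf_home,
        "hub_cache": os.environ.get("HF_HUB_CACHE", os.path.join(hf_home, "hub")),
        "datasets_cache": os.environ.get(
            "HF_DATASETS_CACHE", os.path.join(hf_home, "datasets")
        ),
    }

print(effective_caches())
```

This is why setting HF_HOME alone usually suffices: both caches follow it unless you override them individually.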

Step 6) If you train/fine-tune: configure Accelerate once

This is the standard workflow:

accelerate config

Accelerate docs recommend running accelerate config before accelerate launch. (Hugging Face)


Step 7) Smoke test (confirms downloads + tokenizers + basic inference)

Transformers docs provide a simple pipeline test. (Hugging Face)

python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('hugging face is the best'))"

If you’re using NVIDIA GPUs, the Transformers install guide also suggests checking GPU visibility with:

nvidia-smi

(Hugging Face)


Common “your environment will break if…” pitfalls (and how people avoid them)

Pitfall 1) tokenizers fails to install (Rust compiler missing)

You’ll see: “Failed building wheel for tokenizers” / “Can not find Rust compiler.” This is common on Windows and minimal container images. (GitHub)

Typical community fixes:

  • Use a modern Python and installer tooling so you get prebuilt wheels (reduces source builds).
  • Install Rust if a source build is unavoidable (common workaround noted in long-running issues). (GitHub)
  • On Windows, many people prefer conda to avoid compilation friction.

Pitfall 2) Old tutorials break on cached_download

A lot of older notebooks import cached_download from huggingface_hub. That function was removed; modern code should use hf_hub_download. This shows up frequently in Stack Overflow answers and downstream project issues. (Stack Overflow)

Rule of thumb:

  • If a tutorial mentions cached_download, treat it as outdated and update code to hf_hub_download().
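When updating such a notebook, the old code typically holds a raw resolve-URL while hf_hub_download() wants repo_id + filename. A stdlib helper for splitting the URL (my own convenience function, not part of huggingface_hub):

```python
from urllib.parse import urlparse

def split_resolve_url(url: str) -> dict:
    """Split https://huggingface.co/<repo_id>/resolve/<revision>/<path>
    into the keyword arguments hf_hub_download expects."""
    parts = urlparse(url).path.strip("/").split("/")
    i = parts.index("resolve")
    return {
        "repo_id": "/".join(parts[:i]),
        "revision": parts[i + 1],
        "filename": "/".join(parts[i + 2:]),
    }

# Then: hf_hub_download(**split_resolve_url(old_url)) replaces cached_download(old_url).
```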

Pitfall 3) Cache variables changed (Transformers v5)

Community threads note TRANSFORMERS_CACHE is removed in Transformers v5 and the recommended approach is using HF_HOME / Hub cache variables instead. (Hugging Face Forums)


“Pick your path” quick recipes

A) Inference-only on CPU (simplest)

  1. Create env (uv/venv) (Hugging Face)
  2. Install CPU torch (PyTorch selector) (PyTorch)
  3. pip install transformers huggingface_hub (Hugging Face)
  4. Run pipeline smoke test (Hugging Face)

B) NVIDIA GPU inference or training

  1. Install correct NVIDIA driver
  2. Install torch with the correct CUDA build (PyTorch selector) (PyTorch)
  3. Install HF libs (transformers, accelerate, etc.) (Hugging Face)
  4. Verify torch.cuda.is_available() (PyTorch)

C) Server / container / shared machine

  1. Set HF_HOME to a writable, sufficiently large volume (Hugging Face)
  2. Keep per-project envs
  3. Prefer non-interactive auth via HF_TOKEN in CI (Hugging Face)

High-quality references to keep open

  • Transformers install + verification + GPU notes (Hugging Face)
  • Hub cache guide + env vars (HF_HOME, HF_HUB_CACHE, HF_TOKEN) (Hugging Face)
  • Datasets cache behavior (Hub cache vs datasets cache) (Hugging Face)
  • Accelerate install + accelerate config workflow (Hugging Face)
  • PyTorch install selector + verification (PyTorch)