Hi folks,
I am trying to run the preprocessing code that was provided in Google Colab and got the error below. When I replaced the line `from transformers import AdamW, AutoTokenizer, AutoModelForSequenceClassification` with `from torch.optim import AdamW`, the error got resolved, but I wanted to understand the context behind it.
Error: ImportError: cannot import name 'AdamW' from 'transformers' (/usr/local/lib/python3.11/dist-packages/transformers/__init__.py)
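For reference, this is a minimal sketch of the swap I made; the checkpoint name, label count, and hyperparameters here are placeholders, not the ones from the course notebook:

```python
# transformers.AdamW was removed in recent releases, so take AdamW from PyTorch instead
# and keep the remaining imports from transformers.
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder checkpoint and hyperparameters, just to show the optimizer wiring.
checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

optimizer = AdamW(model.parameters(), lr=5e-5, weight_decay=0.01)
```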
Also, I got another error in the next cell for which I could not find a solution:
Error: ValueError: Invalid pattern: '**' can only be an entire path component
Kindly help me with this, thank you.
Here is the background information regarding the removal of AdamW from the Transformers library.
opened 05:51PM - 27 Mar 25 UTC
`transformers.AdamW` has been deprecated with a warning for some time and was removed in the latest version of the `transformers` package. It hasn't been necessary since an AdamW optimizer was added to torch. Please update the code to use the native torch version of AdamW.
You'll get an error like the following until that's changed:
```
Traceback (most recent call last):
File "<frozen runpy>", line 189, in _run_module_as_main
File "<frozen runpy>", line 112, in _get_module_details
File "C:\Users\test\anaconda3\envs\venv\Lib\site-packages\colbert\__init__.py", line 1, in <module>
from .trainer import Trainer
File "C:\Users\calebs\anaconda3\envs\aider\Lib\site-packages\colbert\trainer.py", line 5, in <module>
from colbert.training.training import train
File "C:\Users\test\anaconda3\envs\aider\Lib\site-packages\colbert\training\training.py", line 7, in <module>
from transformers import AdamW, get_linear_schedule_with_warmup
ImportError: cannot import name 'AdamW' from 'transformers' (C:\Users\test\anaconda3\envs\venv\Lib\site-packages\transformers\__init__.py)
```
Until then, users can run `pip install transformers==4.49.0` to continue using the colbert-ai package.
opened 08:32AM - 24 Mar 20 UTC · closed 05:52AM - 31 May 20 UTC · wontfix
# ❓ Question
I just noticed that the implementation of AdamW in HuggingFace is different from PyTorch's. The HuggingFace AdamW first applies the Adam update and then the weight decay. However, in the paper (Decoupled Weight Decay Regularization, link: https://arxiv.org/abs/1711.05101) and in the PyTorch implementation, AdamW first applies the weight decay and then the Adam update.
I was wondering if the two approaches are the same. Thanks! (In my opinion, they are not the same procedure.)
HuggingFace:
```python
for group in self.param_groups:
    for p in group["params"]:
        ...
        # Decay the first and second moment running average coefficient
        # In-place operations to update the averages at the same time
        exp_avg.mul_(beta1).add_(1.0 - beta1, grad)
        exp_avg_sq.mul_(beta2).addcmul_(1.0 - beta2, grad, grad)
        denom = exp_avg_sq.sqrt().add_(group["eps"])

        step_size = group["lr"]
        if group["correct_bias"]:  # No bias correction for Bert
            bias_correction1 = 1.0 - beta1 ** state["step"]
            bias_correction2 = 1.0 - beta2 ** state["step"]
            step_size = step_size * math.sqrt(bias_correction2) / bias_correction1

        p.data.addcdiv_(-step_size, exp_avg, denom)

        # Just adding the square of the weights to the loss function is *not*
        # the correct way of using L2 regularization/weight decay with Adam,
        # since that will interact with the m and v parameters in strange ways.
        #
        # Instead we want to decay the weights in a manner that doesn't interact
        # with the m/v parameters. This is equivalent to adding the square
        # of the weights to the loss with plain (non-momentum) SGD.
        # Add weight decay at the end (fixed version)
        if group["weight_decay"] > 0.0:
            p.data.add_(-group["lr"] * group["weight_decay"], p.data)
```
PyTorch:
```python
for group in self.param_groups:
    for p in group['params']:
        ...
        # Perform stepweight decay
        p.data.mul_(1 - group['lr'] * group['weight_decay'])

        exp_avg, exp_avg_sq = state['exp_avg'], state['exp_avg_sq']
        if amsgrad:
            max_exp_avg_sq = state['max_exp_avg_sq']
        beta1, beta2 = group['betas']

        state['step'] += 1
        bias_correction1 = 1 - beta1 ** state['step']
        bias_correction2 = 1 - beta2 ** state['step']

        # Decay the first and second moment running average coefficient
        exp_avg.mul_(beta1).add_(1 - beta1, grad)
        exp_avg_sq.mul_(beta2).addcmul_(1 - beta2, grad, grad)
        if amsgrad:
            # Maintains the maximum of all 2nd moment running avg. till now
            torch.max(max_exp_avg_sq, exp_avg_sq, out=max_exp_avg_sq)
            # Use the max. for normalizing running avg. of gradient
            denom = (max_exp_avg_sq.sqrt() / math.sqrt(bias_correction2)).add_(group['eps'])
        else:
            denom = (exp_avg_sq.sqrt() / math.sqrt(bias_correction2)).add_(group['eps'])

        step_size = group['lr'] / bias_correction1

        p.data.addcdiv_(-step_size, exp_avg, denom)
```
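To see how much the ordering matters, here is a rough single-step sketch (not from the issue; bias correction omitted, values arbitrary) for one scalar parameter:

```python
import math

# Arbitrary example values for one optimizer step on a single scalar parameter.
lr, wd, eps = 0.1, 0.01, 1e-8
beta1, beta2 = 0.9, 0.999
p0, grad = 1.0, 0.5

# Adam statistics after the first step are identical in both schemes.
exp_avg = (1 - beta1) * grad
exp_avg_sq = (1 - beta2) * grad * grad
adam_step = lr * exp_avg / (math.sqrt(exp_avg_sq) + eps)

# transformers-style: Adam update first, then decay the already-updated parameter.
p_hf = (p0 - adam_step) * (1 - lr * wd)

# torch.optim.AdamW-style: decay the parameter first, then apply the Adam update.
p_torch = p0 * (1 - lr * wd) - adam_step

# The two results differ by adam_step * lr * wd, so the procedures are close
# but not identical.
print(p_hf, p_torch, p_hf - p_torch)
```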
In the first cell:
!pip install -U datasets evaluate transformers==4.49.0 sentencepiece huggingface_hub fsspec
This change made it work; an older version of the `datasets` library was being used.
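If you want to double-check which versions actually ended up in the Colab runtime after that install, a quick sanity check (not part of the course code) could look like this:

```python
import importlib.metadata as md

# Print the installed version of each package touched by the install command above.
for pkg in ("transformers", "datasets", "evaluate", "huggingface_hub", "fsspec"):
    print(pkg, md.version(pkg))
```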
opened 01:46PM - 27 May 25 UTC · closed 01:26AM - 30 May 25 UTC
### Describe the bug
I have a dataset on HF [here](https://huggingface.co/datasets/kambale/luganda-english-parallel-corpus) that I've previously used to train a translation model [here](https://huggingface.co/kambale/pearl-11m-translate).
Now I changed a few hyperparameters to increase the number of tokens for the model, increase the number of Transformer layers, and so on.
However, when I try to load the dataset, this error keeps coming up. I have tried everything and re-written the code a hundred times, and it keeps coming up.
### Steps to reproduce the bug
Imports:
```bash
!pip install datasets huggingface_hub fsspec
```
Python code:
```python
from datasets import load_dataset

HF_DATASET_NAME = "kambale/luganda-english-parallel-corpus"

# Load the dataset
try:
    if not HF_DATASET_NAME or HF_DATASET_NAME == "YOUR_HF_DATASET_NAME":
        raise ValueError(
            "Please provide a valid Hugging Face dataset name."
        )

    dataset = load_dataset(HF_DATASET_NAME)
    # Omitted code as the error happens on the line above
except ValueError as ve:
    print(f"Configuration Error: {ve}")
    raise
except Exception as e:
    print(f"An error occurred while loading the dataset '{HF_DATASET_NAME}': {e}")
    raise e
```
Now, I have tried going through this [issue](https://github.com/huggingface/datasets/issues/6737) and nothing helps.
### Expected behavior
Loading the dataset successfully and performing splits (train, test, validation).
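For what it's worth, a rough sketch of that load-and-split flow, assuming the dataset loads and ships a single `train` split (the split fractions and seed here are arbitrary, not from the issue):

```python
from datasets import load_dataset

dataset = load_dataset("kambale/luganda-english-parallel-corpus")

# Carve out 10% for test, then 10% of the remainder for validation.
first = dataset["train"].train_test_split(test_size=0.1, seed=42)
second = first["train"].train_test_split(test_size=0.1, seed=42)

train_ds, val_ds, test_ds = second["train"], second["test"], first["test"]
print(len(train_ds), len(val_ds), len(test_ds))
```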
### Environment info
From the imports, I do not install specific versions of these libraries, so the latest available version is installed.
* `datasets` version: latest
* `Platform`: Google Colab
* `Hardware`: NVIDIA A100 GPU
* `Python` version: latest
* `huggingface_hub` version: latest
* `fsspec` version: latest
Thanks a lot, John! It helped.