AttributeError: module 'fsspec' has no attribute 'asyn'

kashif09 · June 16, 2022, 5:11pm

I am getting this error. I am using a BART model that I have already fine-tuned with Seq2SeqTrainer, with the output directory set to a path on my Google Drive. Now I am trying to resume from the last checkpoint using the 'resume_from_checkpoint' argument, but I get this error. Here is my code; the dataset I am using is an IterableDataset.

tokenized_datasets = tokenized_datasets.with_format("torch")
training_args = Seq2SeqTrainingArguments(
    output_dir="/content/gdrive/My Drive/Colab Notebooks/Code/models",
    evaluation_strategy="epoch",
    learning_rate=3e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=2,
    weight_decay=0.01,
    save_total_limit=1,
    num_train_epochs=5,
    predict_with_generate=True,
    fp16=True,
    save_strategy="epoch",
    metric_for_best_model="eval_rouge1",
    greater_is_better=True,
    seed=41,
    generation_max_length=max_target_length,
    max_steps=10000,
    load_best_model_at_end=True,
    resume_from_checkpoint="/content/gdrive/My Drive/Colab Notebooks/Code/models",
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3, early_stopping_threshold=0.0)],
)

trainer.train()

jacobbieker · July 1, 2022, 7:31am

Just chiming in that I am seeing the same issue with Datasets

psyche · July 2, 2022, 4:26am

I solved this problem by adding 'asyn' to the "__init__.py" of the 'fsspec' library, like this:

from . import asyn

__all__ = [
'asyn',
...
]

This is because the latest version of 'fsspec' doesn't allow direct access like "fsspec.asyn".
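
A less invasive variant of the same idea, sketched under the assumption that the submodule only needs to be loaded once before anything accesses "fsspec.asyn": in Python, importing a submodule explicitly binds it as an attribute of its parent package, so no installed file has to be edited. The helper name "ensure_submodule" is hypothetical, and this has not been checked against the library versions discussed in this thread.

```python
import importlib


def ensure_submodule(package: str, submodule: str):
    """Explicitly import `package.submodule` and return the parent package.

    Importing a submodule binds it as an attribute on its parent package,
    so `package.submodule` attribute access works afterwards.
    """
    importlib.import_module(f"{package}.{submodule}")
    return importlib.import_module(package)


if __name__ == "__main__":
    try:
        # Assumption: this runs before the code that touches fsspec.asyn.
        fsspec = ensure_submodule("fsspec", "asyn")
        print("fsspec.asyn available:", hasattr(fsspec, "asyn"))
    except ImportError as exc:
        print("fsspec (or fsspec.asyn) could not be imported:", exc)
```

In a notebook, the equivalent is a plain "import fsspec.asyn" at the top of the first cell, before the dataset is created.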

psyche · July 4, 2022, 12:16am

An alternative is to use "interleave_datasets" from the "datasets" library, like "interleave_datasets([train_dataset])". (Interleaving a single dataset gives the same data as the original dataset, but it behaves differently and works well.)

conceptofmind · July 5, 2022, 6:34pm

Hi,

I am also receiving this same error when streaming datasets.

Best,

Enrico

woqucc · July 14, 2022, 9:27am

You saved my life. Many many thanks for this god-like solution.

MUmarAmanat · July 27, 2023, 2:16pm

In my case, I installed a specific version of the datasets library:
pip install datasets==2.11.0
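
Before pinning, it can help to check which versions are actually installed in the runtime. This is a small sketch using only the standard library; the helper name "installed_version" is hypothetical, and which version combinations trigger the error is not something this thread establishes.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Print the versions of the two libraries discussed in this thread.
    for name in ("datasets", "fsspec"):
        print(name, installed_version(name))
```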
