Chapter 3 questions

Hi, I used Python 3.11 with VS Code and tried to run the following code locally, but it was unsuccessful.
The same code runs fine on Colab. I don’t know how to solve this problem; any help would be appreciated. Thanks!

from datasets import load_dataset

raw_datasets = load_dataset("glue", "mrpc")
raw_datasets
Downloading and preparing dataset None/ax to file://C:/Users/qq575/.cache/huggingface/datasets/parquet/ax-1f514e37fba474dd/0.0.0/2a3b91fbd88a2c90d1dbbb32b460cf621d31bd5b05b934492fdef7d8d6f236ec...
Downloading data files: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 3/3 [00:00<00:00, 752.16it/s]
Extracting data files: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 3/3 [00:00<00:00, 150.19it/s] 
Traceback (most recent call last):
  File "e:\python_projects\learning ML\3.1.2.py", line 3, in <module>
    raw_datasets = load_dataset("glue", "mrpc")
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\python3_11\Lib\site-packages\datasets\load.py", line 1797, in load_dataset
    split_info = self.info.splits[split_generator.name]
                 ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\python3_11\Lib\site-packages\datasets\splits.py", line 530, in __getitem__
    instructions = make_file_instructions(
                   ^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\python3_11\Lib\site-packages\datasets\arrow_reader.py", line 112, in make_file_instructions
    name2filenames = {
                     ^
  File "D:\anaconda3\envs\python3_11\Lib\site-packages\datasets\arrow_reader.py", line 113, in <dictcomp>
    info.name: filenames_for_dataset_split(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\python3_11\Lib\site-packages\datasets\naming.py", line 70, in filenames_for_dataset_split
    prefix = filename_prefix_for_split(dataset_name, split)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\python3_11\Lib\site-packages\datasets\naming.py", line 54, in filename_prefix_for_split
    if os.path.basename(name) != name:
       ^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen ntpath>", line 244, in basename
  File "<frozen ntpath>", line 213, in split
TypeError: expected str, bytes or os.PathLike object, not NoneType

As another person said above, upgrading to datasets>=2.17.0 solved the problem.
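For anyone hitting this locally, here is a minimal sketch of how you might check the installed datasets version before calling load_dataset, so you get a clear upgrade hint instead of the NoneType traceback above. The 2.17.0 threshold comes from this thread; the helper names (version_tuple, datasets_is_recent_enough) are just illustrative, not part of any library API.

```python
# Sketch: verify the installed `datasets` version before loading GLUE/MRPC.
# Versions below 2.17.0 reportedly hit the TypeError shown in the traceback.
from importlib.metadata import PackageNotFoundError, version


def version_tuple(v: str) -> tuple:
    """Turn a version string like '2.17.0' into (2, 17, 0) for comparison."""
    return tuple(int(part) for part in v.split(".")[:3] if part.isdigit())


def datasets_is_recent_enough(minimum: str = "2.17.0") -> bool:
    """Return True if `datasets` is installed and at least `minimum`."""
    try:
        installed = version("datasets")
    except PackageNotFoundError:
        return False
    return version_tuple(installed) >= version_tuple(minimum)


if datasets_is_recent_enough():
    print("datasets is up to date; load_dataset('glue', 'mrpc') should work")
else:
    print("please upgrade: pip install -U 'datasets>=2.17.0'")
```

After upgrading (pip install -U "datasets>=2.17.0"), the original load_dataset("glue", "mrpc") call from the first post should succeed.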


Your solution is really helpful, thanks a lot!

Hi, I’m new to the community. Just a simple question: in the Chapter 3 “A full training” section, there is only a PyTorch version. Does this section also have a TensorFlow version like the other sections do? Thanks!

I’ve got the same error and could not find a solution.