ImportError: cannot import name 'load_dataset' from 'datasets' (unknown location)

Hey,

I am new to NLP and am working through the tutorial. I installed the transformers library and, after some trouble, everything worked out. Then I installed the datasets library, and the installation went fine (details at the end).

Now I’m trying to work with it in a Jupyter notebook. The line

import datasets

works out fine, but when I try

from datasets import load_dataset

I get the error from above. I looked around in this forum and also others and couldn’t find a solution.
I am using Python 3.9.12, PyTorch 1.12.0 and Transformers version 4.22.0.dev0.

Any help is appreciated!:slight_smile:

Details regarding installation:
I installed it from source using pip3 inside a virtual environment and put it in the directory from which I also work on my project. I checked the installation as suggested in the installation “tutorial”, and this also worked.

Hi!

It’s strange, your code is ok… Did you check if the virtual environment is enabled in the notebook? Can you copy the error log?

Hey, thanks rwheel!

I think I enabled the environment, as I included it with (.env is the name of the environment)

python -m ipykernel install --user --name=.env

and got the confirmation

Installed kernelspec .env in /home/username/.local/share/jupyter/kernels/.env

I’m sorry but what do you mean by error log? And where do I find it?

Sorry, I was referring to the error you get when you do from datasets import load_dataset

After adding the virtual environment to Jupyter, did you switch the notebook’s kernel to the created env before running the script?
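
One way to check this from inside the notebook is to print which interpreter the kernel is running and where Python would load datasets from. The snippet below is only a minimal sketch; the helper name describe_environment is made up for illustration. As far as I know, "(unknown location)" in an ImportError means the imported module has no __file__, which is typical when a plain folder named datasets (for example a cloned source checkout) is picked up as a namespace package instead of the installed library.

```python
import importlib.util
import sys


def describe_environment(module_name="datasets"):
    """Collect facts about the running interpreter and one module.

    Hypothetical helper for illustration; it only inspects, it changes nothing.
    """
    spec = importlib.util.find_spec(module_name)
    locations = None
    if spec is not None and spec.submodule_search_locations is not None:
        locations = list(spec.submodule_search_locations)
    return {
        # Interpreter the kernel is actually running.
        "executable": sys.executable,
        # True when running inside a venv-style virtual environment.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        # File the module would be loaded from (None if missing or a namespace package).
        "module_origin": None if spec is None else spec.origin,
        # Directories searched for the package's submodules, if it is a package.
        "search_locations": locations,
    }


info = describe_environment("datasets")
for key, value in info.items():
    print(f"{key}: {value}")
```

If executable does not point into the .env folder, the notebook is not using the environment. If module_origin is empty but search_locations points at a folder inside your project, that folder is probably being imported instead of the installed package.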

Yes, I did :/ I really can’t think of anything else to do haha. I uninstalled datasets and reinstalled it, and double-checked by listing all installed packages, but now even import datasets doesn’t work and I get the error ModuleNotFoundError: No module named 'datasets'

Is there anything I can do? Could it be a problem that I tried different environments and they are somehow affecting each other? Or could I try installing it outside of an environment?

It is very weird :expressionless:

I’ve just replicated the example in Google Colab and it works well. So I also think, as you say, that it could be a conflict between the environments… I use conda to create and manage my environments. Have you tried that tool?

PS: the code that I tried in Google Colab is:

! pip install datasets

from datasets import load_dataset
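
In a local notebook, ! pip install can run a different pip than the one that belongs to the kernel’s interpreter, so the package may land in another environment. A common workaround is to call pip through sys.executable. The sketch below only builds the command; the helper name pip_install_command is hypothetical, and actually running the install is left as a comment.

```python
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this kernel.

    Hypothetical helper for illustration; it does not run anything.
    """
    return [sys.executable, "-m", "pip", "install", package]


command = pip_install_command("datasets")
print(" ".join(command))

# To really install, one could run (needs network access):
#   import subprocess
#   subprocess.check_call(command)
```

After installing this way, restart the kernel before importing the package again.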

Hi
I had the same problem. I restarted the kernel and it was OK.

The same thing happens to me as to the original poster, even though I’ve installed everything.

I also ran into this problem today and fixed it on my computer. I found two possible causes.

The first is a path problem. By running the code

import sys
print(sys.path)

we may find that the output does not contain EnvPath\lib\site-packages, the directory where pip saves the installed packages. In that case, we need to add that directory to Python’s module search path.
For example, my Python installs packages to E:\program Files (x86)\Python 3.6\Lib\site-packages, while the interpreter only searches D:\Dev\Anaconda\envs\pytorch\lib\site-packages.
To solve this, go to D:\Dev\Anaconda\envs\pytorch\lib\site-packages, create a text file there and save the path E:\program Files (x86)\Python 3.6\Lib\site-packages in it. After saving the file, change its extension from .txt to .pth, for example mypath.pth. This may solve the problem.
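
The .pth mechanism can be tried out safely with throwaway folders before touching a real site-packages directory. This is only a sketch: the helper name write_pth_file is made up, the two temporary directories stand in for the real paths, and site.addsitedir is used here to process the file immediately (a real site-packages folder is processed automatically at interpreter startup).

```python
import os
import site
import sys
import tempfile


def write_pth_file(site_dir, extra_dir, pth_name="mypath.pth"):
    """Write a .pth file into site_dir containing the path extra_dir.

    Hypothetical helper for illustration.
    """
    pth_file = os.path.join(site_dir, pth_name)
    with open(pth_file, "w", encoding="utf-8") as handle:
        handle.write(extra_dir + "\n")
    return pth_file


# Stand-ins for the real directories (assumptions for this demo only).
site_dir = tempfile.mkdtemp()   # plays the role of ...\envs\pytorch\lib\site-packages
extra_dir = tempfile.mkdtemp()  # plays the role of the folder pip installed into

pth_file = write_pth_file(site_dir, extra_dir)

# Process site_dir as a site directory, which reads its .pth files.
site.addsitedir(site_dir)


def _normalize(path):
    return os.path.normcase(os.path.abspath(path))


extra_on_path = any(_normalize(p) == _normalize(extra_dir) for p in sys.path)
print("extra directory on sys.path:", extra_on_path)
```

Note that mixing packages from a different Python version (such as Python 3.6 packages in a newer environment) can cause other errors, so installing the package directly into the active environment is usually the cleaner fix.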

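The second is a filename problem. If your own script is named datasets.py, Python imports that file instead of the installed library, so load_dataset cannot be found. Rename the file to something else, like image_datasets.py.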
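
A quick way to look for such a name clash is to scan the project folder for a datasets.py file or a datasets folder. The snippet is a minimal sketch; find_shadowing_entries is a hypothetical helper, and the demo uses a temporary folder rather than a real project.

```python
import os
import tempfile


def find_shadowing_entries(module_name, search_dirs):
    """Return files or folders in search_dirs that could shadow module_name.

    Hypothetical helper for illustration; it only looks at the top level
    of each directory.
    """
    hits = []
    for directory in search_dirs:
        file_path = os.path.join(directory, module_name + ".py")
        dir_path = os.path.join(directory, module_name)
        if os.path.isfile(file_path):
            hits.append(file_path)
        if os.path.isdir(dir_path):
            hits.append(dir_path)
    return hits


# Demo: a temporary folder containing a clashing datasets.py file.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "datasets.py"), "w", encoding="utf-8") as handle:
    handle.write("# a local script that would shadow the library\n")

demo_hits = find_shadowing_entries("datasets", [demo_dir])
print(demo_hits)
```

In a real project one would call find_shadowing_entries("datasets", [os.getcwd()]). A folder hit is only suspicious if it sits in the working directory or another project folder, not inside site-packages.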

If the problem still exists, please provide more details so we can try to solve it together :slight_smile:

Sometimes restarting the kernel does the trick. It worked for me.