Hey everyone, the title pretty much says it all. I am using a SageMaker notebook and my kernel is conda_pytorch_p310. Whenever I run train_dataset.map(tokenize, batched=True) I get the error message below. I tried troubleshooting this on my own and pip installing an older version of numpy, but I’m still not able to resolve it. Any input is appreciated.
AttributeError: module 'numpy' has no attribute 'object'.
`np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
Edit: I read through the Discord; the solution is to pip install datasets==2.15
It looks like you’ve run into an incompatibility between your installed version of the datasets library and your version of numpy. The error message indicates that some code, most likely in datasets, still references np.object, an alias that NumPy deprecated in 1.20 and removed in 1.24.
The solution you mentioned is to install version 2.15 of the datasets library. You can do this with the following command in your SageMaker notebook:
!pip install datasets==2.15
This will install the specified version of the datasets library, and it should help resolve the issue you’re facing. After installing, restart the kernel so the new version is actually loaded, then try running your code again.
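If you want to confirm which versions the kernel actually sees, here is a minimal sketch. The helper names installed_version and numpy_lacks_object_alias are made up for illustration; the only thing it relies on is that np.object was removed in NumPy 1.24.

```python
import re
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def numpy_lacks_object_alias(numpy_version):
    """Return True if this NumPy version no longer provides np.object (1.24+)."""
    match = re.match(r"(\d+)\.(\d+)", numpy_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {numpy_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (1, 24)


# Report what is installed in the current environment.
for name in ("numpy", "datasets"):
    print(name, installed_version(name))

np_version = installed_version("numpy")
if np_version is not None and numpy_lacks_object_alias(np_version):
    print("This NumPy has no np.object; code that still uses it raises AttributeError.")
```

If the printed datasets version is not the one you just installed, the kernel is probably still using the old import or a different environment than the one pip installed into.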
Keep in mind that library versions can sometimes introduce breaking changes or incompatibilities with other libraries, so specifying a version that is known to work with your current environment can be crucial. If you encounter any further issues or have specific requirements for library versions, you may need to check the documentation of the libraries you are using or seek further assistance from the community or support channels related to the specific libraries involved.