SOLVED: Module 'numpy' has no attribute 'object'. `np.object` was a deprecated alias for the builtin `object`. for train_dataset.map(tokenize, batched=True) in notebook

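This error points to a version mismatch between the datasets library and NumPy. NumPy deprecated the np.object alias in version 1.20 and removed it in 1.24, so any installed code that still refers to np.object (here, most likely an older datasets release) fails with this AttributeError when train_dataset.map(tokenize, batched=True) runs.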

The suggested solution you mentioned is to install version 2.15 of the datasets library. You can do this using the following command in your SageMaker notebook:

!pip install datasets==2.15

This installs the specified version of the datasets library, which should resolve the issue. If datasets was already imported in the current session, restart the notebook kernel so the new version is actually loaded, then run your code again.
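To confirm which versions the notebook kernel actually sees, a quick check along these lines may help. This is only a sketch: the helper names `installed_version` and `numpy_removed_object_alias` are made up for illustration, and the 1.24 cutoff reflects the NumPy release that removed the np.object alias.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def numpy_removed_object_alias(numpy_version):
    """Return True if this NumPy version no longer has the np.object alias (1.24+)."""
    # Compare only major.minor; this is enough to tell 1.23.x from 1.24.x and later.
    major, minor = (int(part) for part in numpy_version.split(".")[:2])
    return (major, minor) >= (1, 24)


# Hypothetical usage: print what is installed in the current kernel.
for name in ("numpy", "datasets"):
    print(name, installed_version(name))

numpy_version = installed_version("numpy")
if numpy_version is not None and numpy_removed_object_alias(numpy_version):
    print("This NumPy has no np.object; libraries that still use it will fail.")
```

Run this in a fresh cell after restarting the kernel. If the printed datasets version is not the one you installed, the pip command probably targeted a different environment than the one the kernel uses.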

Keep in mind that new library releases can introduce breaking changes or incompatibilities with other libraries, so pinning a version that is known to work in your environment can be crucial. If you run into further issues, check the release notes and documentation of the libraries involved, or ask in their community or support channels.
