System Info
- transformers version: 4.30.0.dev0
- Platform: Linux-5.4.204-ql-generic-12.0-19-x86_64-with-glibc2.31
- Python version: 3.11.3
- Huggingface_hub version: 0.14.1
- Safetensors version: 0.3.1
- PyTorch version (GPU?): 2.0.1 (True)
Versions of relevant libraries:
[pip3] numpy==1.23.0
[pip3] torch==2.0.1
[pip3] torchaudio==2.0.2
[pip3] torchvision==0.15.2
[conda] blas 1.0 mkl
[conda] ffmpeg 4.3 hf484d3e_0 pytorch
[conda] mkl 2023.1.0 h6d00ec8_46342
[conda] mkl-service 2.4.0 py311h5eee18b_1
[conda] mkl_fft 1.3.6 py311ha02d727_1
[conda] mkl_random 1.2.2 py311ha02d727_1
[conda] numpy 1.23.0 pypi_0 pypi
[conda] pytorch 2.0.1 py3.11_cuda11.8_cudnn8.7.0_0 pytorch
[conda] pytorch-cuda 11.8 h7e8668a_5 pytorch
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] torchaudio 2.0.2 py311_cu118 pytorch
[conda] torchtriton 2.0.0 py311 pytorch
[conda] torchvision 0.15.2 py311_cu118 pytorch
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, …)
- My own task or dataset (give details below)
Reproduction
ERROR:
train_result = trainer.train(resume_from_checkpoint=checkpoint)
…
python3.11/site-packages/transformers/feature_extraction_sequence_utils.py", line 220, in pad
if value.dtype is np.dtype(np.float64):
^^^^^^^^^^^
AttributeError: 'str' object has no attribute 'dtype'
I am not sure which element of the dataset is being read as a 'str'.
1. OFFICIAL SCRIPT: transformers/examples/pytorch/audio-classification/run_audio_classification.py
2. LOADED DATASET:
DatasetDict({
    train: Dataset({
        features: ['audio', 'label'],
        num_rows: 1280
    })
    validation: Dataset({
        features: ['audio', 'label'],
        num_rows: 160
    })
    test: Dataset({
        features: ['audio', 'label'],
        num_rows: 160
    })
})
3. logger.info(raw_datasets['train'][0])
{'audio': {'path': '/transformers/examples/pytorch/audio-classification/s/data/s/s/train/audio1.wav', 'array': array([0.02072144, 0.02767944, 0.03274536, …, 0.00079346, 0.00088501,
0.00149536]), 'sampling_rate': 16000}, 'label': 'happy'}
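To narrow down which element is a 'str', a small diagnostic like the one below could be run on one example. This is only a sketch: `iter_str_fields` is a hypothetical helper, not part of transformers or datasets, and `example` is a shortened stand-in for `raw_datasets['train'][0]` with a plain list in place of the NumPy array. On the logged example it would flag both `audio.path` and `label` as strings. Which of the two actually reaches `pad` is not established here.

```python
from typing import Any, Iterator, Tuple


def iter_str_fields(value: Any, path: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (dotted_path, value) for every str found in a nested dict.

    Hypothetical helper for debugging; not a transformers/datasets API.
    """
    if isinstance(value, str):
        yield path, value
    elif isinstance(value, dict):
        for key, item in value.items():
            child = f"{path}.{key}" if path else str(key)
            yield from iter_str_fields(item, child)


# Stand-in for raw_datasets['train'][0]; path and array are shortened.
example = {
    "audio": {
        "path": "audio1.wav",
        "array": [0.02072144, 0.02767944, 0.03274536],
        "sampling_rate": 16000,
    },
    "label": "happy",
}

# Collect every string-valued field so it can be checked against
# what the training pipeline expects.
str_fields = dict(iter_str_fields(example))
print(str_fields)
```

In the real script the same helper could be called on `raw_datasets['train'][0]` directly, since NumPy arrays and integers are simply skipped.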
@mariosasko any idea about this?
Expected behavior
The dataset should be loaded into the model and training should run when calling train_result = trainer.train(resume_from_checkpoint=checkpoint).