Chapter 8 questions

Use this topic for any question about Chapter 8 of the course.

I got “Notebook not found” and “Unable to download notebook.” errors for both the Colab and Studio Lab links on the “Debugging the training pipeline” page.

Thanks for reporting, @tomjam! Fixed in “Fix notebook link” by lewtun (Pull Request #378, huggingface/course on GitHub).

I think there’s a problem with the TensorFlow Colab notebook. In cell 2, the columns list passed to to_tf_dataset causes an error:

train_dataset = tokenized_datasets["train"].to_tf_dataset(
    columns=["input_ids", "labels"], batch_size=16, shuffle=True
)

validation_dataset = tokenized_datasets["validation_matched"].to_tf_dataset(
    columns=["input_ids", "labels"], batch_size=16, shuffle=True
)

When I ran the entire cell, I got this error:

ValueError                                Traceback (most recent call last)
<ipython-input-8-f48a238c4772> in <cell line: 20>()
     18 tokenized_datasets = raw_datasets.map(preprocess_function, batched=True)
     19 
---> 20 train_dataset = tokenized_datasets["train"].to_tf_dataset(
     21     columns=["input_ids", "labels"], batch_size=16, shuffle=True
     22 )

/usr/local/lib/python3.10/dist-packages/datasets/arrow_dataset.py in to_tf_dataset(self, batch_size, columns, shuffle, collate_fn, drop_remainder, collate_fn_args, label_cols, prefetch, num_workers, num_test_batches)
    474         for col in columns:
    475             if col not in output_signature:
--> 476                 raise ValueError(f"Column {col} not found in dataset!")
    477 
    478         for col in label_cols:

ValueError: Column labels not found in dataset!
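
For anyone hitting the same message, the check visible in the traceback can be sketched in plain Python. This is only an illustration under stated assumptions, not the actual datasets implementation: check_columns is a hypothetical helper that mirrors the loop shown above, and the list of available columns is assumed for the example (the real names can be read from the dataset's column_names).

```python
def check_columns(requested, available):
    """Mirror the loop from the traceback: every requested column must exist."""
    for col in requested:
        if col not in available:
            raise ValueError(f"Column {col} not found in dataset!")


# Assumed column names for the tokenized dataset (hypothetical, for illustration).
# Note that the label column is singular here.
available_columns = [
    "premise", "hypothesis", "label", "idx", "input_ids", "attention_mask",
]

try:
    # Same request as in the notebook cell: "labels" (plural) is not a column.
    check_columns(["input_ids", "labels"], available_columns)
    error_message = None
except ValueError as err:
    error_message = str(err)

print(error_message)

# Asking for the singular name passes the check without raising.
check_columns(["input_ids", "label"], available_columns)
```

The point of the sketch is only that the name passed in columns has to match a column that exists in the dataset exactly.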

However, I got past this by changing the column name from "labels" to "label". The cell then ran and produced the error message the chapter intends to demonstrate:

ValueError                                Traceback (most recent call last)
<ipython-input-9-3fc8d5924cd0> in <cell line: 32>()
     30 model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
     31 
---> 32 model.fit(train_dataset)

7 frames
/usr/local/lib/python3.10/dist-packages/transformers/modeling_tf_utils.py in if_body_11()
    222 
    223                     def if_body_11():
--> 224                         raise ag__.converted_call(ag__.ld(ValueError), ('Could not find label column(s) in input dict and no separate labels were provided!',), None, fscope)
    225 
    226                     def else_body_11():

ValueError: in user code:

    File "/usr/local/lib/python3.10/dist-packages/tf_keras/src/engine/training.py", line 1398, in train_function  *
        return step_function(self, iterator)
    File "/usr/local/lib/python3.10/dist-packages/tf_keras/src/engine/training.py", line 1370, in run_step  *
        outputs = model.train_step(data)
    File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_tf_utils.py", line 1661, in train_step  *
        raise ValueError("Could not find label column(s) in input dict and no separate labels were provided!")

    ValueError: Could not find label column(s) in input dict and no separate labels were provided!