Dimension problem

It would make sense if the “getting batch” function were transparent, but I guess it’s hidden behind this setup:

from transformers import TrainingArguments

epochs = 50
lr = 0.00006
batch_size = 2

hub_model_id = "nvidia/segformer-b0-finetuned-ade-512-512"

training_args = TrainingArguments(
    "segformer-b0-finetuned-ade20k-manggarai_rivergate",
    learning_rate=lr,
    num_train_epochs=epochs,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    save_total_limit=3,
    evaluation_strategy="steps",
    save_strategy="steps",
    save_steps=20,
    eval_steps=20,
    logging_steps=1,
    eval_accumulation_steps=5,
    load_best_model_at_end=True,
    push_to_hub=True,
    hub_model_id=hub_model_id,
    hub_strategy="end",
)
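
For what it’s worth, the batching itself happens inside the Trainer built from these arguments (its DataLoader plus data collator), so one way to make it less opaque is to pull a single batch out and inspect it. This is only a sketch; it assumes a Trainer instance named trainer has already been constructed from training_args and the processed datasets used elsewhere in this thread.

import torch

# Peek at one batch exactly as the Trainer will feed it to the model.
batch = next(iter(trainer.get_train_dataloader()))
print(batch["pixel_values"].shape)    # e.g. torch.Size([2, 3, 512, 512])
print(batch["labels"].shape)          # e.g. torch.Size([2, 512, 512])
print(torch.unique(batch["labels"]))  # the label ids the model is actually asked to predict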

If you look at the Stack Overflow link I gave before, I have actually spent three days of my life fixing that “out of bounds” looping over a List XD

Let me read that error; it’s probably indeed the batch problem…


It’s either a problem with the batch or a problem with the range of the iterator index. Probably the batch is the problem.

When I look up answers, it is actually implied that in this compute_metrics function

import torch
from torch import nn
import evaluate

metric = evaluate.load("mean_iou")

def compute_metrics(eval_pred):
    # id2label and feature_extractor are defined earlier in the training script
    with torch.no_grad():
        logits, labels = eval_pred
        logits_tensor = torch.from_numpy(logits)
        # scale the logits to the size of the label
        logits_tensor = nn.functional.interpolate(
            logits_tensor,
            size=labels.shape[-2:],
            mode="bilinear",
            align_corners=False,
        ).argmax(dim=1)

        pred_labels = logits_tensor.detach().cpu().numpy()
        # currently using _compute instead of compute
        # see this issue for more info: https://github.com/huggingface/evaluate/pull/328#issuecomment-1286866576
        metrics = metric._compute(
            predictions=pred_labels,
            references=labels,
            num_labels=len(id2label),
            ignore_index=0,
            reduce_labels=feature_extractor.reduce_labels,
        )

        # add per category metrics as individual key-value pairs
        per_category_accuracy = metrics.pop("per_category_accuracy").tolist()
        per_category_iou = metrics.pop("per_category_iou").tolist()

        metrics.update({f"accuracy_{id2label[i]}": v for i, v in enumerate(per_category_accuracy)})
        metrics.update({f"iou_{id2label[i]}": v for i, v in enumerate(per_category_iou)})

        return metrics

the num_labels=len(id2label) argument might be relevant to the problem.

It actually makes sense, since I made my own id2label.json. Instinctively I added {0: "water", 1: "not_water"} because I thought the transformer would know that the dark (black) parts of the mask are 0 and the white parts are 1. Was that not the case?

After all this, we’re back to id2label again…
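
A note on the question above: if the pixel values of the mask (0 for black, 255 for white) are used directly as class indices rather than being inferred, a remapping step like the one sketched below would be needed so that the values line up with id2label = {0: "water", 1: "not_water"}. The file name and threshold are placeholders, not from the thread.

import numpy as np
from PIL import Image

# "mask.png" and the 127 threshold are illustrative placeholders.
mask = np.array(Image.open("mask.png").convert("L"))
label_map = (mask > 127).astype(np.uint8)  # black -> 0 ("water"), white -> 1 ("not_water")

print(np.unique(mask))       # e.g. [  0 255] - raw pixel values, too large to index 2 labels
print(np.unique(label_map))  # [0 1] - valid indices into id2label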


I have changed id2label to the size the previous model expects, by copying its content from nvidia/segformer-b0-finetuned-ade-512-512. But now the out-of-bounds index is 241 instead of 11.

So:
a size-2 id2label gives an out-of-bounds index of 11
a size-150 id2label gives an out-of-bounds index of 241


If a variable seems suspicious, print it out for now: print(id2label)
If you can figure out what the difference is between the code and data that actually works correctly, you’ll automatically know what you need to do. In this case, we don’t know what that is yet…
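
In the same spirit, a minimal sanity check might look like the sketch below; train_ds and the "labels" key are placeholders for whatever the processed training dataset and its label field are actually called in the script.

import numpy as np

print(id2label, len(id2label))

# Every value in the masks (apart from an ignored index such as 255) must be
# smaller than len(id2label), otherwise lookups will go out of bounds.
labels = np.array(train_ds[0]["labels"])
print(np.unique(labels))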

Yes, I have printed both sizes. I think I’ll do some looking up first before asking about this again.

Seems like the previous problem is solved too. Thanks, John.

