ValueError: Input sequence length (100) doesn't match model configuration (32)

Traceback (most recent call last):
  File "/home/jerone/Documents/patch tst/main.py", line 117, in <module>
    main()
  File "/home/jerone/Documents/patch tst/main.py", line 90, in main
    output = model(X_train_patches)
  File "/home/jerone/Documents/patch tst/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/jerone/Documents/patch tst/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jerone/Documents/patch tst/venv/lib/python3.10/site-packages/transformers/models/patchtst/modeling_patchtst.py", line 1232, in forward
    patched_values = self.patchifier(scaled_past_values)
  File "/home/jerone/Documents/patch tst/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/jerone/Documents/patch tst/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jerone/Documents/patch tst/venv/lib/python3.10/site-packages/transformers/models/patchtst/modeling_patchtst.py", line 380, in forward
    raise ValueError(
ValueError: Input sequence length (100) doesn't match model configuration (32).

hi @jeroneas
Can you please share the script or relevant code snippet?

import pandas as pd
import numpy as np
import torch
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_error
from transformers import PatchTSTConfig, PatchTSTModel

# Define the pad_data function

def pad_data(tensor, patch_size, input_size):
    num_elements = tensor.numel()
    expected_elements = patch_size * input_size
    if num_elements % expected_elements == 0:
        return tensor
    pad_size = expected_elements - (num_elements % expected_elements)

    # Reshape tensor to 1D, pad, and reshape back
    tensor_1d = tensor.flatten()
    padded_tensor = torch.cat([tensor_1d, torch.zeros(pad_size, dtype=tensor.dtype)])
    return padded_tensor.reshape(-1, patch_size, input_size)

def main():
    # Load the local dataset
    file_path = "/home/jerone/python AI project/PatchTST-main/PatchTST_supervised/dataset/mac_dataset_with_date.xls"

    # Load the Excel file using pandas
    data = pd.read_excel(file_path)

    # Preprocess the data
    features = data.drop(columns=['DO'])  # Drop the target column (DO)
    target = data['DO']

    # Split the data into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(features, target, test_size=0.2, random_state=42)

    # Standardize the feature data
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)  # Scaling only numeric columns
    X_test = scaler.transform(X_test)

    # Convert data to PyTorch tensors
    X_train_tensor = torch.tensor(X_train, dtype=torch.float32)
    X_test_tensor = torch.tensor(X_test, dtype=torch.float32)
    y_train_tensor = torch.tensor(y_train.values, dtype=torch.float32)
    y_test_tensor = torch.tensor(y_test.values, dtype=torch.float32)

    # Define PatchTST configuration
    input_size = X_train_tensor.shape[1]  # Input size (number of features)
    patch_size = 50  # Example patch size from factors (change as needed)

    # Ensure reshaping is valid
    num_elements = X_train_tensor.numel()
    expected_elements = patch_size * input_size
    if num_elements % expected_elements != 0:
        # Use padding to fit the required dimensions
        X_train_tensor = pad_data(X_train_tensor, patch_size, input_size)
        X_test_tensor = pad_data(X_test_tensor, patch_size, input_size)

    # Initialize the model
    config = PatchTSTConfig(
        input_size=input_size,
        horizon=1,
        patch_size=patch_size,
        n_heads=8,
        d_model=128,
        num_layers=4,
        dropout=0.1,
        lr=0.001,
        batch_size=50,
        epochs=50,
    )
    model = PatchTSTModel(config)

    # Reshape the input into patches with the correct shape
    X_train_patches = X_train_tensor.reshape(-1, patch_size, input_size)
    X_test_patches = X_test_tensor.reshape(-1, patch_size, input_size)

    print(f"X_train_patches shape: {X_train_patches.shape}")  # Ensure it is [batch_size, patch_size, input_size]

    # Define optimizer and loss function
    optimizer = torch.optim.Adam(model.parameters(), lr=config.lr)
    loss_fn = torch.nn.MSELoss()

    # Training loop
    for epoch in range(config.epochs):
        model.train()
        optimizer.zero_grad()

        # Forward pass
        output = model(X_train_patches)
        loss = loss_fn(output.squeeze(), y_train_tensor[:output.size(0)])  # Match the target size

        # Backward pass
        loss.backward()
        optimizer.step()

        print(f'Epoch {epoch+1}/{config.epochs}, Loss: {loss.item()}')

    # Evaluation phase
    model.eval()
    with torch.no_grad():
        predictions = model(X_test_patches)

    predictions = predictions.squeeze().numpy()
    y_test = y_test_tensor[:predictions.shape[0]].numpy()

    # Evaluation metrics: RMSE, MAE, MAPE
    rmse = mean_squared_error(y_test, predictions, squared=False)
    mae = mean_absolute_error(y_test, predictions)
    mape = np.mean(np.abs((y_test - predictions) / y_test)) * 100

    print(f'RMSE: {rmse}')
    print(f'MAE: {mae}')
    print(f'MAPE: {mape}%')


if __name__ == "__main__":
    main()

hi @mahmutc, can you please help me?

Possibly relevant, though models like this are notoriously difficult to debug. The line output = model(X_train_patches) would seem to suggest that you are giving the model all of the patches (which seems to be your full dataset) in one go.

In any case, the road to debugging this is:

1. Identify who is expecting 32 inputs.
2. Identify who is giving 100 inputs.
3. Decide who is right. Fix it.

Actually, looking at the config for the model: PatchTST (huggingface.co)

Looks like it is configured to take a context length of 32. I wonder if you are supplying a sequence length of 100 in your example while the model can only handle 32.
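
A quick way to confirm is to print both numbers next to each other. This is just a sketch that assumes the model and X_train_patches variables from your script; PatchTSTConfig defaults to context_length=32, and the model checks that against the middle dimension of the (batch_size, sequence_length, num_input_channels) tensor it receives:

# Sanity check (sketch): compare what the model expects with what it is given.
# `model` and `X_train_patches` are assumed to be the variables from the posted script.
print("model expects context_length =", model.config.context_length)  # defaults to 32
print("sequence length being passed =", X_train_patches.shape[1])     # middle dim of (batch, seq_len, channels)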

@swtb We are currently learning about PatchTST, and we made some mistakes due to uncertainty regarding its configuration. I am unfamiliar with how to configure PatchTST correctly. Could you kindly provide a basic example or guide us through the setup?

Your help would be greatly appreciated.

Thanks in advance for your support.

hi @jeroneas
Sorry for the late reply. Did you read the related blog post: Patch Time Series Transformer in Hugging Face?

I’m not sure, but changing patch_size from 50 to 16 might be helpful for you.
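
For reference, here is a minimal sketch (not your original script; the feature count and lengths are placeholders) of how the pieces are meant to fit together: the sequence length you pass in has to equal context_length, the patching is controlled by patch_length and patch_stride inside the model, and training settings such as lr, batch_size and epochs belong in your training loop rather than in PatchTSTConfig:

import torch
from transformers import PatchTSTConfig, PatchTSTModel

# Placeholder numbers; adjust to your data.
context_length = 100   # time steps per input window
num_channels = 7       # features per time step (assumption)

config = PatchTSTConfig(
    num_input_channels=num_channels,
    context_length=context_length,  # must equal the sequence length you feed in
    patch_length=16,                # e.g. 16, as suggested above
    patch_stride=16,
    d_model=128,
    num_attention_heads=8,
    num_hidden_layers=4,
    dropout=0.1,
)
model = PatchTSTModel(config)

# past_values must have shape (batch_size, context_length, num_input_channels);
# the model creates the patches internally, so no manual reshape into patches is needed.
past_values = torch.randn(4, context_length, num_channels)
outputs = model(past_values=past_values)
print(outputs.last_hidden_state.shape)

If the goal is to predict DO one step ahead, PatchTSTForPrediction with prediction_length=1 (or PatchTSTForRegression for a scalar target) is probably a better fit than the bare PatchTSTModel, since those heads return an output you can train against directly.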