I’m having trouble fine-tuning a pretrained monolingual RoBERTa model (PhoBERT).
This is my custom dataset:
import torch
from torch.utils.data import Dataset, DataLoader

class VnParaDataset(Dataset):
    def __init__(self, encodings):
        # encodings is a dict with keys 'input_ids', 'token_type_ids', 'attention_mask'
        self.encodings = encodings

    def __len__(self):
        return len(self.encodings)

    def __getitem__(self, index):
        item = {key: torch.tensor(val[index]) for key, val in self.encodings.items()}
        return item
'input_ids', 'token_type_ids', and 'attention_mask' in encodings each have shape [5440, 193].
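For context, the encodings dict comes from tokenizing my texts with a Hugging Face tokenizer padded to a fixed length; the sketch below is illustrative rather than my exact preprocessing code (the toy sentences and the arguments are stand-ins):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("vinai/phobert-base")

# Stand-in texts; my real list has 5440 word-segmented Vietnamese sentences.
sentences = ["Tôi là sinh_viên .", "Hôm_nay trời rất đẹp ."]

encodings = tokenizer(
    sentences,
    padding="max_length",
    truncation=True,
    max_length=193,
    return_token_type_ids=True,  # so the dict also contains 'token_type_ids'
)
# encodings['input_ids'], encodings['token_type_ids'], and encodings['attention_mask']
# are each a list of len(sentences) lists of length 193.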
My DataLoader is:
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
However, when I loop through the batches, each batch has shape [3, 193] and len(train_loader) = 1:
for epoch in range(epochs):
    count = 0
    for batch in train_loader:
        print('Epoch {0} - Batch {1}'.format(epoch, count))
        optimizer.zero_grad()
        input_ids = batch['input_ids'].to(device)
        attention_mask = batch['attention_mask'].to(device)
        outputs = phobert(input_ids, attention_mask=attention_mask, token_type_ids=None)
        loss = outputs[0]
        loss.mean().backward()
        optimizer.step()
        count += 1
Shouldn't each batch have shape [32, 193], and shouldn't there be 5440 / 32 = 170 batches in train_loader?
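For reference, here is the quick arithmetic check I'm basing that expectation on (assuming the DataLoader default drop_last=False):

import math

num_samples = 5440
batch_size = 32
print(math.ceil(num_samples / batch_size))  # 170 -- 5440 divides evenly by 32
print(len(train_loader))                    # but this prints 1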