The following error message is confusing. If I have 22 GiB of total capacity and only 6 MiB free, how can I check where the rest of the memory is going?
OutOfMemoryError: CUDA out of memory. Tried to allocate 24.00 MiB (GPU 0; 21.99 GiB total capacity; 1.04 GiB already allocated; 6.12 MiB free; 1.18 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
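To see where the memory is actually going, is something like the following the right approach? (A minimal sketch using PyTorch's built-in memory queries; as I understand it, torch.cuda.mem_get_info() reports driver-level numbers, so memory held by other processes on the GPU should show up in the gap between those and PyTorch's own counters.)

import torch

# Driver-level view: free and total bytes on device 0 (includes other processes)
free_bytes, total_bytes = torch.cuda.mem_get_info(0)
print(f"free:  {free_bytes / 1024**3:.2f} GiB / total: {total_bytes / 1024**3:.2f} GiB")

# PyTorch's own view of this process
print(f"allocated: {torch.cuda.memory_allocated(0) / 1024**3:.2f} GiB")
print(f"reserved:  {torch.cuda.memory_reserved(0) / 1024**3:.2f} GiB")

# Detailed per-allocator breakdown
print(torch.cuda.memory_summary(device=0))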
Here is my code:
import torch
from transformers import BertModel

model = BertModel.from_pretrained('bert-base-uncased')
# Check for GPU availability and set the device accordingly
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)
# Put the model in evaluation mode (disables dropout)
model.eval()
batch_size = 50
# Ceiling division: number of batches needed to cover the whole dataset
num_batches = (len(tokenized_dataset["input_ids"]) + batch_size - 1) // batch_size
# Create an empty list to store the output
results = []
# Forward pass
with torch.no_grad():
    for i in range(num_batches):
        start = i * batch_size
        end = start + batch_size
        input_ids = torch.tensor(tokenized_dataset["input_ids"])[start:end].to(device)
        attention_mask = torch.tensor(tokenized_dataset["attention_mask"])[start:end].to(device)
        # Feed the batch into the model
        output = model(input_ids, attention_mask=attention_mask)
        # Try to free up some memory
        del input_ids
        del attention_mask
        torch.cuda.empty_cache()
        # Append the output to the results list
        results.append(output)
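In case it matters: I also wondered whether appending the raw model outputs keeps every batch's tensors alive on the GPU. A variant I'm considering (just a sketch, assuming last_hidden_state is all I need downstream) moves each batch's result to the CPU before storing it:

results = []
with torch.no_grad():
    for i in range(num_batches):
        start = i * batch_size
        end = start + batch_size
        input_ids = torch.tensor(tokenized_dataset["input_ids"])[start:end].to(device)
        attention_mask = torch.tensor(tokenized_dataset["attention_mask"])[start:end].to(device)
        output = model(input_ids, attention_mask=attention_mask)
        # Copy to CPU so the per-batch GPU tensors can actually be freed
        results.append(output.last_hidden_state.cpu())
        del input_ids, attention_mask, output
torch.cuda.empty_cache()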