I have a setup something like this:
model = BERT()
model.to(device)
model.eval()
result = []
for data in batch:
    input_ids = data.to(device)
    token_ids = data.to(device)
    attention_mask = data.to(device)
    x = model(input_ids, token_ids, attention_mask)
    y = x.cpu().detach().numpy()
    result += covert_to_string(y)
I don't see the GPU being used, and I also checked with nvidia-smi. About 75% of the GPU memory is in use, but the GPU cores show no compute activity. What am I doing wrong?
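
For reference, here is a minimal sketch of how the device placement could be checked, assuming PyTorch. `TinyModel` is a hypothetical stand-in for the BERT model above, the input shape is made up, and the device falls back to CPU when CUDA is not available:

```python
import torch
import torch.nn as nn


# Hypothetical stand-in for the BERT model above; any nn.Module behaves the same way.
class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(8, 2)

    def forward(self, x):
        return self.linear(x)


# Fall back to CPU so the sketch also runs on a machine without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = TinyModel().to(device)
model.eval()

# Made-up input batch, moved to the same device as the model.
batch = torch.randn(4, 8).to(device)

# Confirm that the weights and the inputs live on the same device.
param_device = next(model.parameters()).device
print("CUDA available:", torch.cuda.is_available())
print("model device:", param_device)
print("input device:", batch.device)

with torch.no_grad():  # inference only, so no autograd graph is built
    out = model(batch)

if device.type == "cuda":
    torch.cuda.synchronize()  # wait for queued GPU kernels to finish

y = out.cpu().numpy()
print("output shape:", y.shape)
```

If both devices print as `cuda:0`, the forward pass is running on the GPU. Since nvidia-smi reports utilization sampled over a short window, brief kernels separated by CPU-bound steps (such as the conversion to strings) may show up as very low utilization.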