I am trying to compare the time taken by two different models (ELMo vs. BERT) while predicting in named entity recognition. It involves copying the logits from GPU to CPU, which takes a different amount of time for each model.

This code snippet is the same for both models:
```python
logits = torch.argmax(F.log_softmax(logits, dim=2), dim=2)
batch_start_time = time.time()
logits = logits.detach().cpu().numpy()
print(logits.shape)
print("Batch time before loop :: %s seconds" % (time.time() - batch_start_time))
```
Output for ELMo:

```
(32, 128) Batch time before loop :: 0.0035903453826904297 seconds
```
Output for BERT:

```
(32, 128) Batch time before loop :: 0.1682131290435791 seconds
```
Why is there this difference, and are there any other factors that influence this time?
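One factor worth ruling out when measuring this yourself: CUDA kernels launch asynchronously, and `.cpu()` forces a synchronization, so a timer started just before the copy can also absorb whatever GPU work is still in flight (for example the model's forward pass, which is much heavier for BERT than for ELMo). A minimal sketch of timing the device-to-host copy in isolation by synchronizing first — the tensor shape and `num_tags=10` here are made up to match the printed `(32, 128)` output:

```python
import time

import torch
import torch.nn.functional as F


def time_gpu_to_cpu(logits):
    """Time only the device-to-host copy of the predicted labels.

    Without torch.cuda.synchronize(), the .cpu() call blocks until
    all previously queued kernels finish, so the measured interval
    includes pending GPU work, not just the transfer itself.
    """
    preds = torch.argmax(F.log_softmax(logits, dim=2), dim=2)
    if logits.is_cuda:
        torch.cuda.synchronize()  # drain any in-flight kernels first
    start = time.time()
    preds = preds.detach().cpu().numpy()
    elapsed = time.time() - start
    return preds, elapsed


# Hypothetical batch shaped like the question's output: (32, 128, num_tags)
device = "cuda" if torch.cuda.is_available() else "cpu"
logits = torch.randn(32, 128, 10, device=device)
preds, elapsed = time_gpu_to_cpu(logits)
print(preds.shape, elapsed)
```

If the gap between the two models shrinks dramatically after adding the synchronize, the difference was queued forward-pass work rather than the transfer itself.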