Code runs faster on CPU?

Hi, I'm having trouble figuring out why my code (in particular, the loss function) runs faster on a CPU (~0.03 s) than on a GPU (~0.2 s).

Is there a way I can fix this?

Here is how I define my loss:

import torch
from time import perf_counter

def loss(X, u_r):

	t1 = perf_counter() ################################# For testing

	X.requires_grad = True
	# Per-sample Jacobian of u_r with respect to X: shape (N, d, d)
	u_r_X = torch.zeros(X.shape[0], X.shape[1], X.shape[1], device=device)
	for i in range(X.shape[0]):
		u_r_X[i, :, :] = torch.autograd.functional.jacobian(u_r, X[i][None, :], create_graph=True)[0, :, 0, :]

	energy_tensor = torch.zeros(X.shape[0], 1, device=device)
	for j in range(X.shape[0]):
		F = torch.eye(u_r_X[j].shape[0], device=device) + u_r_X[j]
		log_det = torch.log(torch.det(F))  # compute the determinant and its log once per sample
		energy_tensor[j] = 0.5 * (torch.sum(F**2) - 2.0) - log_det + 50.0 * log_det**2

	t2 = perf_counter() ################################# For testing
	print(t2 - t1, ' secs') ################################# For testing

	return torch.mean(energy_tensor)
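As an aside, CUDA kernels launch asynchronously, so wrapping GPU code in `perf_counter` alone can measure mostly launch overhead rather than actual compute. A minimal timing sketch with an explicit synchronization point (it falls back to the CPU when no GPU is present, in which case the synchronize call is skipped):

```python
import torch
from time import perf_counter

x = torch.randn(1000, 1000)
if torch.cuda.is_available():
    x = x.cuda()

t1 = perf_counter()
y = x @ x  # any GPU work to be timed
if torch.cuda.is_available():
    torch.cuda.synchronize()  # wait for queued kernels to finish before reading the clock
t2 = perf_counter()
print(t2 - t1, 'secs')
```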

Hi,

I guess this might be caused by the fact that you create new tensors inside the loss function, and they have to be allocated on the GPU on every call if your device is set to GPU.

Regards,
Unity05

Can I make this faster, or will this always be an issue?

You’ll have to try not to initialize a new tensor every time you call the loss function.
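Beyond avoiding the allocations, the bigger cost here is usually the per-sample Python loops, which launch many tiny GPU kernels. A sketch of a batched version using `torch.func.vmap` and `jacrev` (available in PyTorch ≥ 2.0) that drops both loops; `u_r` below is a hypothetical stand-in for the real displacement model and assumes it maps a single unbatched sample `(d,)` to `(d,)` — if yours expects a batch dimension, wrap it as `lambda x: u_r(x[None])[0]`. The constant `2.0` is kept from the original energy expression:

```python
import torch
from torch.func import jacrev, vmap

def u_r(x):
    # hypothetical stand-in for the real displacement model: (d,) -> (d,)
    return torch.tanh(x)

def loss(X, u_r):
    # All per-sample Jacobians in one batched call instead of a Python loop: (N, d, d)
    u_r_X = vmap(jacrev(u_r))(X)
    d = X.shape[1]
    F = torch.eye(d, device=X.device) + u_r_X   # deformation gradient, (N, d, d)
    log_det = torch.log(torch.det(F))           # batched determinant: (N,)
    energy = 0.5 * (F.pow(2).sum(dim=(1, 2)) - 2.0) - log_det + 50.0 * log_det**2
    return energy.mean()

X = torch.randn(8, 2)
print(loss(X, u_r))
```

Since everything is expressed as a few large batched operations, the GPU sees a handful of kernel launches instead of one per sample, and no intermediate tensors need to be preallocated at all.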