Hello there,

I am a bit confused about what to send to the GPU.

Take a look at these routines:

```
import numpy as np
import torch
import torch.nn.functional as F

def train(model, train_loader, datasize, lr, weight_decay, alpha, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        model.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target) * datasize
        loss.backward()
        update_params(model, lr, weight_decay, alpha)

def update_params(model, lr, weight_decay, alpha):
    for p in model.parameters():
        if not hasattr(p, 'buf'):
            p.buf = torch.randn(p.size()).to(device) * np.sqrt(lr)
        d_p = p.grad.data
        d_p.add_(p.data, alpha=weight_decay)  # weight decay (keyword form; positional alpha is deprecated)
        eps = torch.randn(p.size()).to(device)
        buf_new = (1 - alpha) * p.buf - lr * d_p + (2.0 * lr * alpha) ** 0.5 * eps
        p.data.add_(buf_new)
        p.buf = buf_new
```

In the train routine, are the “output” and “loss” variables automatically created on the GPU?

In the “update” routine, are the “lr” and “alpha” constants automatically sent to the GPU as well?
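For context, here is how I have been trying to check this myself. This is only a minimal sketch with a toy `nn.Linear` model standing in for my real one (and a CPU fallback when no GPU is available), so I am not sure it proves anything:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 3).to(device)            # toy model in place of the real one
data = torch.randn(8, 4).to(device)
target = torch.randint(0, 3, (8,)).to(device)

output = F.log_softmax(model(data), dim=1)    # stand-in for model(data)
loss = F.nll_loss(output, target)

# Both tensors report the same device as the inputs they were computed from:
print(output.device, loss.device)

# lr and alpha are plain Python floats; they have no .device attribute at all:
lr, alpha = 0.1, 0.9
print(type(lr))  # <class 'float'>
```

At least on my machine, `output` and `loss` land on the same device as `data`, and the scalar hyperparameters stay ordinary Python numbers.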

What if I have a code snippet like this, where “self.model” is already on the GPU:

```
loss_train = np.zeros(self.epochs+1) # store loss
accu_train = np.zeros(self.epochs+1) # store accuracies
loss_test = np.zeros(self.epochs+1)
accu_test = np.zeros(self.epochs+1)
(loss_train[0], accu_train[0]) = self.model.evaluate(self.train_loader)
(loss_test[0], accu_test[0]) = self.model.evaluate(self.test_loader)
```

Will the loss and accu arrays be on GPU as well? Or will the results of the right-hand side be sent from GPU to CPU first?
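My current understanding (which may well be wrong) is that NumPy arrays always live in host (CPU) memory, so an assignment like the one above would force a device-to-host copy. A small sketch of what I mean, with a made-up scalar tensor in place of the value `evaluate()` would return:

```python
import numpy as np
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

loss_train = np.zeros(3)                       # NumPy array: always in CPU memory
gpu_loss = torch.tensor(0.25, device=device)   # pretend this came from evaluate()

# Writing into the NumPy array copies the value to the CPU;
# .item() makes that transfer explicit and returns a Python float:
loss_train[0] = gpu_loss.item()
print(loss_train[0])  # 0.25
```

Is that right, i.e. the arrays themselves never move to the GPU and only the scalar results get copied back?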

Cheers!