Hello, I am currently trying to progressively "widen" my network in a loop. The reason I use a loop is that I want to compare some properties of the output as the width increases. I can't think of anything simpler than the following. I create a network whose width is configurable from the start:
import torch
import torch.nn as nn
import torch.nn.functional as F

class onelayerNet(nn.Module):
    def __init__(self, scale):
        super().__init__()
        self.fc1 = nn.Linear(10, 10 * scale)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        return x
And then I have the following loop:
Net = {}
for i in range(iters):
    with torch.no_grad():
        scale = 10 * (i + 1)
        Net[i] = onelayerNet(scale=scale)
        if i > 0:  # copy the previous network's weights into the first rows
            index = 10 * 10 * i  # number of output features of the previous network
            Net[i].fc1.weight[0:index, :] = Net[i - 1].fc1.weight[:, :]
            Net[i].fc1.bias[0:index] = Net[i - 1].fc1.bias[:]  # carry the bias over as well
        Net[i].to(device)
        output = Net[i](images)
        # do some stuff to the output
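To double-check the index arithmetic: at iteration i, fc1.weight has shape (10*scale, 10) = (100*(i+1), 10), so index = 10*10*i selects exactly the rows the previous network occupied. A stand-alone sketch (prev and curr are stand-ins for Net[i-1] and Net[i] at i = 1):

```python
import torch
import torch.nn as nn

prev = nn.Linear(10, 10 * 10)       # i = 0: 100 output features
curr = nn.Linear(10, 10 * 10 * 2)   # i = 1: 200 output features

index = 10 * 10 * 1                 # rows occupied by the previous weights
with torch.no_grad():
    curr.weight[0:index, :] = prev.weight

# The first 100 rows of the wider layer now equal the smaller layer's weights.
assert torch.equal(curr.weight[:index], prev.weight)
```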
However, I can't find any way to remove the previous models from the GPU (or intermediate variables that change on each iteration, like 'output'). I've tried 'del Net[i-1]' after copying its weights, but the memory is not released on the GPU. I've also tried 'Net[i].to("cpu")' after computing the output, but that doesn't help either.
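Concretely, this is the kind of cleanup I mean, as a minimal stand-alone sketch (prev and output stand in for Net[i-1] and the per-iteration output; the gc.collect / empty_cache calls are my guess at what might be needed, not something the docs promised me):

```python
import gc
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

prev = nn.Linear(10, 100).to(device)           # stand-in for Net[i-1]
output = torch.randn(4, 100, device=device)    # stand-in for an intermediate

# Drop the Python references, force garbage collection, then ask the
# CUDA caching allocator to hand its unused blocks back to the driver.
del prev
del output
gc.collect()
if torch.cuda.is_available():
    torch.cuda.empty_cache()
```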
Is there a smarter way to do this so that I don't run into this issue? If not, is there a way to remove unneeded tensors and models from the GPU after they are used?