Non-determinism in VGG

Hello,
I am using VGG to classify the cells of a chessboard, but it happens to be non-deterministic. Is it how I am using it?
Function splitting the chessboard:

# Assumed reconstruction: the original snippet starts inside the function body,
# so the signature below is hypothetical. totensor, resize, topil and get_piece
# are defined elsewhere in the script.
from PIL import Image

def board_is_empty(path_to_image):
    image = Image.open(path_to_image).convert("RGB")
    # PIL reports (width, height) directly; the original len(np.array(image)[...])
    # calls computed the width twice
    width, height = image.size
    image = totensor(image.resize((int(round(width / 100)) * 224, int(round(height / 100)) * 224), Image.BILINEAR))
    # print(image.shape)
    # the tensor is (C, H, W): i walks over rows (height), j over columns (width)
    for i in range(224, (int(round(height / 100)) * 224) + 224, 224):  # get pieces of the picture
        for j in range(224, (int(round(width / 100)) * 224) + 224, 224):
            # plt.imshow(image[:3, i-224:i-10, j-224:j-10].float().permute(1, 2, 0).numpy())
            # plt.show()
            piece_try1 = get_piece(totensor(resize(topil(image[:3, i-224:i-10, j-224:j-10]))), False, "./pieces_detection/")
            piece_try2 = get_piece(totensor(resize(topil(image[:3, i-224:i-10, j-224:j-10]))), False, "./pieces_detection/")
            if piece_try1 == piece_try2:
                # print("first try: " + piece_try1)
                if piece_try1 != "empty_cell":
                    return False
            else:
                # the two predictions disagreed, so classify a third time
                piece_try3 = get_piece(totensor(resize(topil(image[:3, i-224:i-10, j-224:j-10]))), False, "./pieces_detection/")
                print("third try: " + piece_try3)
                if piece_try3 != "empty_cell":
                    return False
    return True

Classifying the piece:

def get_piece(cell, transf, folder):
    # Note: the dataset and the model are reloaded on every call; slow, but not
    # a source of randomness (ImageFolder with shuffle=False is deterministic)
    dsets = datasets.ImageFolder(folder + 'dataset/', transformation)
    dset_loaders = torch.utils.data.DataLoader(dsets, batch_size=12, shuffle=False)
    classes = dsets.classes
    model = torch.load(folder + "model.ckpt")
    if transf:
        input = torch.unsqueeze(transformation(cell), 0)
    else:
        input = torch.unsqueeze(cell, 0)
    res = model(Variable(input))
    _, preds = torch.max(res.data, 1)
    return classes[preds[0]]

The network training:

def train():
    model = models.vgg16(pretrained=True)
    resize = transforms.Resize((224, 224))
    dsets = datasets.ImageFolder('./dataset/', transformation)
    dset_loaders = torch.utils.data.DataLoader(dsets, batch_size=12, shuffle=False)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.5)
    for i in range(0, 200):  # 200 epochs
        lost = 0
        for data, labels in dset_loaders:
            optimizer.zero_grad()
            data = Variable(data)
            labels = Variable(labels)
            # forward
            outputs = model(data)
            _, preds = torch.max(outputs.data, 1)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            lost += loss.data[0]
        print("loss = " + str(lost / 24))  # 24 is presumably the number of batches per epoch
    return model

Any idea? :slight_smile:

Hi,

cuDNN is not deterministic by default, so if you use it, you should set torch.backends.cudnn.deterministic = True at the beginning of your script.

Well, I am not using CUDA here, I think, and even when using

torch.backends.cudnn.deterministic = True

I still get non-deterministic results.

Oh, on CPU.
I guess you double-checked that you set all the random seeds: Python's random, PyTorch, NumPy, and anything else you use (see the sketch below).
Do not use multiprocessing.
Otherwise the rest should be quite deterministic on CPU.
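Something like this at the top of the script, adapting the list to whatever libraries you actually use (seed_everything is just an illustrative name):

import random
import numpy as np
import torch

def seed_everything(seed=0):
    random.seed(seed)                 # Python's built-in RNG
    np.random.seed(seed)              # NumPy's global RNG
    torch.manual_seed(seed)           # PyTorch CPU RNG
    torch.cuda.manual_seed_all(seed)  # GPU RNGs; a no-op without CUDA
    torch.backends.cudnn.deterministic = True  # only relevant when cuDNN is used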

I agree about the random seeds, but here I use a pretrained network and don't use anything random during testing. My problem is with testing: there should not be any non-determinism there, but unfortunately there is :confused: .

I am not sure where it can come from then.
If it is just for testing, you can easily print statistics of the model after loading it to check that they are always the same. Then check that the dataloader content is the same. Then check that the forward of a given example is the same. If all of these match, the final testing accuracy should be the same every time.
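For example, something along these lines, reusing the names from your code above (dset_loaders, Variable, the checkpoint path); treat it as an illustration, not a drop-in test:

import torch
from torch.autograd import Variable

# 1) Model statistics: this number must be identical across runs.
model = torch.load("./pieces_detection/model.ckpt")
print(sum(p.data.sum() for p in model.parameters()))

# 2) Dataloader content: the first batch must be identical across runs.
data, labels = next(iter(dset_loaders))
print(data.sum(), labels.sum())

# 3) Forward pass: the same input must always give the same output.
out1 = model(Variable(data))
out2 = model(Variable(data))
print((out1.data - out2.data).abs().max())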

Well, up until the forward it is the same, but my forward "sometimes" outputs something else for the same input,
which shouldn't be the case. :confused:
I was wondering if there was any explanation in my code :confused: I am still puzzled.

Do you put the model in eval() mode? If I remember correctly, VGG contains dropout, no?

Oops, I thought there wasn't any dropout because I was only thinking about the features part and not the classifier. I am going to try with .eval().
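Something like this in get_piece, right after loading the checkpoint, I suppose:

model = torch.load(folder + "model.ckpt")
model.eval()  # switches the nn.Dropout layers in VGG's classifier to inference
              # mode, so repeated forwards on the same input agree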