I'm experimenting with a CNN model using the following pipeline:
init model -> K-fold CV with 60% data A -> validate with 20% data B -> train with 80% data A+B -> test with 20% data C
I initialize the model once per experiment. Then, inside the CV process, I deepcopy the model:

```python
net_cv = deepcopy(model)
```

And for the final training (which uses 80% data A+B), I deepcopy the model again:

```python
net_app = deepcopy(model)
```

So net_cv and net_app should be independent. I did this so that both models have the same initial weights.
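To make the intent concrete, here is a minimal sketch (the toy architecture is illustrative, not my actual model) showing that the two deepcopies start from identical weights but are independent objects:

```python
from copy import deepcopy

import torch
import torch.nn as nn

torch.manual_seed(11317)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

net_cv = deepcopy(model)   # used inside the CV loop
net_app = deepcopy(model)  # used for the final 80% training

# Both copies start from the same initial weights:
for p_cv, p_app in zip(net_cv.parameters(), net_app.parameters()):
    assert torch.equal(p_cv, p_app)

# ...but they are independent: updating one does not touch the other.
with torch.no_grad():
    next(net_cv.parameters()).add_(1.0)
assert not torch.equal(next(net_cv.parameters()), next(net_app.parameters()))
```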
But when I run it, net_app gives a different result with and without the CV process. I don't know how this is happening. I seed everything once per experiment using:

```python
torch.manual_seed(11317)
torch.cuda.manual_seed(11317)
np.random.seed(11317)
torch.backends.cudnn.enabled = False
```
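One thing worth noting: seeding once at the start only fixes the starting point of the random stream, not where in the stream the final training begins. A sketch of how to snapshot and restore the (CPU) RNG state so the final training sees the same random stream whether or not anything ran before it (the `torch.rand` call stands in for whatever the CV pass would consume):

```python
import torch

torch.manual_seed(11317)
rng_state = torch.get_rng_state()   # snapshot the CPU RNG right after seeding

_ = torch.rand(10)                  # stand-in for RNG draws made during the CV pass

torch.set_rng_state(rng_state)      # rewind before the final training
first_draw = torch.rand(3)

torch.manual_seed(11317)            # equivalent to a fresh run with no CV pass
same_draw = torch.rand(3)
print(torch.equal(first_draw, same_draw))  # prints True
```

(For GPU runs, `torch.cuda.get_rng_state` / `torch.cuda.set_rng_state` would be the analogous calls.)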
I hope somebody here can help me find the problem.
Thank you!
UPDATE
I've found that the problem only occurs when the architecture/model uses dropout. What is happening?
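A sketch of what I suspect is the cause: in train mode, nn.Dropout draws its mask from the global RNG, so any random draws made beforehand (e.g. during the CV pass) advance the generator and change the masks the final training sees (the `torch.rand` call below is a stand-in for the CV pass):

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

torch.manual_seed(11317)
state_before = torch.get_rng_state()
a = drop(x)                          # dropout advances the RNG state
assert not torch.equal(state_before, torch.get_rng_state())

torch.manual_seed(11317)
_ = torch.rand(100)                  # stand-in for RNG draws made during CV
b = drop(x)                          # the same call now sees a different mask

print(torch.equal(a, b))             # the two masks (almost surely) differ
```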