As other similar threads describe, when I try to resume training, the model starts from what looks like random predictions. So far I have not found a solution.
Here's what I have tried:
- Using different datasets:
  - MNIST with Zerospadding(114), so each image has size (256, 256). After reloading and running prediction it returns a high accuracy, so I think reloading works fine on the MNIST dataset.
  - torch.ones(input_shape) as the input for training and evaluation. The reloaded model outputs the same result as the trained model does.
  - Only one spectrum from my audio dataset. The model predicts well after reloading in a different session.
- Training for a while and reloading in the same session: this gives a high accuracy, the same as the model right after training. But in a different session it doesn't work anymore.
- As for the method of yielding data: I'm using a Python generator to yield data and then convert it into a Tensor. I also tried torch.utils.data.Dataset and DataLoader. Unfortunately, that does not work either.
- According to step No. 3, maybe it is not a dataset problem? I thought it might be a problem with AvgPool2d, so I replaced it with AdaptiveAvgPool2d, but it still does not work.
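One way to narrow this down is to test the save/reload round trip in isolation, away from the data pipeline. The following is only a minimal sketch under assumptions: `make_model` is a hypothetical tiny stand-in for the real network (it just needs a BatchNorm layer so that running statistics end up in the state dict), the optimizer is plain SGD with momentum, and the checkpoint goes to an in-memory buffer instead of a file. The checkpoint keys mirror the ones used in the script later in this post.

```python
import io

import torch
import torch.nn as nn


def make_model():
    # Hypothetical stand-in model, not the MobileNet from this post.
    return nn.Sequential(
        nn.Conv2d(1, 4, 3, padding=1, bias=False),
        nn.BatchNorm2d(4),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(4, 3),
    )


torch.manual_seed(0)
model = make_model()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# One training step so weights, BatchNorm buffers and momentum all change.
x = torch.randn(8, 1, 16, 16)
y = torch.randint(0, 3, (8,))
model.train()
loss = nn.functional.cross_entropy(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Save to an in-memory buffer; a file path works the same way.
buffer = io.BytesIO()
torch.save({"epoch": 0,
            "state_dict": model.state_dict(),
            "optimizer": optimizer.state_dict()}, buffer)
buffer.seek(0)

# Simulated "new session": fresh objects and a different seed, then restore.
torch.manual_seed(123)
restored = make_model()
restored_opt = torch.optim.SGD(restored.parameters(), lr=0.1, momentum=0.9)
checkpoint = torch.load(buffer, map_location="cpu")
restored.load_state_dict(checkpoint["state_dict"])
restored_opt.load_state_dict(checkpoint["optimizer"])

# Every parameter and buffer (including BatchNorm running stats) should match.
restored_state = restored.state_dict()
same_state = all(torch.equal(v, restored_state[k])
                 for k, v in model.state_dict().items())

# Compare predictions in eval mode so BatchNorm uses its running statistics.
model.eval()
restored.eval()
with torch.no_grad():
    same_output = torch.allclose(model(x), restored(x))

print(same_state, same_output)
```

If both checks pass for the real model and a real checkpoint file, the weights themselves are being restored correctly, and the difference between sessions more likely comes from something outside the checkpoint, such as the input pipeline or the train/eval mode.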
Here's my model script (MobileNetV1):
import torch.nn as nn
import torch.nn.functional as F


class MobileNet1(nn.Module):
    def __init__(self, initial_channel, n_class):
        super(MobileNet1, self).__init__()
        self.class_num = n_class

        def conv_bn(inp, oup, stride):
            return nn.Sequential(
                nn.Conv2d(inp, oup, 3, stride, 1, bias=False),
                nn.BatchNorm2d(oup),
                nn.ReLU(inplace=True)
            )

        def conv_dw(inp, oup, stride):
            return nn.Sequential(
                nn.Conv2d(inp, inp, 3, stride, 1, groups=inp, bias=False),
                nn.BatchNorm2d(inp),
                nn.ReLU(inplace=True),
                nn.Conv2d(inp, oup, 1, 1, 0, bias=False),
                nn.BatchNorm2d(oup),
                nn.ReLU(inplace=True),
            )

        self.model = nn.Sequential(
            conv_bn(initial_channel, 32, 2),
            conv_dw(32, 64, 1),
            conv_dw(64, 128, 2),
            conv_dw(128, 128, 1),
            conv_dw(128, 256, 2),
            conv_dw(256, 256, 1),
            conv_dw(256, 512, 2),
            conv_dw(512, 512, 1),
            conv_dw(512, 512, 1),
            conv_dw(512, 512, 1),
            conv_dw(512, 512, 1),
            conv_dw(512, 512, 1),
            conv_dw(512, 1024, 2),
            conv_dw(1024, 1024, 1),
            nn.AvgPool2d(8),
        )
        self.fc = nn.Linear(1024, self.class_num)

    def forward(self, x):
        x = self.model(x)
        x = x.view(-1, 1024)
        x = self.fc(x)
        return x
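Since AvgPool2d is one of the suspects, it may help to work out the spatial sizes by hand. The sketch below is plain arithmetic, not part of the model: it applies the standard convolution output-size formula to the stride sequence of self.model, assuming square inputs, 3x3 kernels with padding 1, and the names `conv_out`, `STRIDES` and `feature_map_size`, which are made up for this illustration.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    # Standard output-size formula for a convolution without dilation.
    return (size + 2 * padding - kernel) // stride + 1


# Strides of the 3x3 convolutions in self.model: conv_bn, then 13 conv_dw blocks.
# The 1x1 convolutions inside conv_dw use stride 1 and do not change the size.
STRIDES = [2, 1, 2, 1, 2, 1, 2, 1, 1, 1, 1, 1, 2, 1]


def feature_map_size(input_size, pool=8):
    """Return (size before AvgPool2d, size after AvgPool2d) for a square input."""
    size = input_size
    for stride in STRIDES:
        size = conv_out(size, stride=stride)
    # AvgPool2d(pool) uses kernel = stride = pool and no padding.
    pooled = (size - pool) // pool + 1
    return size, pooled


print(feature_map_size(256))  # (8, 1)
print(feature_map_size(512))  # (16, 2)
```

For the (256, 256) inputs used here, the feature map is 8x8 before pooling and 1x1 after, so x.view(-1, 1024) keeps one row per sample. For a 512x512 input the pooled map would be 2x2, and view(-1, 1024) would silently produce four rows per sample instead of raising an error, so the pooling layer only behaves as intended when every input really is 256x256.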
Here's a simplified version of my main code:
import random

import numpy as np
import torch

#-----------------------fix random seed---------------------------------#
args.seed = 5153
print("Random Seed: ", args.seed)
random.seed(args.seed)
np.random.seed(args.seed)
torch.manual_seed(args.seed)
if args.gpus:
    # Sets the seed for generating random numbers on all GPUs.
    torch.cuda.manual_seed(args.seed)
    torch.cuda.manual_seed_all(args.seed)
#---------------------------training-------------------------------------#
model = MobileNet1()
model.load_state_dict(weight['state_dict'])
data = generator(data_path)  # generator is a python generator
for epoch in range(epochs):
    for x, y in data:
        output = model(x)
        loss = criterion(output, y)
        optimizer.zero_grad()
        loss /= accumulate_step
        loss.backward()
        optimizer.step()
    scheduler.step()
    save_checkpoint(filepath=args.save,
                    filename='{}-epoch{}-val_loss{:.4f}.pth'.format(
                        args.model_name, epoch, val_loss),
                    state={'epoch': epoch,
                           'state_dict': model.state_dict(),
                           'best_prec1': best_test,
                           'optimizer': optimizer.state_dict()},
                    )
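One detail of the loop above is worth checking, in case the real script is structured the same way as this simplified version: data = generator(data_path) is created once, before the epoch loop. A plain Python generator can only be consumed once, so every epoch after the first would see no batches. The sketch below shows the effect with a hypothetical stand-in generator that yields dummy (x, y) pairs; it does not use the real data.

```python
def generator(n_batches):
    """Hypothetical stand-in for the data generator: yields (x, y) pairs."""
    for i in range(n_batches):
        yield i, i % 2


# Created once, outside the epoch loop, as in the script above.
data = generator(3)
batches_per_epoch = []
for epoch in range(2):
    batches_per_epoch.append(sum(1 for _ in data))

# Re-created at the start of every epoch.
fresh_batches_per_epoch = []
for epoch in range(2):
    data = generator(3)
    fresh_batches_per_epoch.append(sum(1 for _ in data))

print(batches_per_epoch)        # [3, 0]
print(fresh_batches_per_epoch)  # [3, 3]
```

A DataLoader does not have this limitation, because iterating over it starts a new pass each time. Note also that the script restores only weight['state_dict'], while the checkpoint additionally stores 'optimizer' and 'epoch'; whether those are restored elsewhere is not shown here.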