I’m trying to train a classifier on 15k images across five categories using the GoogLeNet architecture.
I followed the fine-tuning tutorial (but with pretrained=False, to train from scratch).
However, training only works if I set aux_logits to False:
model.aux_logits = False
Can someone explain why I have to do this for training?
Do you get an error if you leave aux_logits=True?
As far as I remember they were in fact only used during training in the original paper.
Hi, thanks. I’m following the pytorch/examples/imagenet script for training. I also saw your similar comments after posting this topic.
When I set aux_logits=True and changed the loss to
loss = criterion(output[0], target)
the script gives a dimension error wherever output is used.
If I don’t use aux_logits=False, will that affect the val_acc?
Could you post the dimension error?
Since the Inception model is quite deep, the auxiliary loss was used to stabilize the training.
If you are training from scratch, using the aux_loss might help.
Thanks. I used this approach from the PyTorch fine-tuning tutorial for Inception and modified examples/imagenet/train.py:
outputs, aux_outputs = model(inputs)
loss1 = criterion(outputs, target)
loss2 = criterion(aux_outputs, target)
loss = loss1 + 0.4*loss2
Now I get this error:
Traceback (most recent call last):
  File "imagenet.py", line 406, in <module>
    main()
  File "imagenet.py", line 113, in main
    main_worker(args.gpu, ngpus_per_node, args)
  File "imagenet.py", line 239, in main_worker
    train(train_loader, model, criterion, optimizer, epoch, args)
  File "imagenet.py", line 279, in train
    output, aux_outputs = model(input)
ValueError: too many values to unpack (expected 2)
Is your model in train() mode, and did you leave aux_logits=True?
Hi. Yes, it is in train() mode and I left aux_logits=True.
The imagenet-based script iterates over the batches for one epoch, and then, before the second epoch, it gave me that error. It switches to eval() after the first epoch to report the acc, right?
Below is the snippet of the main_worker function.
def main_worker(gpu, ngpus_per_node, args):
    .....
    .....
    .....
    for epoch in range(args.start_epoch, args.epochs):
        if args.distributed:
            train_sampler.set_epoch(epoch)
        adjust_learning_rate(optimizer, epoch, args)
        # train for one epoch
        train(train_loader, model, criterion, optimizer, epoch, args)
        # evaluate on validation set
        acc1 = validate(val_loader, model, criterion, args)
        # remember best acc@1 and save checkpoint
        is_best = acc1 > best_acc1
        best_acc1 = max(acc1, best_acc1)
        if not args.multiprocessing_distributed or (args.multiprocessing_distributed
                and args.rank % ngpus_per_node == 0):
            save_checkpoint({
                'epoch': epoch + 1,
                'arch': args.arch,
                'state_dict': model.state_dict(),
                'best_acc1': best_acc1,
                'optimizer' : optimizer.state_dict(),
            }, is_best)
It seems that the model might still be in eval() mode after the first epoch.
The aux_logits will only be returned in train() mode, so make sure to activate it before the next epoch.
No, it didn’t get to eval(). It had just finished train(), and the error was thrown before going to eval().
I only changed def train() and left def validate() as it is. Is that incorrect?
imagenet script organization:
def main()
def main_worker()
def train()
    outputs, aux_outputs = model(inputs)
    loss1 = criterion(outputs, target)
    loss2 = criterion(aux_outputs, target)
    loss = loss1 + 0.4*loss2
def validate()
    outputs = model(inputs)
    loss = criterion(outputs, target)
def adjust_learning_rate()
def accuracy()
Try to set the desired mode explicitly in both functions:
def train():
    model.train()
    ...
def validate():
    model.eval()
    ...
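To illustrate why the mode matters here, below is a minimal sketch in plain Python. ToyNet is a hypothetical stand-in, not torchvision code; it only mimics the `self.training and self.aux_logits` guard, so that the number of returned values depends on the mode.

```python
class ToyNet:
    """Hypothetical stand-in for a model with one auxiliary head (not torchvision code)."""

    def __init__(self, aux_logits=True):
        self.aux_logits = aux_logits
        self.training = True  # assumption: starts in training mode, like nn.Module

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def __call__(self, x):
        main = f"main({x})"
        if self.training and self.aux_logits:
            # Auxiliary output is only returned in training mode.
            return main, f"aux({x})"
        return main


model = ToyNet()

model.train()
outputs, aux_outputs = model("batch")  # two values in train mode

model.eval()
outputs_eval = model("batch")          # a single value in eval mode
```

So train() can unpack the auxiliary output, while validate() should keep the single-output form.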
Yes, I set the desired mode in both functions and ran the command below:
python imagenet.py -a googlenet --epochs 30 --batch-size 96 --gpu 1 data/
And you still get the ValueError?
Could you create a gist so that I could have a look at the complete code?
Yes, I still get the error. ;/ Why is this error usually raised?
ValueError: too many values to unpack (expected 2)
Does it mean that in outputs, aux_outputs = model(inputs), the call model(inputs) returns more values than the 2 expected?
gist: https://gist.github.com/rajasekharponakala/80514484ea7a38dc444cdc244dc9f950
(the same script from pytorch/examples/imagenet/main.py, with only the loss modified for GoogLeNet)
There seems to be a mismatch between the number of values on the two sides of this line, but I’m not sure. ;/
outputs, aux_outputs = model(inputs)
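For reference, this ValueError is plain Python tuple unpacking: the right-hand side yields more values than there are names on the left. A minimal sketch, where fake_model is a hypothetical stand-in that returns three values:

```python
def fake_model(inputs):
    """Hypothetical stand-in that returns three values, like a net with two aux heads."""
    return "aux1", "aux2", "main"


message = None
try:
    outputs, aux_outputs = fake_model("batch")  # 3 values, only 2 names
except ValueError as err:
    message = str(err)
    print(message)

# Inspecting the result before unpacking shows how many names are needed.
result = fake_model("batch")
print(type(result).__name__, len(result))
```

Printing the type and length of the model output once, before unpacking it, is a quick way to see how many values the model really returns in the current mode.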
For GoogLeNet, there are 2 aux branches, so we have to do it this way:
aux1, aux2, outputs = model(inputs)
loss1 = criterion(outputs, target)
loss2 = criterion(aux1, target)
loss3 = criterion(aux2, target)
loss = loss1 + 0.3*(loss2+loss3)
Inception v3 has only one aux branch:
outputs, aux_outputs = model(inputs)
loss1 = criterion(outputs, target)
loss2 = criterion(aux_outputs, target)
loss = loss1 + 0.4*loss2
Now it’s working! Thanks for the follow-up.
Thanks for the information and sorry for missing that you are using GoogleNet and not Inception_v3.
I’m glad you figured it out!
Yeah, I hadn’t noticed either; someone on the same topic in the pytorch GitHub issues figured it out.
@ptrblck Hi!
This time, there is a little confusion about the fc layer. I followed the fine-tuning tutorial (I just want to run with aux_logits=True). For Inception, as there is only one aux_logit, the snippet below works fine:
elif model_name == "inception":
    """ Inception v3
    Be careful, expects (299,299) sized images and has auxiliary output
    """
    model_ft = models.inception_v3(pretrained=use_pretrained)
    set_parameter_requires_grad(model_ft, feature_extract)
    # Handle the auxiliary net
    num_ftrs = model_ft.AuxLogits.fc.in_features
    model_ft.AuxLogits.fc = nn.Linear(num_ftrs, num_classes)
    # Handle the primary net
    num_ftrs = model_ft.fc.in_features
    model_ft.fc = nn.Linear(num_ftrs,num_classes)
    input_size = 299
The corresponding inception_v3 net file snippet:
if self.training and self.aux_logits:
    aux = self.AuxLogits(x)
and the fc snippet:
self.fc = nn.Linear(768, num_classes)
GoogLeNet, on the other hand, has two auxiliary outputs. The net file snippet has:
if self.training and self.aux_logits:
    aux1 = self.aux1(x)
.....
if self.training and self.aux_logits:
    aux2 = self.aux2(x)
and the fc snippets:
self.fc1 = nn.Linear(2048, 1024)
self.fc2 = nn.Linear(1024, num_classes)
Now, my confusion is about how to handle these fc layers in the fine-tuning script:
num_ftrs = model_ft.(aux1/aux2).(fc1/fc2).in_features
model_ft.(aux1/aux2).(fc1/fc2) = nn.Linear(num_ftrs, num_classes)
Any thoughts?
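One possible reading of the snippets above, offered as a sketch rather than a confirmed answer: fc1 is a hidden layer (2048 -> 1024) whose size does not depend on the number of classes, so only fc2 in each aux branch, plus the main classifier, would need replacing. The code below runs without PyTorch: Linear, toy_googlenet and replace_heads are hypothetical stand-ins, and the main classifier being exposed as fc (as in the Inception snippet) with 1024 input features is an assumption.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class Linear:
    """Stand-in for torch.nn.Linear so the sketch runs without PyTorch."""
    in_features: int
    out_features: int


def toy_googlenet(num_classes=1000):
    """Hypothetical model object with the attribute names quoted in the post."""
    def aux():
        return SimpleNamespace(fc1=Linear(2048, 1024),
                               fc2=Linear(1024, num_classes))
    # Assumption: the main classifier is called fc and takes 1024 features.
    return SimpleNamespace(aux1=aux(), aux2=aux(), fc=Linear(1024, num_classes))


def replace_heads(model, num_classes, linear_cls=Linear):
    """Swap the class-dependent layers; fc1 in the aux branches is left untouched."""
    for aux in (model.aux1, model.aux2):
        aux.fc2 = linear_cls(aux.fc2.in_features, num_classes)
    model.fc = linear_cls(model.fc.in_features, num_classes)
    return model


model = replace_heads(toy_googlenet(), num_classes=5)
print(model.aux1.fc2, model.aux2.fc2, model.fc)
```

With a real model, passing linear_cls=nn.Linear should give the equivalent of the Inception snippet, applied to aux1.fc2, aux2.fc2 and fc.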