PyTorch tutorial for neural transfer of artistic style


If anyone is interested, I’ve written this PyTorch tutorial implementing the neural transfer of artistic style developed by Leon Gatys et al.:

Any feedback is welcome!


That looks great! But does the code work? You have that # -- WRONG CODE -- comment, and it indeed looks incorrect to me. You can’t construct an optimizer with a single Variable input. It needs a list of Variables, so optim.Adam([input], lr = 0.01) should work.
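For illustration, here is a minimal sketch of optimizing a single tensor directly by wrapping it in a list. It assumes a recent PyTorch release, where a tensor created with requires_grad=True replaces the old Variable wrapper; the name input_img and the toy MSE loss are placeholders, not taken from the tutorial.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-ins: a small random "image" to optimize and a fixed target.
input_img = torch.rand(1, 3, 8, 8, requires_grad=True)
target = torch.zeros(1, 3, 8, 8)

# The optimizer expects an iterable of tensors, so the single tensor goes in a list.
optimizer = torch.optim.Adam([input_img], lr=0.01)

initial_loss = F.mse_loss(input_img, target).item()
for _ in range(50):
    optimizer.zero_grad()
    loss = F.mse_loss(input_img, target)  # toy loss, stands in for style/content losses
    loss.backward()
    optimizer.step()                      # updates input_img in place
final_loss = F.mse_loss(input_img, target).item()
```

With this pattern there is no need to build a module just to hold the image as a parameter.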

Yes, the code works, and the whole script is in the “.py” file. I show the wrong code to explain the general idea, then give the correct version just below (maybe not good pedagogy…). But I didn’t know that a plain list of Variables works; this is much simpler than what I did (I constructed a module with the variable as a parameter). Thanks!

When training a pretrained ‘resnet18’ model with torch.optim.Adam, I get this error:
Traceback (most recent call last):
File "", line 1, in
File "/root/anaconda2/lib/python2.7/site-packages/torch/optim/", line 54, in step
beta1, beta2 = group['betas']
TypeError: 'float' object is not iterable

The test code was written as follows:
momentum = 1e-4
weight_decay= 0.1
model = models.resnet18(pretrained=True)
optimizer = torch.optim.Adam(model.parameters(),lr,momentum,weight_decay)
output =model(torch.autograd.Variable(torch.ones(1,3,224,224)))
yt = torch.autograd.Variable((torch.ones(1)).long())
criterion = torch.nn.CrossEntropyLoss ()
loss = criterion(output, yt)

With torch.optim.SGD instead of Adam, the code works. However, I guessed Adam would perform better in some cases, so I tried to use it.

I have read the original source code and searched some related webpages. Unfortunately, I am still confused about how to correctly pass the parameter group inputs to the Adam optimizer in this situation.

Would you mind giving an example of using the Adam optimizer? Thank you O(∩_∩)O

@phenixcx you can look at

The third argument to the optim.Adam constructor is a tuple called betas, so the momentum value passed in that position is read as betas and fails to unpack. Optimizers have different constructors; you can find them in the docs.
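As a hedged example, passing the hyperparameters by keyword avoids the positional mix-up. The sketch below uses a small nn.Linear as a hypothetical stand-in for resnet18 so that it runs quickly, and the hyperparameter values are placeholders. Adam has no momentum argument; the first entry of betas plays that role.

```python
import torch

# Small stand-in model (assumption: any nn.Module works the same way as resnet18 here).
model = torch.nn.Linear(4, 2)

# Keyword arguments make it explicit which value goes where.
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=1e-3,
    betas=(0.9, 0.999),   # (beta1, beta2): decay rates of the running averages
    weight_decay=1e-4,
)

x = torch.ones(1, 4)
target = torch.ones(1).long()          # class index 1
criterion = torch.nn.CrossEntropyLoss()

optimizer.zero_grad()
loss = criterion(model(x), target)
loss.backward()
optimizer.step()

group = optimizer.param_groups[0]      # the single parameter group created above
```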

Got it, thanks a lot for the pointers, @apaszke and @smth!
I had missed the correct constructor signature, even though I read through the available code and docs :cold_sweat:.

@apaszke, I was playing with this implementation, but it doesn’t seem to work at all with Python 3.6 + PyTorch 0.1. It fails with the following error:

 Traceback (most recent call last):
  File "", line 205, in <module>
    style_score += sl.backward()
  File "", line 105, in backward
  File "/usr/local/lib/python3.6/site-packages/torch/autograd/", line 146, in backward
    self._execution_engine.run_backward((self,), (gradient,), retain_variables)
  File "/usr/local/lib/python3.6/site-packages/torch/nn/_functions/", line 48, in backward
    if self.needs_input_grad[0] else None)
  File "/usr/local/lib/python3.6/site-packages/torch/nn/_functions/", line 119, in _grad_input
    return self._thnn('grad_input', input, weight, grad_output)
  File "/usr/local/lib/python3.6/site-packages/torch/nn/_functions/", line 161, in _thnn
    return impl[fn_name](self, self._bufs[0], input, weight, *args)
  File "/usr/local/lib/python3.6/site-packages/torch/nn/_functions/", line 251, in call_grad_input
    grad_input, weight, *args)
RuntimeError: Need gradOutput of dimension 4 and gradOutput.size[1] == 64 but got gradOutput to be of shape: [64 x 2401] at /Users/soumith/code/pytorch-builder/wheel/pytorch-src/torch/lib/THNN/generic/SpatialConvolutionMM.c:50

Did you update to 0.1.10?

@ecolss, also, I wrote the code using Python 2; I don’t know whether it works with Python 3.6.

Yes, I did.

As @alexis-jacq mentioned, it was a Python 2 implementation, but I don’t think that is the problem here.

My guess is this: the input is cloned and resized in the GramMatrix module, and the style loss is then computed on it. The error occurs during the style loss backward(). So could it be that the gradient of the style loss with respect to the GramMatrix output is a 2-dimensional tensor, and the gradient with respect to the cloned and resized input is then not computed properly?

@apaszke any suggestions to debug this?

@apaszke @alexis-jacq

After debugging for a while, I found the root cause of the error: Variable.resize().

replaced this line

Thanks for reporting this issue.

The code works on my computer, but I wrote it quickly when I was first discovering PyTorch, so I am not surprised if it causes bugs on another system. It is full of hacks and the implementation is not clean (as you can see here: How to extract features of an image from a trained model). I need to rewrite it, and I will do so as soon as I have some time.

@ecolss Wouldn’t data.view be more appropriate than data.resize in this case? The output tensor has the same size, just a different shape. I think PyTorch’s view is very similar to numpy’s reshape method.
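To make the suggestion concrete, here is a minimal sketch of a Gram matrix computed with .view. The function name gram_matrix and the normalization by the number of elements are assumptions for illustration, not the tutorial's exact code, and the sketch assumes a recent PyTorch release.

```python
import torch

def gram_matrix(features):
    """Gram matrix of a (batch, channels, height, width) feature map."""
    b, c, h, w = features.size()
    # Same number of elements, different shape: one row per feature map.
    flat = features.view(b * c, h * w)
    gram = torch.mm(flat, flat.t())
    # Normalization choice is an assumption; other scalings are possible.
    return gram / (b * c * h * w)

# Hypothetical feature map with 64 channels.
features = torch.rand(1, 64, 7, 7, requires_grad=True)
gram = gram_matrix(features)

# The gradient flows back through .view with the original 4-D shape.
gram.sum().backward()
```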

Yes, it’s better to use .view

.view is cool, I just wasn’t aware of it before.
However, .resize also returns a view, doesn’t it? Is there any particular difference between the two?

.view is much safer than .resize, and there are hardly any cases where .resize should be used in user scripts. .view will raise an error if you ask for a tensor with a different number of elements, or if the tensor is not contiguous (.resize can give you a tensor that views memory that wasn’t used before).
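The two safety checks can be seen in a short sketch, assuming a recent PyTorch release, where both failures surface as a RuntimeError. The variable names are only for illustration.

```python
import torch

x = torch.arange(12).view(3, 4)

# A valid view: same 12 elements, same underlying memory.
same_data = x.view(2, 6)
shares_storage = same_data.data_ptr() == x.data_ptr()

# Asking for a different number of elements is rejected.
try:
    x.view(5, 5)
    size_error = False
except RuntimeError:
    size_error = True

# A transposed tensor is not contiguous, so flattening it with .view fails.
transposed = x.t()
try:
    transposed.view(12)
    contiguity_error = False
except RuntimeError:
    contiguity_error = True

# Making a contiguous copy first is the usual way around this.
flat = transposed.contiguous().view(12)
```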

@apaszke Noted, thanks

Hi Alexis, cool work!
I think you can get better results by using LBFGS though. You can check here for an implementation:
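For readers who have not used it, torch.optim.LBFGS differs from Adam in that step() needs a closure that re-evaluates the loss. Below is a minimal sketch, assuming a recent PyTorch release; the toy MSE loss and the names input_img and target stand in for the real style and content losses and are not from the linked implementation.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-ins for the optimized image and its target.
input_img = torch.rand(1, 3, 8, 8, requires_grad=True)
target = torch.full((1, 3, 8, 8), 0.5)

optimizer = torch.optim.LBFGS([input_img])

def closure():
    # LBFGS may call this several times per step to re-evaluate the loss.
    optimizer.zero_grad()
    loss = F.mse_loss(input_img, target)
    loss.backward()
    return loss

initial_loss = F.mse_loss(input_img, target).item()
for _ in range(5):
    optimizer.step(closure)
final_loss = F.mse_loss(input_img, target).item()
```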