Sketch-RNN with PyTorch

I am implementing the sequence-to-sequence model introduced in this paper from Google Magenta:

My code is here:

As I don’t have Google’s computational power, I slightly changed some hyperparameters from the paper (the number of nodes in the decoder LSTM and the learning rate), and it takes me half a day to run 10,000 epochs. However, according to the default hyperparameters suggested by the author’s TF implementation, 512 neurons in the decoder LSTM and lr = 0.0001 should be enough to get good results.
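
Concretely, the hyperparameters in question look roughly like this (a sketch with illustrative names, following the paper’s defaults; not necessarily the exact names in the repo):

import torch

# Hypothetical hyperparameter container; values follow the sketch-rnn paper's
# defaults and the learning rate discussed above.
class HParams:
    enc_hidden_size = 256   # encoder LSTM size (paper default)
    dec_hidden_size = 512   # decoder LSTM size suggested by the TF implementation
    Nz = 128                # latent vector dimension (paper default)
    M = 20                  # mixture components in the output GMM (paper default)
    lr = 0.0001             # learning rate discussed above
    grad_clip = 1.0         # gradient clipping threshold

hp = HParams()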

But even after 40,000 epochs, it still generates random straight strokes, and even the reconstruction (feeding true inputs from the training set to the decoder) is similar to the drawing but not satisfying. Something must be wrong somewhere… If any of you is interested in state-of-the-art seq2seq drawing generation, I would be glad to receive some help.

I really need to make it work; it’s for a robotics project (collaborative child-robot hand-drawing).

Good news!

My implementation was right from the beginning, but I had a terrible, nearly invisible typo in the trajectory generation:

-        next_state[q_idx+1] = 1
+        next_state[q_idx+2] = 1
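
For context: each point in the stroke-5 format is a vector (dx, dy, p1, p2, p3), so slots 0 and 1 hold the pen offsets and the one-hot pen state q_idx ∈ {0, 1, 2} must start at index 2, hence the +2. A rough sketch of the surrounding sampling logic (variable names assumed, with example values standing in for the sampled quantities):

import torch

# Sketch of the end of sample_next_state(); example values stand in for the
# sampled offsets and pen state.
x, y, q_idx = 0.1, -0.2, 0   # q_idx in {0, 1, 2}: pen down, pen up, end of sketch
next_state = torch.zeros(5)  # stroke-5 format: (dx, dy, p1, p2, p3)
next_state[0] = x            # sampled x offset
next_state[1] = y            # sampled y offset
next_state[q_idx + 2] = 1    # pen-state one-hot starts at index 2, hence the +2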

Now it works properly, and I obtain nice samples of generated cats after a short training run (this one after 1900 epochs, at ~3 epochs/s):

man, one tiny typo can change the performance so much :)

This is great, I’m happy you were able to port sketch-rnn over to PyTorch.

I’d also recommend using an LSTM with Recurrent Dropout without Memory Loss, which involves a one-line change in the LSTM code, and also Layer Normalization. I found these two tricks helped a lot for LSTMs.
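
For the curious, the “one-line change” is that dropout is applied only to the candidate cell update, so the memory cell itself is never zeroed out. A minimal hand-rolled cell-step sketch (not PyTorch’s built-in LSTM; the stacked-weight layout here is an assumption):

import torch
import torch.nn.functional as F

def lstm_step(x, h, c, W, U, b, p_drop=0.1, training=True):
    """One LSTM step with recurrent dropout without memory loss.

    W: (input_size, 4*hidden), U: (hidden, 4*hidden), b: (4*hidden,)
    hold the stacked gate weights (this layout is an assumption).
    """
    gates = x @ W + h @ U + b
    i, f, g, o = gates.chunk(4, dim=-1)
    i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
    g = torch.tanh(g)
    # The one-line change: drop only the candidate update, never the cell state.
    g = F.dropout(g, p=p_drop, training=training)
    c_new = f * c + i * g
    h_new = o * torch.tanh(c_new)
    return h_new, c_new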

Thanks a lot!

Yes, using recurrent dropout and layer normalization is on my TODO list. They don’t seem to be implemented yet in PyTorch’s RNN modules, so I will probably have to write my own.

Today I was busy finding a way to use the network to find and draw cats in clouds (with Canny edge detection):
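
Roughly, the idea is to extract cloud contours with OpenCV’s Canny detector and use the edges to seed the drawing; a minimal sketch of that step (the file names and thresholds here are arbitrary and would need tuning per photo):

import cv2

# Hypothetical file name; the two thresholds are arbitrary placeholders.
img = cv2.imread("clouds.jpg", cv2.IMREAD_GRAYSCALE)
blurred = cv2.GaussianBlur(img, (5, 5), 0)  # smooth to suppress spurious edges
edges = cv2.Canny(blurred, 50, 150)         # edge map of the cloud photo
cv2.imwrite("cloud_edges.png", edges)       # contours used to seed the cat sketch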

Finally, we used a Baxter robot to draw a cat inferred from the head of a Nao robot, which led us to make this creepy video:

These cloudy cats are great! It’s such a creative use of edge detection. I like the clouds more than Nao’s head, tbh.

Looking forward to seeing any gallery or work you end up producing!

Hahahaha yes, the result with Nao’s head is really ugly, which explains the style of the video!

But the project will become more serious, involving children with handwriting difficulties in creative activities with robots. We will probably obtain beautiful pieces of child-robot collaborative art!

Hi, I was looking at your code and tried to run it in a Jupyter notebook.

Still, it doesn’t seem to work for me, because at this part

if __name__ == "__main__":
    model = Model()
    for epoch in range(50001):
        model.train(epoch)

I get an error saying

epoch 0 loss 2.6120107173919678 LR 2.6110057830810547 LKL 0.0010049500269815326
<ipython-input-17-a948bd699ee2>:36: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  pi = F.softmax(pi.transpose(0,1).squeeze()).view(len_out,-1,hp.M)
<ipython-input-17-a948bd699ee2>:42: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  q = F.softmax(params_pen).view(len_out,-1,3)
<ipython-input-31-a793444e3f37>:66: UserWarning: torch.nn.utils.clip_grad_norm is now deprecated in favor of torch.nn.utils.clip_grad_norm_.
  nn.utils.clip_grad_norm(self.encoder.parameters(), hp.grad_clip)
<ipython-input-31-a793444e3f37>:67: UserWarning: torch.nn.utils.clip_grad_norm is now deprecated in favor of torch.nn.utils.clip_grad_norm_.
  nn.utils.clip_grad_norm(self.decoder.parameters(), hp.grad_clip)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-33-1aab3c2e5ccc> in <module>
      2     model = Model()
      3     for epoch in range(50001):
----> 4         model.train(epoch)

<ipython-input-31-a793444e3f37> in train(self, epoch)
     78         if epoch%100==0:
     79             #self.save(epoch)
---> 80             self.conditional_generation(epoch)
     81 
     82     def bivariate_normal_pdf(self, dx, dy):

<ipython-input-31-a793444e3f37> in conditional_generation(self, epoch)
    142             hidden_cell = (hidden, cell)
    143             # sample from parameters:
--> 144             s, dx, dy, pen_down, eos = self.sample_next_state()
    145             #------
    146             seq_x.append(dx)

<ipython-input-31-a793444e3f37> in sample_next_state(self)
    180         sigma_y = self.sigma_y.data[0,0,pi_idx]
    181         rho_xy = self.rho_xy.data[0,0,pi_idx]
--> 182         x,y = sample_bivariate_normal(mu_x,mu_y,sigma_x,sigma_y,rho_xy,greedy=False)
    183         next_state = torch.zeros(5)
    184         next_state[0] = x

<ipython-input-32-7b287d68c95c> in sample_bivariate_normal(mu_x, mu_y, sigma_x, sigma_y, rho_xy, greedy)
      8     cov = [[sigma_x * sigma_x, rho_xy * sigma_x * sigma_y],\
      9         [rho_xy * sigma_x * sigma_y, sigma_y * sigma_y]]
---> 10     x = np.random.multivariate_normal(mean, cov, 1)
     11     return x[0][0], x[0][1]
     12 

mtrand.pyx in numpy.random.mtrand.RandomState.multivariate_normal()

TypeError: ufunc 'add' output (typecode 'O') could not be coerced to provided output parameter (typecode 'd') according to the casting rule ''same_kind''

Would you mind helping me out with this?
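
In case it helps: the typecode ‘O’ error typically means torch tensors reached np.random.multivariate_normal (in recent PyTorch versions, indexing .data returns 0-dim tensors rather than Python floats, which NumPy stores as objects). A hedged sketch of a possible fix, converting the parameters to floats before building the mean and covariance:

import numpy as np

def sample_bivariate_normal(mu_x, mu_y, sigma_x, sigma_y, rho_xy, greedy=False):
    # Convert possible 0-dim torch tensors to Python floats so NumPy builds
    # a plain float array (avoids the "typecode 'O'" coercion error).
    mu_x, mu_y = float(mu_x), float(mu_y)
    sigma_x, sigma_y, rho_xy = float(sigma_x), float(sigma_y), float(rho_xy)
    if greedy:
        return mu_x, mu_y
    mean = [mu_x, mu_y]
    cov = [[sigma_x * sigma_x, rho_xy * sigma_x * sigma_y],
           [rho_xy * sigma_x * sigma_y, sigma_y * sigma_y]]
    x = np.random.multivariate_normal(mean, cov, 1)
    return x[0][0], x[0][1]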