How to implement a custom layer that supports second-order derivatives

I implemented a custom CNN layer using im2col and col2im from the third-party library pyinn. Basically, my CNN layers do not share weights.

When implementing the gradient penalty from WGAN-GP, which requires second-order gradients, my program throws a RuntimeError at gp.backward():

Im2ColLegacyBackward is not differentiable twice

This is the gradient penalty:

gradients = autograd.grad(outputs=disc_interpolates, inputs=interpolates,
                          grad_outputs=torch.ones(disc_interpolates.size()).cuda() if use_cuda else torch.ones(disc_interpolates.size()),
                          create_graph=True, retain_graph=None, only_inputs=True)[0]
gradient_penalty = ((gradients.norm(2, dim=1) - 1) ** 2).mean() * LAMBDA
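
For reference, the penalty computed above can be written out to show where the second-order term comes from. The notation is my own shorthand and not taken from the code: D_\theta is the discriminator with parameters \theta, \hat{x} stands for interpolates, g for gradients, and \lambda for LAMBDA.

```latex
\mathrm{GP}(\theta) = \lambda \, \mathbb{E}_{\hat{x}}\!\left[ \left( \lVert g \rVert_2 - 1 \right)^2 \right],
\qquad g = \nabla_{\hat{x}} D_\theta(\hat{x})

\nabla_\theta \mathrm{GP}(\theta)
  = 2 \lambda \, \mathbb{E}_{\hat{x}}\!\left[
      \left( \lVert g \rVert_2 - 1 \right)
      \left( \frac{\partial g}{\partial \theta} \right)^{\!\top}
      \frac{g}{\lVert g \rVert_2}
    \right]
```

The factor \partial g / \partial \theta is a derivative of a gradient, so gp.backward() has to differentiate through the backward pass of every op in the discriminator, including im2col.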

As far as I have learned from other topics in the forum, PyTorch has supported higher-order derivatives since 0.2, and my version is 0.3.1. So I think the problem comes from the gradient computation in the third-party pyinn. Can anyone tell me how to solve this, or suggest a workaround?
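
One possible workaround, sketched under assumptions: make the backward of im2col itself a differentiable op. The backward of im2col is col2im and vice versa, so two autograd Functions can call each other in their backward methods. In the sketch below, F.unfold and F.fold stand in for pyinn's kernels (they need a PyTorch release newer than 0.3.1), and the names Im2Col, Col2Im, LocallyConnected2d and gradient_penalty are hypothetical, not pyinn or PyTorch API.

```python
import torch
import torch.nn.functional as F
from torch import autograd


class Im2Col(autograd.Function):
    """im2col whose backward is the differentiable Col2Im (assumed stand-in)."""

    @staticmethod
    def forward(ctx, x, k):
        ctx.k = k
        ctx.hw = tuple(x.shape[-2:])      # remember the input spatial size
        return F.unfold(x, k)             # (B, C*k*k, L)

    @staticmethod
    def backward(ctx, grad_cols):
        # Calling another Function here keeps the backward pass in the graph.
        return Col2Im.apply(grad_cols, ctx.hw, ctx.k), None


class Col2Im(autograd.Function):
    """col2im whose backward is Im2Col, closing the loop."""

    @staticmethod
    def forward(ctx, cols, hw, k):
        ctx.k = k
        return F.fold(cols, output_size=hw, kernel_size=k)  # (B, C, H, W)

    @staticmethod
    def backward(ctx, grad_x):
        return Im2Col.apply(grad_x, ctx.k), None, None


class LocallyConnected2d(torch.nn.Module):
    """Hypothetical non-shared-weight layer: one filter per output location."""

    def __init__(self, in_ch, out_ch, in_size, kernel_size):
        super().__init__()
        self.k = kernel_size
        self.out_size = in_size - kernel_size + 1   # stride 1, no padding
        self.weight = torch.nn.Parameter(
            0.1 * torch.randn(out_ch, in_ch * kernel_size ** 2, self.out_size ** 2))

    def forward(self, x):
        cols = Im2Col.apply(x, self.k)                          # (B, K, L)
        out = torch.einsum("bkl,okl->bol", cols, self.weight)   # (B, O, L)
        return out.reshape(x.size(0), -1, self.out_size, self.out_size)


def gradient_penalty(disc, interpolates, lam=10.0):
    interpolates = interpolates.detach().requires_grad_(True)
    disc_out = disc(interpolates)
    gradients = autograd.grad(outputs=disc_out, inputs=interpolates,
                              grad_outputs=torch.ones_like(disc_out),
                              create_graph=True)[0]
    # Flatten so the norm is taken over all input dimensions per sample.
    gradients = gradients.reshape(gradients.size(0), -1)
    return ((gradients.norm(2, dim=1) - 1) ** 2).mean() * lam


torch.manual_seed(0)
layer = LocallyConnected2d(in_ch=1, out_ch=2, in_size=6, kernel_size=3)
gp = gradient_penalty(layer, torch.randn(4, 1, 6, 6))
gp.backward()   # second-order pass: goes through Col2Im.backward -> Im2Col
```

With a newer PyTorch, calling F.unfold directly in the layer should also work without the custom Functions, since the built-in op is differentiable twice. The Function pair is only relevant when the forward kernels have to stay custom, as with pyinn.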

Best Regards,