Unable to backpropagate the network with custom layer

My network

class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()  # must run before submodules are assigned
        self.mel = Spectrogram.MelSpectrogram(n_fft=512, n_mels=hp.data.nmels, sr=16000, hop_length=160, trainable_mel=True, trainable_STFT=True, device=device) # Using device='cuda:0'
#         torch.nn.init.xavier_uniform_(self.mel.mel_basis)
        self.LSTM_stack = nn.LSTM( hp.data.nmels ,hp.model.hidden , num_layers=hp.model.num_layer, batch_first=True).cuda(device)
        for name, param in self.LSTM_stack.named_parameters():
          if 'bias' in name:
             nn.init.constant_(param, 0.0)
          elif 'weight' in name:
             pass  # weight initialization is cut off in the post
        self.CNN_layer1 = torch.nn.Conv2d(161,1, kernel_size=(1,1)).cuda(device)
        self.projection = nn.Linear(hp.model.hidden, hp.model.proj).cuda(device)

    def forward(self, x,y):
        x = self.mel(x)
        x = x.view(-1,x.data.size()[2],x.data.size()[1])
        x = x.view(-1,x.data.size()[1],x.data.size()[2],1)
        x = x / torch.norm(x, dim=1).unsqueeze(1)
        # x=x.squeeze()
        return x

spectral_clusterer.CustomClusteringLayer() has no parameters to learn; it only performs a fixed computation. The forward pass works, but I usually get an error on loss.backward().

This custom layer is defined as:

class SpectralClusterer(torch.autograd.Function):
    @staticmethod
    def forward(ctx, X, clusters):
        # Map a refinement name to its operator.
        def _get_refinement_operator(name):
            if name == "CropDiagonal":
                return refinement.CropDiagonal_torch()
            elif name == "GaussianBlur":
                return refinement.GaussianBlur_torch(gaussian_blur_sigma)
            elif name == "RowWiseThreshold":
                return refinement.RowWiseThreshold_torch(
            elif name == "Symmetrize":
                return refinement.Symmetrize_torch()
            elif name == "Diffuse":
                return refinement.Diffuse_torch()
            elif name == "RowWiseNormalize":
                return refinement.RowWiseNormalize_torch()
            else:
                raise ValueError("Unknown refinement operation: {}".format(name))
        if not torch.is_tensor(X):
            raise TypeError("X must be a torch tensor")
        if len(X.shape) != 2:
            raise ValueError("X must be 2-dimensional")
        affinity = utils.compute_affinity_matrix_torch(X)
        for refinement_name in refinement_sequence:
            op = _get_refinement_operator(refinement_name)
            affinity = op.refine(affinity)
        cluster_ids_x, cluster_centers = KMeans(X=affinity, num_clusters=clusters, distance='euclidean', device=torch.device('cuda:0'))
        return cluster_ids_x
class CustomClusteringLayer(nn.Module):
    def __init__(self,name=""):
        super(CustomClusteringLayer, self).__init__()
    def forward(self,x,n):
         return SpectralClusterer.apply(x,n)

Is there any way to bypass backprop in this layer?

You would need to implement the backward method manually for this custom autograd.Function, as Autograd doesn’t know how the gradients should be passed through this layer.
This can be avoided if you could use pure PyTorch methods in the forward pass, but this doesn’t seem to be the case, as you are apparently using methods from other libraries.

This tutorial gives you an example of a custom function.
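
As a rough illustration of what a manual backward looks like, here is a minimal sketch. `Cube` is a hypothetical toy op, not the clustering layer from the question; it only shows the shape of the API. Note that backward has to return one value per input of forward, so a forward taking `(X, clusters)` would return a gradient for `X` and `None` for the non-tensor `clusters` argument.

```python
import torch

class Cube(torch.autograd.Function):
    """Toy custom op (hypothetical): y = x ** 3 with a hand-written gradient."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)  # keep the input for the backward pass
        return x ** 3

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # d(x^3)/dx = 3 * x^2, chained with the incoming gradient
        return grad_output * 3 * x ** 2

x = torch.tensor([1.0, 2.0], requires_grad=True)
Cube.apply(x).sum().backward()
print(x.grad)  # gradient of sum(x^3) with respect to x
```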

I followed the tutorial you mentioned. Moreover, all the libraries I am using are PyTorch-based; nothing is implemented outside of torch. This layer does not have any parameters to learn, so what should I implement in the backward function? I tried returning None in backward, but it didn't solve the error.

If all operations in the forward method are differentiable, you could create a custom nn.Module and let Autograd create the backward pass.
autograd.Functions are used for custom forward and backward implementations.
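
A minimal sketch of that approach, assuming the computation can be written with differentiable torch ops only. `CosineAffinity` is a hypothetical stand-in, not the `compute_affinity_matrix_torch` helper from the question; it has no parameters, yet gradients still flow back to its input.

```python
import torch
import torch.nn as nn

class CosineAffinity(nn.Module):
    """Parameter-free layer (hypothetical): row-normalize, then pairwise dot products."""

    def forward(self, x):
        x = x / torch.norm(x, dim=1, keepdim=True)  # unit-length rows
        return x @ x.t()                            # (N, N) similarity matrix

x = torch.randn(4, 8, requires_grad=True)
affinity = CosineAffinity()(x)
affinity.sum().backward()  # Autograd builds the backward pass on its own
print(affinity.grad_fn is not None, x.grad.shape)
```

No backward method is written here; the module only composes operations that Autograd already knows how to differentiate.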

Where can I find a list of operators that don't break the differentiability of the torch graph?

The majority of PyTorch operations are differentiable. I’m not sure if there is a list of non-differentiable operations, but when in doubt you could check the .grad_fn of the output.
If it points to a valid function, then Autograd will be able to properly backpropagate through it.
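
For example, a quick check along these lines (a sketch with made-up inputs) shows the difference between a differentiable op and one that returns integer indices. Cluster IDs coming out of a k-means step are likely labels of the second kind.

```python
import torch

x = torch.randn(3, 4, requires_grad=True)

soft = torch.softmax(x, dim=1)  # differentiable: carries a backward node
hard = torch.argmax(x, dim=1)   # integer indices: no gradient can flow

print(soft.grad_fn)  # a backward node object
print(hard.grad_fn)  # None
```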

Note that recreating new tensors (e.g. via x = torch.tensor(x)) will break the computation graph.
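
A small sketch of that effect, using illustrative values: the re-created tensor holds the same numbers but has no history, so nothing upstream of it would receive gradients.

```python
import torch

x = torch.ones(3, requires_grad=True)
y = x * 2                             # still attached to the graph

rewrapped = torch.tensor(y.tolist())  # brand-new tensor, history is lost

print(y.grad_fn is not None)    # True
print(rewrapped.grad_fn)        # None
print(rewrapped.requires_grad)  # False
```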