Unable to backpropagate through a network with a custom layer

My network

class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        # Trainable mel spectrogram front end (device='cuda:0')
        self.mel = Spectrogram.MelSpectrogram(n_fft=512, n_mels=hp.data.nmels, sr=16000,
                                              hop_length=160, trainable_mel=True,
                                              trainable_STFT=True, device=device)
        # torch.nn.init.xavier_uniform_(self.mel.mel_basis)
        self.LSTM_stack = nn.LSTM(hp.data.nmels, hp.model.hidden,
                                  num_layers=hp.model.num_layer, batch_first=True).cuda(device)
        for name, param in self.LSTM_stack.named_parameters():
            if 'bias' in name:
                nn.init.constant_(param, 0.0)
            elif 'weight' in name:
                nn.init.xavier_normal_(param)
        self.CNN_layer1 = torch.nn.Conv2d(161, 1, kernel_size=(1, 1)).cuda(device)
        self.projection = nn.Linear(hp.model.hidden, hp.model.proj).cuda(device)
        self.clustering = spectral_clusterer.CustomClusteringLayer()

    def forward(self, x, y):
        x = self.mel(x)
        x = x.view(-1, x.size(2), x.size(1))
        x, _ = self.LSTM_stack(x)
        x = x.view(-1, x.size(1), x.size(2), 1)
        x = self.CNN_layer1(x)
        x = x.squeeze()
        x = self.projection(x)
        x = x / torch.norm(x, dim=1).unsqueeze(1)
        # x = x.squeeze()
        x = self.clustering(x, y)
        return x

spectral_clusterer.CustomClusteringLayer() has no parameters to learn; it just performs a fixed computation. The forward pass works fine, but I usually get an error on loss.backward().

The custom layer is defined as follows:

class SpectralClusterer(torch.autograd.Function):

    @staticmethod
    def forward(ctx, X, clusters):
        gaussian_blur_sigma = 1
        p_percentile = 0.95
        thresholding_soft_multiplier = 0.01
        stop_eigenvalue = 1e-2

        def _get_refinement_operator(name):
            if name == "CropDiagonal":
                return refinement.CropDiagonal_torch()
            elif name == "GaussianBlur":
                return refinement.GaussianBlur_torch(gaussian_blur_sigma)
            elif name == "RowWiseThreshold":
                return refinement.RowWiseThreshold_torch(
                    p_percentile,
                    thresholding_soft_multiplier)
            elif name == "Symmetrize":
                return refinement.Symmetrize_torch()
            elif name == "Diffuse":
                return refinement.Diffuse_torch()
            elif name == "RowWiseNormalize":
                return refinement.RowWiseNormalize_torch()
            else:
                raise ValueError("Unknown refinement operation: {}".format(name))

        k = len(clusters)
        if not torch.is_tensor(X):
            raise TypeError("X must be a torch tensor")
        if len(X.shape) != 2:
            raise ValueError("X must be 2-dimensional")

        affinity = utils.compute_affinity_matrix_torch(X)
        refinement_sequence = DEFAULT_REFINEMENT_SEQUENCE

        for refinement_name in refinement_sequence:
            op = _get_refinement_operator(refinement_name)
            affinity = op.refine(affinity)

        cluster_ids_x, cluster_centers = KMeans(X=affinity, num_clusters=k,
                                                distance='euclidean',
                                                device=torch.device('cuda:0'))
        ctx.save_for_backward(cluster_ids_x, clusters)
        return cluster_ids_x
     
class CustomClusteringLayer(nn.Module):
    def __init__(self, name=""):
        super(CustomClusteringLayer, self).__init__()
        self.name = name

    def forward(self, x, n):
        return SpectralClusterer.apply(x, n)

Is there any way to bypass backprop in this layer?

You would need to implement the backward method manually for this custom autograd.Function, as Autograd doesn’t know how the gradients should be passed through this layer.
This can be avoided if you could use pure PyTorch methods in the forward pass, but this doesn’t seem to be the case, as you are apparently using methods from other libraries.

This tutorial gives you an example of a custom function.
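
As a rough sketch (reusing the tutorial's ReLU example rather than your clustering layer), a custom autograd.Function with both passes looks like this:

import torch

class MyReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        # save whatever backward will need to compute the gradient
        ctx.save_for_backward(input)
        return input.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        # must return one gradient (or None) per input of forward
        input, = ctx.saved_tensors
        grad_input = grad_output.clone()
        grad_input[input < 0] = 0
        return grad_input

x = torch.randn(4, requires_grad=True)
y = MyReLU.apply(x)
y.sum().backward()
print(x.grad)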


I followed the tutorial you mentioned. Moreover, all the libraries I am using are PyTorch based; nothing is implemented outside of torch. This layer does not have any parameters to learn, so what should I implement in the backward function? I tried returning None in backward, but it didn't solve the error.

If all operations in the forward method are differentiable, you could create a custom nn.Module and let Autograd create the backward pass.
autograd.Functions are used for custom forward and backward implementations.
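
For illustration, here is a minimal sketch of a parameter-free nn.Module built only from differentiable torch ops (the cosine-similarity affinity is just a stand-in, not your compute_affinity_matrix_torch), so Autograd derives the backward pass on its own:

import torch
import torch.nn as nn

class AffinityLayer(nn.Module):
    # parameter-free; no custom backward needed
    def forward(self, x):
        x = x / torch.norm(x, dim=1, keepdim=True)  # row-normalize
        return x @ x.t()                            # pairwise cosine similarities

layer = AffinityLayer()
inp = torch.randn(8, 16, requires_grad=True)
out = layer(inp)
out.sum().backward()    # works without implementing backward
print(inp.grad.shape)   # torch.Size([8, 16])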

Where can I find a list of operators which don't break the differentiability of the torch graph?

The majority of PyTorch operations are differentiable. I’m not sure if there is a list of non-differentiable operations, but if in doubt you could check the .grad_fn of the output.
If it points to a valid function, then Autograd will be able to properly backpropagate through it.
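
For example (the ops below are arbitrary placeholders):

x = torch.randn(5, 3, requires_grad=True)
y = x.softmax(dim=1)     # differentiable op
print(y.grad_fn)         # <SoftmaxBackward0 ...> -> part of the graph

z = y.argmax(dim=1)      # returns indices; hard assignments such as k-means cluster ids
print(z.grad_fn)         # None -> gradients cannot flow through z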

Note that recreating new tensors (e.g. via x = torch.tensor(x)) will break the computation graph.
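
A quick illustration of that pitfall:

x = torch.randn(3, requires_grad=True)
y = x * 2
print(y.grad_fn)          # <MulBackward0 ...> -> graph intact

y2 = torch.tensor(y)      # copies the data and drops the autograd history
print(y2.grad_fn)         # None -> backprop stops here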