Hello! I cannot get the grad of the input of my net. I have tried a lot of the solutions suggested on the forum, but none of them work. Below is part of my code. Can anybody help? Thanks a lot!

The issue is that you have some non-differentiable operations.
More generally, you should never need to call .requires_grad_() except on the Tensor whose .grad you want populated.
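To illustrate that pattern on its own, here is a minimal, self-contained sketch (with a made-up tensor standing in for your embeddings) of detaching and then marking the input as a leaf that records gradients:

```python
import torch

# Stand-in for some upstream computation whose graph we don't need
x = torch.randn(3, 5) * 2.0

# Break any existing graph, then make the tensor a leaf that records grads
x = x.detach()
x.requires_grad_()

# Any differentiable computation downstream
loss = (x ** 2).sum()
loss.backward()

# x is a leaf, so .grad is populated (here, d(sum(x^2))/dx = 2x)
print(torch.allclose(x.grad, 2 * x))
```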

Here is the updated code with comments:

# If you want to break the graph, use `.detach()`
X_emb = X_emb.detach()
# Move to the right device
X_emb = X_emb.to(device)
# The Tensor is still a leaf since it does not require gradients. Make it require them now
X_emb.requires_grad_()
# No need for retain_grad() as it is already a leaf
T = opt.T
epsilon = opt.epsilon
net.zero_grad()
#y_hat = net(X,seq_lengths)
y_hat = net_rest1(net,X_emb,seq_lengths)
# y_hat.requires_grad_() don't add extra requires_grad they are never needed
# torch.max returns (values, indices); the indices carry no gradient history
_, pred_idx = torch.max(y_hat, 1)
labels = pred_idx
y_hat = y_hat / T
print(y_hat.requires_grad)
loss = xent(y_hat,labels)
# loss.requires_grad_()
loss.backward()
print(type(X_emb))
print(X_emb.grad)

If a Tensor in the middle does not require gradients even though the input to the op does, that means the op is not differentiable and so no gradient can flow back.
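As a small illustration of that (my own example, not taken from your code): argmax returns integer indices, so its output cannot require gradients and the graph stops there:

```python
import torch

x = torch.randn(4, requires_grad=True)

y = x * 3                 # differentiable op: y still requires grad
idx = torch.argmax(y)     # non-differentiable op: integer output, no grad

print(y.requires_grad)    # True
print(idx.requires_grad)  # False -- no gradient can flow back through idx
```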