I would like to compute the gradient map of an image, i.e. the difference between adjacent pixels. I need to use the gradient map as a loss term for backpropagation to update the network parameters, similar to the TV loss used in style transfer. I am confused by two implementations I found on the Internet.
The first is:
import torch
import torch.nn.functional as F


def gradient_1order(x, h_x=None, w_x=None):
    if h_x is None and w_x is None:
        h_x = x.size()[2]
        w_x = x.size()[3]
    r = F.pad(x, (0, 1, 0, 0))[:, :, :, 1:]
    l = F.pad(x, (1, 0, 0, 0))[:, :, :, :w_x]
    t = F.pad(x, (0, 0, 1, 0))[:, :, :h_x, :]
    b = F.pad(x, (0, 0, 0, 1))[:, :, 1:, :]
    xgrad = torch.pow(torch.pow((r - l) * 0.5, 2) + torch.pow((t - b) * 0.5, 2), 0.5)
    return xgrad
The second is:
import torch
import torch.nn as nn
import torch.nn.functional as F

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


class Gradient_Net(nn.Module):
    def __init__(self):
        super(Gradient_Net, self).__init__()
        # Sobel kernel for the horizontal derivative, shaped (1, 1, 3, 3)
        kernel_x = [[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]
        kernel_x = torch.FloatTensor(kernel_x).unsqueeze(0).unsqueeze(0).to(device)
        # Sobel kernel for the vertical derivative (transpose of kernel_x)
        kernel_y = [[-1., -2., -1.], [0., 0., 0.], [1., 2., 1.]]
        kernel_y = torch.FloatTensor(kernel_y).unsqueeze(0).unsqueeze(0).to(device)
        # Fixed (non-trainable) filters
        self.weight_x = nn.Parameter(data=kernel_x, requires_grad=False)
        self.weight_y = nn.Parameter(data=kernel_y, requires_grad=False)

    def forward(self, x):
        # x is expected to be single-channel: (N, 1, H, W)
        grad_x = F.conv2d(x, self.weight_x)
        grad_y = F.conv2d(x, self.weight_y)
        gradient = torch.abs(grad_x) + torch.abs(grad_y)
        return gradient
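As far as I can tell, the second version only accepts single-channel input (the weights have shape (1, 1, 3, 3)) and shrinks the output by two pixels per dimension, because no padding is used. For comparison, here is a minimal sketch of how I think the same idea would look for multi-channel images. The function name `sobel_l1_gradient`, the use of `padding=1`, and the depthwise `groups` setting are my own assumptions, not part of the code I found, and I assume the vertical kernel is meant to be the transpose of the horizontal one.

```python
import torch
import torch.nn.functional as F


def sobel_l1_gradient(x):
    """Per-channel Sobel response |gx| + |gy| for an (N, C, H, W) tensor.

    Hypothetical helper: zero padding keeps the output the same size as x,
    and groups=C applies the same 3x3 kernel to each channel separately.
    """
    c = x.shape[1]
    kx = torch.tensor(
        [[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]],
        dtype=x.dtype, device=x.device,
    )
    ky = kx.t()  # vertical kernel assumed to be the transpose
    # One (1, 3, 3) filter per channel -> weight shape (C, 1, 3, 3)
    wx = kx.reshape(1, 1, 3, 3).repeat(c, 1, 1, 1)
    wy = ky.reshape(1, 1, 3, 3).repeat(c, 1, 1, 1)
    gx = F.conv2d(x, wx, padding=1, groups=c)
    gy = F.conv2d(x, wy, padding=1, groups=c)
    return gx.abs() + gy.abs()


if __name__ == "__main__":
    img = torch.rand(2, 3, 8, 8, requires_grad=True)
    loss = sobel_l1_gradient(img).mean()
    loss.backward()
    print(loss.item(), img.grad.shape)
```

Since the kernels are created inside the function as plain tensors, nothing here is trainable; only the input image receives gradients.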
Someone argued that the first one may lead to exploding gradients, so that the network cannot be trained successfully. Which one should I use as a loss function? Is there a better implementation?
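To make the concern concrete, here is a minimal sketch of what I understand the problem to be. The derivative of the square root is unbounded at zero, so in flat image regions the backward pass of the first version can produce NaN. The function name `gradient_magnitude` and the `eps` parameter are hypothetical additions of mine; `eps=0.0` should behave like the first snippet, and a small positive `eps` inside the square root is one common workaround, not necessarily the best one.

```python
import torch
import torch.nn.functional as F


def gradient_magnitude(x, eps=0.0):
    """Central-difference gradient magnitude of an (N, C, H, W) tensor.

    `eps` is a hypothetical stabiliser added inside the square root;
    eps=0.0 mirrors the first snippet.
    """
    h, w = x.shape[2], x.shape[3]
    r = F.pad(x, (0, 1, 0, 0))[:, :, :, 1:]   # right neighbour
    l = F.pad(x, (1, 0, 0, 0))[:, :, :, :w]   # left neighbour
    t = F.pad(x, (0, 0, 1, 0))[:, :, :h, :]   # top neighbour
    b = F.pad(x, (0, 0, 0, 1))[:, :, 1:, :]   # bottom neighbour
    sq = ((r - l) * 0.5) ** 2 + ((t - b) * 0.5) ** 2
    return torch.sqrt(sq + eps)


if __name__ == "__main__":
    # A completely flat image: every squared difference is exactly zero.
    flat = torch.zeros(1, 1, 4, 4, requires_grad=True)
    gradient_magnitude(flat).sum().backward()
    print("eps=0    any NaN in grad:", torch.isnan(flat.grad).any().item())

    flat_eps = torch.zeros(1, 1, 4, 4, requires_grad=True)
    gradient_magnitude(flat_eps, eps=1e-8).sum().backward()
    print("eps=1e-8 any NaN in grad:", torch.isnan(flat_eps.grad).any().item())
```

The second version avoids the square root by using `torch.abs`, which has a bounded derivative, so I would not expect it to show the same behaviour.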