Autograd raises an error: NaNs encountered when trying to perform matrix-vector multiplication

tritri · February 6, 2019, 4:11pm

Hi,

I’m using GPyTorch to implement a multi-output regression, but I get an error when I try to use a Periodic kernel:

RuntimeError: NaNs encountered when trying to perform matrix-vector multiplication

It seems to work with the RBF kernel, but as soon as I switch to the Periodic kernel it stops working.

The class is defined as follows:

import torch
import gpytorch

class MultitaskGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super(MultitaskGPModel, self).__init__(train_x, train_y, likelihood)

        # One constant mean per task (5 tasks)
        self.mean_module = gpytorch.means.MultitaskMean(
            gpytorch.means.ConstantMean(), num_tasks=5)

        # Periodic kernel over the inputs, combined with a rank-1 inter-task covariance
        self.covar_module = gpytorch.kernels.MultitaskKernel(
            gpytorch.kernels.PeriodicKernel(), num_tasks=5, rank=1)

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultitaskMultivariateNormal(mean_x, covar_x)

likelihood = gpytorch.likelihoods.MultitaskGaussianLikelihood(num_tasks=5)
model = MultitaskGPModel(x_train, y_train, likelihood)

and my training loop is:

    for i in range(n_iter):
        optimizer.zero_grad()
        output = model(x_train)
        loss = -mll(output, y_train)
        loss.backward()
        print('Iter %d/%d - Loss: %.3f' % (i + 1, n_iter, loss.item()))
        optimizer.step()

The error occurs when I call optimizer.step(). I think it is related to the differentiation of the variables and to the requires_grad attribute, so I tried enabling gradients on the data like this:

x_train = torch.tensor(x_train, requires_grad=True).float()
y_train = torch.tensor(y_train, requires_grad=True).float()
x_test = torch.tensor(x_test, requires_grad=True).float()
y_test = torch.tensor(y_test, requires_grad=True).float()

But it didn’t change anything…

It would be really nice to have some help! Thank you so much!

PS: PyTorch version 1.0
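
One way to narrow this down, sketched with plain PyTorch only: after loss.backward(), check which parameters carry a NaN or inf gradient before calling optimizer.step(). The helper name find_nonfinite_grads and the toy tensors w and b are made up for illustration and are not part of GPyTorch; with the real model one would presumably pass model.named_parameters() instead.

```python
import torch

def find_nonfinite_grads(named_params):
    """Return the names of parameters whose gradient contains NaN or inf."""
    bad = []
    for name, p in named_params:
        if p.grad is not None and not bool(torch.isfinite(p.grad).all()):
            bad.append(name)
    return bad

# Toy stand-in for a model (hypothetical, for illustration only).
# The backward pass of sqrt at 0 divides by zero, so w ends up
# with a NaN gradient while b gets an ordinary finite one.
w = torch.zeros(1, requires_grad=True)
b = torch.ones(1, requires_grad=True)
loss = (torch.sqrt(w) * 0.0 + b * 2.0).sum()
loss.backward()

bad = find_nonfinite_grads([("w", w), ("b", b)])
print(bad)
```

Running such a check between loss.backward() and optimizer.step() should at least show whether the NaN first appears in a kernel hyperparameter, the task covariance, or somewhere else.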

Kamer_Ali_Yuksel · April 5, 2019, 12:28pm

I ran into the same error, though not on the first iteration: it appeared only after 12 batches had run successfully. By the way, I’d suggest opening an issue on their GitHub repository, as they’re quite responsive.

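# Assumed aliases (not shown in the original post):
import gpytorch as GP
from gpytorch import variational as V

class GPRegressionLayer(GP.models.AbstractVariationalGP):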
    def __init__(self, inducing_points):
        variational_strategy = V.WhitenedVariationalStrategy(self, inducing_points,
            V.CholeskyVariationalDistribution(500), learn_inducing_locations=True)
        super(GPRegressionLayer, self).__init__(variational_strategy)
        self.mean_module = GP.means.ConstantMean()
        self.covar_module = GP.kernels.ScaleKernel(GP.kernels.RBFKernel())
    def forward(self, x):
        return GP.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x))

class DKLModel(GP.Module):
    def __init__(self, inducing_points, feature_extractor):
        super(DKLModel, self).__init__()
        self.feature_extractor = feature_extractor
        self.gp_layer = GPRegressionLayer(inducing_points)
    def forward(self, x):
        f, q, w, e = self.feature_extractor(x)
        return f[:,:-1], self.gp_layer(f[:,-1]), q, w, e