I am currently trying to transition some code from scikit-learn to GPyTorch, but I am having difficulty achieving the same accuracy as well as the same computational time.
My data has N=300 observations with D=6 input variables and M=100 output variables, so my inputs are an NxD array and my outputs are NxM. In scikit-learn I can train a GP to handle this multioutput problem with relative ease using the following code:
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

model = GaussianProcessRegressor(kernel=Matern(length_scale=6 * [1], nu=2.5))
model.fit(X_train, Y_train)
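For completeness, here is a self-contained version of the scikit-learn setup. The random arrays are only synthetic stand-ins with the same shapes as my real data:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Synthetic stand-ins with the real shapes: N=300 observations, D=6 inputs, M=100 outputs.
rng = np.random.default_rng(0)
X_train = rng.standard_normal((300, 6))
Y_train = rng.standard_normal((300, 100))

# One Matern-5/2 kernel with a length scale per input dimension, shared by all outputs.
model = GaussianProcessRegressor(kernel=Matern(length_scale=6 * [1], nu=2.5))
model.fit(X_train, Y_train)

Y_pred = model.predict(X_train)  # shape (300, 100): one column per output
```
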
and I have tried to replicate this in GPyTorch with the following code:
import gpytorch

class MultitaskGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super(MultitaskGPModel, self).__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.MultitaskMean(
            gpytorch.means.ConstantMean(), num_tasks=100
        )
        self.covar_module = gpytorch.kernels.MultitaskKernel(
            gpytorch.kernels.MaternKernel(nu=2.5, ard_num_dims=6), num_tasks=100,
        )

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultitaskMultivariateNormal(mean_x, covar_x)

likelihood = gpytorch.likelihoods.MultitaskGaussianLikelihood(num_tasks=100)
model = MultitaskGPModel(X_train, Y_train, likelihood)
Ideally, I would want the kernel and mean to be shared across the independent output GPs, which I believe is how the model is constructed in scikit-learn. However, the GPyTorch model takes significantly longer to train, does not reach the same level of accuracy, and additionally crashes in eval mode when making predictions.
I would greatly appreciate any help replicating the Sci-kit Learn model in GPyTorch.