Your code has a small issue with the shapes.

If you are using `nn.MSELoss`, the shapes of your output and target should be the same.

However, in your current code the target is missing dim1: it has shape `[1000]` while the model output has shape `[1000, 1]`, which will trigger some unwanted broadcasting inside the loss function.
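To make the broadcasting concrete, here is a small sketch with random stand-in tensors (the names `output` and `target` and their values are just placeholders, not your actual data). It computes the squared error by hand for the mismatched shapes and compares it with `nn.MSELoss` on matching shapes:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
output = torch.randn(1000, 1)  # stand-in for a model output of shape [1000, 1]
target = torch.randn(1000)     # stand-in target without dim1, shape [1000]

# Elementwise ops broadcast [1000, 1] against [1000] to [1000, 1000]
diff = output - target
print(diff.shape)  # torch.Size([1000, 1000])

# Mean over all 1000 * 1000 output/target pairs, not the 1000 matching ones
loss_broadcast = diff.pow(2).mean()

# With matching shapes each output is compared with its own target only
criterion = nn.MSELoss()
loss_matched = criterion(output, target.unsqueeze(1))
print(loss_broadcast.item(), loss_matched.item())
```

The two values generally differ, because the broadcast version averages every output against every target. Reshaping the target to `[1000, 1]`, as the next snippet does with `unsqueeze_(1)`, avoids this.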

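```
import math
import torch

# Four input features stacked column-wise -> shape [1000, 4]
train_x = torch.stack((torch.linspace(0, 10, 1000), torch.linspace(0.5, 10.5, 1000), torch.linspace(3, 14, 1000), torch.linspace(7.5, 19.5, 1000)), 1)
train_y = (train_x[:, 3]**2 - train_x[:, 2]) / 10 + torch.cos(train_x[:, 0] * (2 * math.pi)) + torch.sin(train_x[:, 1] * (2 * math.pi)) + 0.1 * torch.randn(1000)
train_y.unsqueeze_(1)  # add the missing dim1 in place -> shape [1000, 1]
print(train_x.size())  # torch.Size([1000, 4])
print(train_y.size())  # torch.Size([1000, 1])
```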

After fixing this issue, I played around with your code, and the model was able to learn an approximation of the curve using:

```
import torch
import torch.nn as nn

torch.manual_seed(2809)

class NN_Pred(nn.Module):
    def __init__(self):
        super(NN_Pred, self).__init__()
        # 4 input features -> 64 hidden units -> 2-unit bottleneck
        self.base = nn.Sequential(
            nn.Linear(4, 64),
            nn.ReLU(),
            nn.Linear(64, 2),
            nn.ReLU()
        )
        # Final regression layer -> output of shape [batch_size, 1]
        self.base_linear = nn.Linear(2, 1)
        self.train()

    def forward(self, inputs):
        x = inputs
        hidden_base = self.base(x)
        return self.base_linear(hidden_base)
```
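For completeness, a training loop for this model could look roughly like the sketch below. The optimizer (Adam), the learning rate, and the epoch count are placeholder assumptions rather than tuned values, so treat it as a starting point and not a recipe.

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(2809)

# Same data as above, with the target reshaped to [1000, 1]
train_x = torch.stack((torch.linspace(0, 10, 1000), torch.linspace(0.5, 10.5, 1000),
                       torch.linspace(3, 14, 1000), torch.linspace(7.5, 19.5, 1000)), 1)
train_y = ((train_x[:, 3]**2 - train_x[:, 2]) / 10
           + torch.cos(train_x[:, 0] * (2 * math.pi))
           + torch.sin(train_x[:, 1] * (2 * math.pi))
           + 0.1 * torch.randn(1000)).unsqueeze(1)

class NN_Pred(nn.Module):
    def __init__(self):
        super().__init__()
        self.base = nn.Sequential(
            nn.Linear(4, 64), nn.ReLU(),
            nn.Linear(64, 2), nn.ReLU()
        )
        self.base_linear = nn.Linear(2, 1)

    def forward(self, inputs):
        return self.base_linear(self.base(inputs))

model = NN_Pred()
criterion = nn.MSELoss()
# Assumed hyperparameters, chosen only for illustration
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

losses = []
for epoch in range(300):
    optimizer.zero_grad()
    output = model(train_x)            # shape [1000, 1]
    loss = criterion(output, train_y)  # shapes match, so no broadcasting
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"first loss: {losses[0]:.3f}, last loss: {losses[-1]:.3f}")
```

Plotting `output` against `train_y` after training is a quick way to check whether a given seed ended up on the curve or on a constant line.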

The model also learned the approximation with your current architecture, but it was quite sensitive to the random seed (a lot of solutions just learned a constant horizontal line). This modification is not very robust either, but it might be a good starting point for further experiments with your code.