Performing evaluation on the test set

I have implemented training together with evaluation on the test set as follows:

n_epochs = 1000
batch_size = 32

for epoch in range(n_epochs):

    model.train()
    epoch_losses = []  # reset every epoch so the printed average covers this epoch only

    # shuffle the training set and iterate over it in mini-batches
    permutation = torch.randperm(trainX.size(0))
    for i in range(0, trainX.size(0), batch_size):

        optimizer.zero_grad()
        indices = permutation[i:i + batch_size]
        batch_x_train, batch_y_train = trainX[indices], trainY[indices]

        outputs = model(batch_x_train)  # call the model directly instead of model.forward()
        train_loss = criterion(outputs, batch_y_train)
        train_loss.backward()
        optimizer.step()
        epoch_losses.append(train_loss.item())

    # validate after each epoch
    model.eval()
    y_pred = model(valX)
    val_loss = criterion(y_pred, valY)

    avg_train_loss = sum(epoch_losses) / len(epoch_losses)
    print('epoch {}, train loss {}, val loss {}'.format(epoch, avg_train_loss, val_loss.item()))

# final evaluation on the held-out test set
model.eval()
y_pred = model(testX)
test_loss = criterion(y_pred, testY)
print('test loss is {}'.format(test_loss.item()))

Is this the correct way to evaluate the model on the test set? Also, where and how should I save the model in this case (torch.save() or model.state_dict()) if in the future all I want to do is load the model and use it on the test set?

Assuming valX is a tensor holding the complete validation data, this approach is generally right, but you might of course run out of memory if that tensor is too large.
The usual approach would be to wrap the data in a Dataset and DataLoader and get the predictions for each batch. The data loading tutorial gives you some information on how to create a Dataset and DataLoader.
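For example, here is a minimal sketch of batched validation, assuming valX and valY are in-memory tensors and criterion returns the mean loss over a batch (as nn.MSELoss does by default):

import torch
from torch.utils.data import TensorDataset, DataLoader

val_loader = DataLoader(TensorDataset(valX, valY), batch_size=32)

model.eval()
total_loss = 0.0
with torch.no_grad():
    for batch_x, batch_y in val_loader:
        y_pred = model(batch_x)
        # weight by the batch size so the last, possibly smaller, batch
        # contributes proportionally to the overall average
        total_loss += criterion(y_pred, batch_y).item() * batch_x.size(0)
val_loss = total_loss / len(val_loader.dataset)

The same loop works for the test set by swapping in testX and testY.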

Also, to save memory during validation and testing, you could wrap the validation and test code in a with torch.no_grad() block.

Do you mean that for the validation and test sets the code should be:

with torch.no_grad():
    model.eval()
    y_pred = model(valX)
    val_loss = criterion(y_pred, valY)

and

with torch.no_grad():
    model.eval()
    y_pred = model(testX)
    test_loss = criterion(y_pred, testY)

Also, what about the second part of my question regarding the best way to save the model?

The usage of no_grad() and model.eval() is correct, and if you are not running out of memory when using the complete datasets, your approach should work.

To store the model, you should save its state_dict(), as described in the Saving and Loading Models tutorial.
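A minimal sketch of that workflow (MyModel and model.pt are placeholders for your own model class and file name):

import torch

# after training: save only the learned parameters
torch.save(model.state_dict(), 'model.pt')

# later, e.g. in a separate script: rebuild the model and load the parameters
model = MyModel()  # placeholder: must be the same architecture that was trained
model.load_state_dict(torch.load('model.pt'))
model.eval()  # switch to evaluation mode before running on the test set

Saving the state_dict rather than the whole model object keeps the checkpoint independent of the exact source file layout at save time, which makes it more robust to later refactoring.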