Difference in Model Performance on the Validation Set vs. the Testing Set

I have implemented a PyTorch NN code for classification and regression.

Classification:
a) Use StratifiedKFold for cross-validation (K=10, i.e. 10-fold cross-validation).
I divided the data as follows:
Suppose I have 100 samples: 10 for testing, 18 for validation, 72 for training.

b) Loss function = CrossEntropyLoss
c) Optimization = SGD
d) Early stopping with a patience (wait time) of 100 epochs.
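A minimal sketch of the split described in (a), using scikit-learn's `StratifiedKFold` plus a stratified train/validation split. The toy data and variable names are illustrative, not from the question:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, train_test_split

# Toy stand-in for the real dataset: 100 samples, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = rng.integers(0, 2, size=100)

# 10-fold stratified CV: each fold holds out 10 samples for testing.
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for train_val_idx, test_idx in skf.split(X, y):
    # Split the remaining 90 samples into 72 train / 18 validation (80/20),
    # again stratified by label.
    train_idx, val_idx = train_test_split(
        train_val_idx, test_size=0.2,
        stratify=y[train_val_idx], random_state=0)
    print(len(train_idx), len(val_idx), len(test_idx))  # 72 18 10
```

Each of the 10 folds then yields exactly the 72/18/10 proportions described above, with class proportions preserved in every subset.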

The problem is:
Baseline accuracy = 51%
Accuracy on training set = 100%
Accuracy on validation set = 90%
Accuracy on testing set = 72%

I don’t understand the reasons behind the huge performance difference between the validation data and the testing data.

How can I solve this problem?

Regression:
a) Use the same network structure.
b) Loss function = MSELoss
c) Optimization = SGD
d) Early stopping with a patience (wait time) of 100 epochs.
e) Use K-fold for cross-validation.
I divided the data as follows:
Suppose I have 100 samples: 10 for testing, 18 for validation, 72 for training.
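For (d), a minimal, framework-agnostic sketch of early stopping with a patience ("wait time") of 100 epochs. `early_stop_epoch` and the loss values are illustrative stand-ins for a real per-epoch validation-loss history:

```python
def early_stop_epoch(val_losses, patience=100):
    """Return the epoch at which training stops (no validation improvement
    for `patience` consecutive epochs), or None if it never triggers."""
    best_loss, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # In a real loop, the model checkpoint would be saved here.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch
    return None

# Validation loss improves for 3 epochs, then plateaus: training stops
# 100 epochs after the best epoch (epoch 2).
print(early_stop_epoch([3.0, 2.0, 1.0] + [1.5] * 200))  # 102
```

Note that with a patience this large relative to a 72-sample training set, the network has ample opportunity to memorize the training data, which is consistent with the near-zero training error reported below.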

The problem is:
Baseline MSE = 14.0
MSE on training set = 0.0012
MSE on validation set = 6.45
MSE on testing set = 17.12

I don’t understand the reasons behind the huge performance difference between the validation data and the testing data.

How can I solve these problems? Or is this expected behaviour for a NN, dependent on the particular dataset?

You might be seeing some generalization error in your first example.
Chapter 40 of Ng’s Machine Learning Yearning might give you some more information, as might this Wikipedia article.
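Part of the validation/test gap may also just be estimation noise: with only 10 test samples, a single misclassified example moves test accuracy by 10 percentage points. A rough sketch of the binomial standard error of an accuracy estimate, using the numbers from the question (the formula assumes i.i.d. samples):

```python
import math

def accuracy_std_error(p_hat, n):
    """Binomial standard error of an accuracy estimate p_hat on n samples."""
    return math.sqrt(p_hat * (1.0 - p_hat) / n)

# Test set: 72% accuracy on 10 samples -> roughly +/- 14 points of noise.
print(round(accuracy_std_error(0.72, 10), 3))
# Validation set: 90% accuracy on 18 samples -> roughly +/- 7 points.
print(round(accuracy_std_error(0.90, 18), 3))
```

With error bars that wide, a 90% vs. 72% gap on these sample sizes is not necessarily evidence of anything beyond small-sample variance; averaging the test metrics across all 10 folds would give a much more stable estimate.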