Advice: Improving a Basic Machine Learning Model

I have a machine learning model designed as follows:

	import torch.nn as nn
	import torch.optim as optim
	from collections import OrderedDict

	model = nn.Sequential(OrderedDict([
	    ('fc1', nn.Linear(631, 500)),
	    # ('drop1', nn.Dropout(0.5)),   # dropout was tried after each block and commented out
	    ('relu1', nn.ReLU(inplace=True)),

	    # Earlier experiment with wider layers, also commented out:
	    # ('fc1B', nn.Linear(1000, 2000)),
	    # ('relu1B', nn.ReLU(inplace=True)),
	    # ('fc1C', nn.Linear(2000, 400)),
	    # ('relu1C', nn.ReLU(inplace=True)),

	    ('fc2', nn.Linear(500, 400)),
	    ('relu2', nn.ReLU(inplace=True)),

	    ('fc2B', nn.Linear(400, 300)),
	    ('relu2B', nn.ReLU(inplace=True)),

	    ('fc3', nn.Linear(300, 200)),
	    ('relu3', nn.ReLU(inplace=True)),

	    ('fc4', nn.Linear(200, 100)),
	    ('relu4', nn.ReLU(inplace=True)),

	    ('fc4B', nn.Linear(100, 50)),
	    ('relu4B', nn.ReLU(inplace=True)),

	    ('fc5', nn.Linear(50, 25)),
	    ('relu5', nn.ReLU(inplace=True)),

	    ('fc5B', nn.Linear(25, 12)),
	    ('relu5B', nn.ReLU(inplace=True)),

	    ('fc6', nn.Linear(12, 3)),
	    ('output', nn.LogSoftmax(dim=1)),
	]))

It’s a very basic model. My criterion and optimizer are as follows:

	criterion = nn.CrossEntropyLoss()
	optimizer = optim.SGD(model.parameters(), lr=ext_learning_rate)  # ext_learning_rate is set per run

I have tried learning rates from 0.00025 (and smaller) up to the unreasonable value of 100, and still my results vary wildly: some runs reach accuracies of up to 82.5% while others land at 1.4%. This is not my first model; I have tried others with fewer layers, etc.
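
For reference, the kind of sweep I run looks roughly like this (a sketch; `build_model` and `train_and_evaluate` are hypothetical stand-ins for the model construction and training loop above):

	# Sketch of a learning-rate sweep; build_model() and train_and_evaluate()
	# are hypothetical helpers standing in for my actual setup.
	for ext_learning_rate in [0.00025, 0.001, 0.01, 0.1, 1.0, 100.0]:
	    model = build_model()  # fresh weights for every run
	    optimizer = optim.SGD(model.parameters(), lr=ext_learning_rate)
	    accuracy = train_and_evaluate(model, criterion, optimizer)
	    print(f'lr={ext_learning_rate}: accuracy={accuracy:.1f}%')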

Any advice on how to improve this model? And a broader question: does anyone know how I can learn to build better models in general, through a more systematic approach?

If you’re using nn.CrossEntropyLoss you should remove the last nn.LogSoftmax from your model, since nn.CrossEntropyLoss applies log_softmax internally and expects raw logits.
In case you would like to keep the nn.LogSoftmax layer, you would need to use nn.NLLLoss instead.
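
As a quick sanity check, the two valid pairings compute the same loss on raw logits (a minimal sketch; the shapes are assumed for illustration):

	import torch
	import torch.nn as nn

	logits = torch.randn(8, 3)           # raw model outputs: batch of 8, 3 classes
	targets = torch.randint(0, 3, (8,))  # integer class labels

	# Option 1: raw logits into nn.CrossEntropyLoss (log_softmax applied internally)
	loss_ce = nn.CrossEntropyLoss()(logits, targets)

	# Option 2: keep nn.LogSoftmax in the model and pair it with nn.NLLLoss
	log_probs = nn.LogSoftmax(dim=1)(logits)
	loss_nll = nn.NLLLoss()(log_probs, targets)

	assert torch.allclose(loss_ce, loss_nll)  # the two pairings are equivalent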


I made that change, but there is no real difference in the performance of the model. Any advice is appreciated.

In that case, I would remove some layers and start with a really simple baseline model (e.g. just using two layers).
If the training loss still doesn’t change at all, some other bug might be in your training procedure.
However, if the loss moves down, you can try to scale up the model layer by layer.
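
For example, a two-layer baseline could look like the sketch below (your input size of 631 and 3 output classes are kept; the hidden size of 128 is just an assumption):

	# Minimal two-layer baseline: outputs raw logits, to be paired with nn.CrossEntropyLoss
	baseline = nn.Sequential(OrderedDict([
	    ('fc1', nn.Linear(631, 128)),
	    ('relu1', nn.ReLU(inplace=True)),
	    ('fc2', nn.Linear(128, 3)),
	]))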

This is a very good piece of advice, and I’ll definitely take it. Maybe you should consider writing a blog post about improving NN models, just covering the basics; I would read it. Thanks!