Use Pre-Trained CNN for State Farm Kaggle Competition

A workbook with fast.ai lessons 3-4
With a small training set and enough training time, you can always fit the training data to 100% accuracy.

Applying the VGG16 model with fast.ai's default finetune to the Kaggle State Farm distracted driver detection competition data, I used the following parameters (a code sketch follows the list):

batch_size = 60
epochs = 200
train data size = 100 images belonging to 10 classes
validation data size = 50 images belonging to 10 classes
Weight decay = 0.0
Learning rate = 0.001
Optimizer = Adam
Learning rate decay = 0.0
Dropout = 0.5 on two fully connected layers (4096 neurons, activation = ReLU)
No batch normalization
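As a rough sketch, this setup corresponds to the following, assuming the Keras 1.x Vgg16 wrapper (vgg16.py) from the fast.ai course notebooks; the data paths are hypothetical placeholders:

from vgg16 import Vgg16

path = 'data/statefarm/sample/'  # hypothetical sample directory
batch_size = 60

vgg = Vgg16()
batches = vgg.get_batches(path + 'train', batch_size=batch_size)
val_batches = vgg.get_batches(path + 'valid', batch_size=batch_size)
vgg.finetune(batches)  # swap the 1000-way ImageNet softmax for a 10-way one
vgg.fit(batches, val_batches, nb_epoch=200)

The wrapper's default compile uses Adam with lr = 0.001, which matches the parameters listed above.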
Around epoch 180, the training-accuracy curve crosses the accuracy = 1 line (the red dot in the plot), meaning the model reaches exactly 100% accuracy on the training data.

However, we are clearly overfitting: the validation accuracy is only 0.5.

To estimate the time cost of training on the complete data set with a VGG16 model (with the last layer replaced and only the dense layers trainable), I applied the following parameters (a sketch of this setup follows the list):

batch_size = 60
train data size = 17943 images belonging to 10 classes (20% of the original train data is held out as validation)
validation data size = 4481 images belonging to 10 classes
Weight decay = 0.0
Learning rate = 0.001
Optimizer = Adam
Learning rate decay = 0.0
Dropout = 0.5 on two fully connected layers (4096 neurons, activation = ReLU)
No batch normalization
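A sketch of this variant, again assuming Keras 1.x and the course's Vgg16 wrapper; the last layer is replaced with a 10-way softmax and only the dense layers are left trainable:

from keras.layers.core import Dense
from keras.optimizers import Adam
from vgg16 import Vgg16

vgg = Vgg16()
model = vgg.model
model.pop()  # drop the original 1000-way softmax
for layer in model.layers:
    layer.trainable = isinstance(layer, Dense)  # freeze everything but the dense layers
model.add(Dense(10, activation='softmax'))
model.compile(optimizer=Adam(lr=0.001),
              loss='categorical_crossentropy', metrics=['accuracy'])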
It takes about 10 minutes to train the network for a single epoch!

So if I have to check 10 learning rates, 10 weight decays, 3 dropout rates, 3 momentum values (or learning-rate decays), 3 batch sizes, and 2 optimizers, it will take about (10 + 10 + 3 + 3 + 3 + 2) × 10 = 310 minutes, roughly 5 hours, for a single epoch per setting (assuming no grid search, i.e., varying one parameter at a time).
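Spelled out as a quick sanity check:

runs = 10 + 10 + 3 + 3 + 3 + 2  # one run per parameter value, no grid search
minutes_per_epoch = 10
print(runs * minutes_per_epoch)  # 310 minutes, roughly 5 hours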

As a general rule, we would want to run each setting longer, say 10 epochs, which means 50 hours of training! And 10 epochs are probably still not long enough!

That is too long for me to find the optimal settings. I need to use less data and less time to find a meaningful approximation of the best parameters.

In addition, there are 79,726 test images, and it takes about 30 minutes to generate predictions for them.
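With the course wrapper, generating those predictions looks roughly like this (the test path is a placeholder):

batches, preds = vgg.test('data/statefarm/test/', batch_size=batch_size * 2)
# preds has shape (79726, 10): one probability per class for each test image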

Anyway, here is the plot of accuracy versus epochs for this run. The validation accuracy is better than the training accuracy, which could indicate underfitting, since we applied dropout.
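For reference, such a plot can be produced from the History object that a Keras fit call returns; a minimal helper, assuming the Keras 1.x history keys 'acc' and 'val_acc':

import matplotlib.pyplot as plt

def plot_accuracy(hist):
    # hist: the History object returned by a Keras fit call
    plt.plot(hist.history['acc'], label='train')
    plt.plot(hist.history['val_acc'], label='validation')
    plt.xlabel('epoch')
    plt.ylabel('accuracy')
    plt.legend()
    plt.show()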

By submitting this model to Kaggle, I got a score of 1.55869, ranking around 675 on Kaggle’s private leaderboard.
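The submission file itself can be assembled from the test predictions; a hypothetical sketch, reusing batches and preds from the test step above (State Farm expects one probability column per class, c0 through c9):

import pandas as pd

filenames = [f.split('/')[-1] for f in batches.filenames]
submission = pd.DataFrame(preds, columns=['c%d' % i for i in range(10)])
submission.insert(0, 'img', filenames)
submission.to_csv('submission.csv', index=False)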

While there are a few ideas we could try to improve the validation error (for example, reducing regularization), it’s best to follow a systematic process to tackle the problem.

Since we are only training the dense layers here, we should probably split off the convolutional layers and precompute their output to speed up training.

To save time training and tuning the hyper-parameters, I used the pre-trained VGG16 model as a fixed feature extractor, as follows.

First, load the original VGG16 model.
Then delete all the layers after the last Flatten layer at the end of the convolutional part, i.e., after this layer (from model.summary()):

flatten_2 (Flatten) (None, 25088) 0
I then ran predictions on the train, validation, and test data with this truncated model. The resulting features (shape: 25088 × 1 for each image) are the inputs to my new model (see below) for the State Farm photos.
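A sketch of the truncation and feature pre-computation, assuming Keras 1.x and reusing path and batch_size from the earlier sketches:

from keras.models import Sequential
from keras.layers.core import Flatten
from vgg16 import Vgg16

vgg = Vgg16()
layers = vgg.model.layers
# index of the last Flatten layer, i.e. the end of the convolutional part
last_flatten = max(i for i, l in enumerate(layers) if isinstance(l, Flatten))
conv_model = Sequential(layers[:last_flatten + 1])

# precompute the 25088-dim features once per data set; shuffle=False keeps
# the features aligned with labels and filenames
trn_batches = vgg.get_batches(path + 'train', shuffle=False, batch_size=batch_size)
trn_features = conv_model.predict_generator(trn_batches, trn_batches.nb_sample)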
I then created a minimal fully connected network with one hidden Dense layer (which has only one neuron) and a softmax output layer.
from keras.models import Sequential
from keras.layers.core import Flatten, Dense, Dropout, Lambda
from keras.layers import BatchNormalization
from keras.optimizers import SGD, RMSprop, Adam

lr = 0.001
statefarm_model = Sequential()
# a single hidden neuron on top of the 25088-dim precomputed conv features
statefarm_model.add(Dense(1, activation='relu', input_shape=(25088,)))
statefarm_model.add(Dense(10, activation='softmax'))
statefarm_model.compile(optimizer=Adam(lr=lr),
                        loss='categorical_crossentropy',
                        metrics=['accuracy'])
statefarm_model.summary()
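With the features precomputed, each epoch is just a pass through this tiny dense network. A hypothetical training call, assuming trn_features and val_features from the step above and one-hot encoded labels trn_labels and val_labels:

statefarm_model.fit(trn_features, trn_labels,
                    validation_data=(val_features, val_labels),
                    batch_size=60, nb_epoch=10)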
