[Caffe2] Transfer learning of squeezenet - retrain last layer on GPU

TheMaikXX · February 5, 2019, 11:08am

Hello,
I’m trying to retrain the last layer of the SqueezeNet model downloaded from Caffe2’s GitHub model-zoo repo. I’m now at the point where I can retrain this model on the CPU… (mainly by following this tutorial… but without GPU). Now I want to move on and train the model on the GPU, but I have come across some issues… First, I was not able to set up DeviceOption (from the caffe2_pb2 namespace) because caffe2_pb2.DeviceOption doesn’t have a cuda_gpu_id property (I found in the PB definition in pytorch/caffe2/proto/caffe2.proto that DeviceOption has device_id instead). I then switched to core.DeviceOption, but I don’t know whether that is correct or not…
After this, other errors showed up (or rather, different errors appeared seemingly at random). The first one is a problem with the protobuf structure of model.net… CHECK failed: (index) < (current_size_) or Segmentation fault (core dumped)
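
To make the field-name mismatch concrete, here is a minimal sketch of a helper that sets the GPU index under whichever name the DeviceOption object exposes. The helper name set_gpu_id is made up for illustration, and plain Python objects stand in for the protobuf message so the snippet runs without Caffe2. It assumes, as far as I can tell from caffe2.proto, that the field is called device_id in newer definitions and cuda_gpu_id in older ones.

```python
from types import SimpleNamespace


def set_gpu_id(device_option, gpu_id):
    """Set the GPU index on a DeviceOption-like object, whichever field it has.

    Hypothetical helper: returns the name of the field that was set.
    """
    # Assumption: newer caffe2.proto definitions call the field `device_id`,
    # older ones (and older tutorials) call it `cuda_gpu_id`.
    for field in ("device_id", "cuda_gpu_id"):
        if hasattr(device_option, field):
            setattr(device_option, field, gpu_id)
            return field
    raise AttributeError("no known GPU index field on this DeviceOption")


# Stand-ins for the two proto versions (illustration only, not real protobufs)
new_style = SimpleNamespace(device_id=0)
old_style = SimpleNamespace(cuda_gpu_id=0)

print(set_gpu_id(new_style, 1))  # prints "device_id"
print(set_gpu_id(old_style, 1))  # prints "cuda_gpu_id"
```

With Caffe2 installed, the same helper should presumably work on a caffe2_pb2.DeviceOption() instance, since protobuf messages expose their fields as attributes. Calling core.DeviceOption(caffe2_pb2.CUDA, 0) with a positional device index avoids naming the field at all, which is probably why switching to it got past the first error.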

I have been experimenting with "with core.DeviceScope(...)" statements wrapping the NetDef initialization and the last layer’s reinitialization, and I also tried copying the DeviceOption into the nets via NetDef.device_option.CopyFrom, all of that in combination with model.RunAllOnGPU.
But with no luck.

Is there any recommended approach for this?

Thanks in advance.

Code:

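from caffe2.proto import caffe2_pb2
from caffe2.python import core, model_helper
from caffe2.python.modeling import initializers
from caffe2.python.modeling.parameter_info import ParameterTags

# classCount - number of labels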
def LoadAndTranslateSqueezenetModelv2(name,
	lmdbPath, classCount, batchSize, imageDimension,
	initNetPath, predictNetPath, deviceOpts,
	learningRate=10**-2):

	# with core.DeviceScope(deviceOpts):
	model = model_helper.ModelHelper(name, arg_scope={
		'order': 'NCHW',
		'use_cudnn': True
	})


	predNetPb = caffe2_pb2.NetDef()
	with open(predictNetPath, 'rb') as f:
		predNetPb.ParseFromString(f.read())
	
	initNetPb = caffe2_pb2.NetDef()
	with open(initNetPath, 'rb') as f:
		initNetPb.ParseFromString(f.read())

	# model.RunAllOnGPU()
	for op in initNetPb.op:
		if op.output[0] in ['conv10_w', 'conv10_b']:
			tag = (ParameterTags.WEIGHT if op.output[0].endswith('_w') else ParameterTags.BIAS)
			# create params inside model
			model.create_param(op.output[0], op.arg[0], initializers.ExternalInitializer(), tags=tag)
	
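	# remove the conv10_w and conv10_b ops from the protobuf (indices 50 and 51);
	# after the first pop, the op formerly at index 51 shifts down to index 50
	# these ops were already added to the model in the for loop above (cannot add them again)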
	initNetPb.op.pop(50)
	initNetPb.op.pop(50)
	
	model.param_init_net = core.Net(initNetPb)
	model.param_init_net.XavierFill([], 'conv10_w', shape=[classCount, 512, 1, 1])
	model.param_init_net.ConstantFill([], 'conv10_b', shape=[classCount])

	model.net = core.Net(predNetPb)
	model.Squeeze("softmaxout", "softmax", dims=[2, 3])

	# creates cross-entropy and averaged loss, builds SGD for every param of the model
	ScaffoldModelTrainingOperatorsSqueezenet(model, 'softmax', 'label', 0.1)

	# helper containing lines like model.net.Proto().device_option.CopyFrom(...),
	# then reassigning the result back to the model, for both param_init_net and net
	# InscribeDeviceOptionsToModel(model, deviceOpts)

	return model
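
The two pop(50) calls in the code above rely on conv10_w and conv10_b sitting at fixed positions in the init net. Below is a minimal sketch of removing them by output name instead, so the code does not depend on op order. The helper name drop_ops_by_output and the blob names other than conv10_w and conv10_b are made up for illustration, and plain Python objects stand in for the protobuf ops so the snippet runs without Caffe2. I have not verified whether the hard-coded indices are related to the CHECK failed error.

```python
from types import SimpleNamespace


def drop_ops_by_output(ops, blob_names):
    """Delete, in place, every op whose first output is in blob_names.

    Hypothetical helper: returns the removed output names in original order.
    """
    removed = []
    # Walk backwards so that deleting an element does not shift
    # the indices that are still to be visited.
    for i in reversed(range(len(ops))):
        if ops[i].output and ops[i].output[0] in blob_names:
            removed.append(ops[i].output[0])
            del ops[i]
    return removed[::-1]


# Stand-in ops (illustration only; "conv1_w" and "fire9_b" are placeholder names)
ops = [SimpleNamespace(output=[name])
       for name in ["conv1_w", "conv10_w", "conv10_b", "fire9_b"]]

removed = drop_ops_by_output(ops, {"conv10_w", "conv10_b"})
print(removed)                        # ['conv10_w', 'conv10_b']
print([op.output[0] for op in ops])   # ['conv1_w', 'fire9_b']
```

In the function above this would presumably be called as drop_ops_by_output(initNetPb.op, {'conv10_w', 'conv10_b'}), since repeated protobuf fields support len(), indexing and del like a list.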