Pretrained VGG-Face model

I have searched for a VGG-Face pretrained model in PyTorch but couldn’t find one. Is there a GitHub repo with a pretrained VGG-Face model for PyTorch?

4 Likes

Hi! I hope it’s not too late.
I found a page with details about the VGG-Face model along with its weights at the link below. Scroll down to the VGG-Face section and download what you need.

http://www.robots.ox.ac.uk/~albanie/pytorch-models.html

Hope this helps.

7 Likes

Thank you @shashankvkt.


Hi, is this loadable into the torchvision VGG16 model?

I don’t think it’s available as a torchvision model. You still have to load the pretrained weights manually.
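
For reference, here is a minimal sketch of what manual loading can look like. The file and function names (vgg_face_dag.py, vgg_face_dag, vgg_face_dag.pth) are only placeholders for whatever definition script and weights file you download from the page above:

import torch

# Placeholder names: substitute the definition script and .pth file you downloaded
# from http://www.robots.ox.ac.uk/~albanie/pytorch-models.html
from vgg_face_dag import vgg_face_dag

model = vgg_face_dag()                                    # build the architecture from the downloaded definition
state_dict = torch.load('vgg_face_dag.pth', map_location='cpu')
model.load_state_dict(state_dict)                         # keys match the downloaded definition, not torchvision's VGG16
model.eval()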

I managed to load them manually, thanks for your response.

from torchvision import transforms


def compose_transforms(meta, resize=256, center_crop=True,
                       override_meta_imsize=False):
    """
    Compose preprocessing transforms for model
    The imported models use a range of different preprocessing options,
    depending on how they were originally trained. Models trained in MatConvNet
    typically require input images that have been scaled to [0,255], rather
    than the [0,1] range favoured by PyTorch.
    Args:
        meta (dict): model preprocessing requirements
        resize (int) [256]: resize the input image to this size
        center_crop (bool) [True]: whether to center crop the image
        override_meta_imsize (bool) [False]: if true, use the value of `resize`
           to select the image input size, rather than the properties contained
           in meta (this option only applies when center cropping is not used).
    Return:
        (transforms.Compose): Composition of preprocessing transforms
    """
    normalize = transforms.Normalize(mean=meta['mean'], std=meta['std'])
    im_size = meta['imageSize']
    assert im_size[0] == im_size[1], 'expected square image size'
    if center_crop:
        transform_list = [transforms.Resize(resize),
                          transforms.CenterCrop(size=(im_size[0], im_size[1]))]
    else:
        if override_meta_imsize:
            im_size = (resize, resize)
        transform_list = [transforms.Resize(size=(im_size[0], im_size[1]))]
    transform_list += [transforms.ToTensor()]
    if meta['std'] == [1, 1, 1]:  # common amongst mcn models
        transform_list += [lambda x: x * 255.0]  # rescale [0,1] tensors to the [0,255] range expected by MatConvNet-trained models
    transform_list.append(normalize)
    return transforms.Compose(transform_list)

The model uses the above preprocessing function. Which transforms are actually applied here?

Also, why is the following transform used?

if meta['std'] == [1, 1, 1]:  # common amongst mcn models
    transform_list += [lambda x: x * 255.0]
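
For illustration, here is roughly what that pipeline does end to end, with example MatConvNet-style meta values (the mean vector is the VGGFace2 ResNet-50 one mentioned further down this thread; treat the exact numbers as placeholders). ToTensor() scales the image to [0, 1], the x * 255.0 lambda brings it back to the [0, 255] range that MatConvNet-trained models expect, and Normalize then subtracts the per-channel mean (with a std of 1, only the mean is removed):

from PIL import Image

# Example meta in MatConvNet style: std of 1, so normalization only subtracts the mean.
# Mean values taken from the VGGFace2 ResNet-50 discussion below; adjust for your model.
meta = {'mean': [131.0912, 103.8827, 91.4953],
        'std': [1, 1, 1],
        'imageSize': [224, 224, 3]}

preproc = compose_transforms(meta)           # Resize(256) -> CenterCrop(224) -> ToTensor -> *255 -> Normalize
img = Image.open('face.jpg').convert('RGB')  # 'face.jpg' is a placeholder path
x = preproc(img).unsqueeze(0)                # shape [1, 3, 224, 224], roughly zero-centred per channel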

Hello,
I’ve been trying to use the ResNet50 model from the author’s site and downloaded the model/weights. My main use of this is to pass in a batch of images and extract a feature map. I was able to do this successfully with the original PyTorch implementation; however, with the same setup here I get an error related to the batch and the dimensions. I tried tracing back to see if I could adjust something, but I can’t find where to fix it. Any ideas? I’m using images of shape [B,C,H,W]=[8,3,256,256]. I know ResNet is supposed to take 224x224, but 256 was working and is needed for the feature-map size I require.

It seems like part of the issue might be the batch handling itself?

Here is the error:

Traceback (most recent call last):
  File "C:/Users/a-pollack/Projects/de_ident/de_ident.py", line 27, in <module>
    gen_losses, dis_losses, gen, dis = train_net(num_epochs=5, batch_size=8, lr=1e-4, betas=(0.5, 0.99))
  File "C:\Users\a-pollack\Projects\de_ident\train.py", line 122, in train_net
    z_raw, m, z_masked = gen(data[0])
  File "C:\Users\a-pollack\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\a-pollack\Projects\de_ident\generator.py", line 75, in forward
    face_descriptor = fd_conv2(x)
  File "C:\Users\a-pollack\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\a-pollack\Projects\de_ident\generator_utils.py", line 54, in forward
    x = self.features(x)
  File "C:\Users\a-pollack\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\a-pollack\Anaconda3\lib\site-packages\torch\nn\modules\container.py", line 92, in forward
    input = module(input)
  File "C:\Users\a-pollack\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\a-pollack\Anaconda3\lib\site-packages\torch\nn\modules\conv.py", line 345, in forward
    return self.conv2d_forward(input, self.weight)
  File "C:\Users\a-pollack\Anaconda3\lib\site-packages\torch\nn\modules\conv.py", line 342, in conv2d_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size 256 64 1 1, expected input[8, 256, 64, 64] to have 64 channels, but got 256 channels instead

The error points to a wrong number of input channels to this particular convolution.
This might be caused by a wrong reshape or by mixing up layers.
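
A minimal sketch that reproduces the same kind of mismatch, just to show what the message means (the layer and shapes mirror the traceback, not your actual code):

import torch
import torch.nn as nn

# A 1x1 conv declared with in_channels=64 rejects an activation that already has 256 channels,
# which is the mismatch reported in the traceback above.
conv = nn.Conv2d(in_channels=64, out_channels=256, kernel_size=1)
x = torch.randn(8, 256, 64, 64)  # [B, C, H, W] with C=256
try:
    conv(x)
except RuntimeError as e:
    print(e)  # "... expected input[8, 256, 64, 64] to have 64 channels, but got 256 channels instead"

# The fix is to make sure whatever feeds fd_conv2 really outputs 64 channels,
# or to declare that conv with in_channels=256.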

Do we provide images in RGB or BGR format to the VGGFace2 ResNet50 model?

It doesn’t say anything except that the model was trained with the Caffe framework, and Caffe uses a BGR channel order when reading image files. It also says the mean image vector is [131.0912, 103.8827, 91.4953], but judging from the values this mean vector appears to be in RGB order (there is more red than blue in face images). Does anybody know for sure which is correct?

You are right. I use PIL to read images, which returns them in RGB format. In case you use OpenCV to read images, please consider converting BGR to RGB.
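
If you do read images with OpenCV, a small sketch of the conversion (using a placeholder path 'face.jpg'):

import cv2
from PIL import Image

# OpenCV loads images in BGR order, so convert to RGB before the PIL/torchvision preprocessing above.
bgr = cv2.imread('face.jpg')
rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)
img = Image.fromarray(rgb)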

1 Like