How to use a state_dict pth model to classify molecular image files?

juiocollm · March 15, 2021, 8:22pm

Using Windows/Anaconda/Spyder with Python 3.8, I managed to obtain an optimized trained model that classifies thousands of molecules, from their *.png images, as binding (1) or non-binding (0) to a protein.

The optimized model is stored in a .pth file, named by the DEEPScreen program I used: "AB_best_val-AB-CNNModel1-512-256-0.001-32-0.5-50-VA2-state_dict.pth", or "VA2.pth" for short.

I would like to use the optimized VA2.pth model to infer the 0s or 1s of hundreds of thousands of individual molecular image files, already prepared and located in the imgs directory (imgs/molecule1.png, molecule2.png, molecule3.png, …).

The only thing I managed was to look inside VA2.pth (a large collection of numbers) using this script:
model = torch.load("VA2.pth")
print(model)

I tried to continue with other lines of code suggested on the web and in the documentation (I am new to PyTorch), but I could not get anything else to work!

Could somebody help me?

Dwight_Foster · March 15, 2021, 9:02pm

If you saved the state dict of the model, you will still have to define the model before loading the state dict. The file is not the complete model, only the weights of each layer. So, for example, you will need to do this:

class Model(nn.Module):
    # Define the model architecture here.
    ...

model = Model()
model.load_state_dict(torch.load("VA2.pth"))
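
To make the distinction concrete, here is a minimal sketch of the save/load round trip. `TinyNet` is a hypothetical stand-in (it is not the DEEPScreen model), and an in-memory buffer replaces the .pth file so the example runs on its own.

```python
import io

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical two-class model, used only to illustrate state dicts."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)


# Save only the weights, as a training script would do with a .pth file.
trained = TinyNet()
buffer = io.BytesIO()
torch.save(trained.state_dict(), buffer)
buffer.seek(0)

# torch.load returns a mapping from parameter names to tensors, not a model.
state = torch.load(buffer)
state_keys = list(state.keys())

# The architecture has to be rebuilt in code before the weights can be loaded.
restored = TinyNet()
restored.load_state_dict(state)
restored.eval()
```

Printing `state_keys` instead of the whole object gives a readable list of layer names, which can help when checking that a class definition matches a saved file.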

juiocollm · March 16, 2021, 7:58am

Thank you, Dwight.
I really appreciate it.

I am not sure I understood correctly.
The only model I used is the one in models.py inside the DEEPScreen training program.
That file is the following:

import torch
import torch.nn as nn
import torch.nn.functional as F
from operator import itemgetter

class CNNModel1(nn.Module):
    def init(self, fully_layer_1, fully_layer_2, drop_rate):
        super(CNNModel1, self).init()

        self.conv1 = nn.Conv2d(3, 32, 2)
        self.bn1 = nn.BatchNorm2d(32)
        self.conv2 = nn.Conv2d(32, 64, 2)
        self.bn2 = nn.BatchNorm2d(64)
        self.conv3 = nn.Conv2d(64, 128, 2)
        self.bn3 = nn.BatchNorm2d(128)
        self.conv4 = nn.Conv2d(128, 64, 2)
        self.bn4 = nn.BatchNorm2d(64)
        self.conv5 = nn.Conv2d(64, 32, 2)
        self.bn5 = nn.BatchNorm2d(32)

        self.pool = nn.MaxPool2d(2, 2)
        self.drop_rate = drop_rate
        self.fc1 = nn.Linear(32*5*5, fully_layer_1)
        self.fc2 = nn.Linear(fully_layer_1, fully_layer_2)
        self.fc3 = nn.Linear(fully_layer_2, 2)

    def forward(self, x):
        # print(x.shape)
        x = self.pool(F.relu(self.bn1(self.conv1(x))))
        # print(x.shape)
        x = self.pool(F.relu(self.bn2(self.conv2(x))))
        # print(x.shape)
        x = self.pool(F.relu(self.bn3(self.conv3(x))))
        # print(x.shape)
        x = self.pool(F.relu(self.bn4(self.conv4(x))))
        # print(x.shape)
        x = self.pool(F.relu(self.bn5(self.conv5(x))))
        # print(x.shape)

        x = x.view(-1, 32*5*5)
        x = F.dropout(F.relu(self.fc1(x)), self.drop_rate)
        x = F.dropout(F.relu(self.fc2(x)), self.drop_rate)
        x = self.fc3(x)

        return x

If that is the case, do you mean to use:

class CNNModel1(nn.Module):
    …include the model architecture as above…

model = model()
model.load_state_dict(torch.load("VA2.pth"))

Assuming that is OK, I suppose I next have to include some reference to the *.png images that are to be inferred (for convenience, they are all files in the same directory). How?

Thanks again
julio

Dwight_Foster · March 16, 2021, 1:14pm

Yes, I mean use the CNNModel1 code from models.py to recreate your model, and then load your state dict onto that model, like this:

class CNNModel1(nn.Module):
    # Code defining the model goes here: you need the init function and
    # the forward function, with all the code from models.py.

model = CNNModel1(PARAMETERS)
model.load_state_dict(torch.load("VA2.pth"))

where PARAMETERS are the ones you used to train your model.

juiocollm · March 16, 2021, 4:39pm

I made the changes you suggested and added, at the end of the script, the parameters I used for training:
model = CNNModel1(target_id="AB", fully_layer_1=512, fully_layer_2=256, learning_rate=0.001, batch_size=32, drop_rate=0.5, n_epoch=50, experiment_name="VA2")
model.load_state_dict(torch.load('VA2.pth'))

However, I got an immediate error, on just the first line of the code, as follows:

runfile('C:/- DOCKING/Macros/py/inouts/pthModels/pthModels.py', wdir='C:/- DOCKING/Macros/py/inouts/pthModels')
Traceback (most recent call last):

  File "C:\- DOCKING\Macros\py\inouts\pthModels\pthModels.py", line 15, in <module>
    class CNNModel1(nn.Module):

TypeError: str() argument 2 must be str, not tuple

It surprised me because it is the same line of code used by the training program!
What is going on?

Dwight_Foster · March 16, 2021, 4:46pm

I am not sure. Can you send the full code that you are using to load and define the model?

juiocollm · March 16, 2021, 5:05pm

This is my pthModels.py:

# -*- coding: utf-8 -*-
"""
Created on Mon Mar 15 17:00:06 2021

@author: JULIO
"""
import torch
from torch.utils.data import Dataset
from torch.utils.data.sampler import SubsetRandomSampler, BatchSampler, SequentialSampler
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from operator import itemgetter

class CNNModel1(nn.Module):
    def init(self, fully_layer_1, fully_layer_2, drop_rate):
        super(CNNModel1, self).init()

        self.conv1 = nn.Conv2d(3, 32, 2)
        self.bn1 = nn.BatchNorm2d(32)
        self.conv2 = nn.Conv2d(32, 64, 2)
        self.bn2 = nn.BatchNorm2d(64)
        self.conv3 = nn.Conv2d(64, 128, 2)
        self.bn3 = nn.BatchNorm2d(128)
        self.conv4 = nn.Conv2d(128, 64, 2)
        self.bn4 = nn.BatchNorm2d(64)
        self.conv5 = nn.Conv2d(64, 32, 2)
        self.bn5 = nn.BatchNorm2d(32)

        self.pool = nn.MaxPool2d(2, 2)
        self.drop_rate = drop_rate
        self.fc1 = nn.Linear(32*5*5, fully_layer_1)
        self.fc2 = nn.Linear(fully_layer_1, fully_layer_2)
        self.fc3 = nn.Linear(fully_layer_2, 2)

    def forward(self, x):
        # print(x.shape)
        x = self.pool(F.relu(self.bn1(self.conv1(x))))
        # print(x.shape)
        x = self.pool(F.relu(self.bn2(self.conv2(x))))
        # print(x.shape)
        x = self.pool(F.relu(self.bn3(self.conv3(x))))
        # print(x.shape)
        x = self.pool(F.relu(self.bn4(self.conv4(x))))
        # print(x.shape)
        x = self.pool(F.relu(self.bn5(self.conv5(x))))
        # print(x.shape)

        x = x.view(-1, 32*5*5)
        x = F.dropout(F.relu(self.fc1(x)), self.drop_rate)
        x = F.dropout(F.relu(self.fc2(x)), self.drop_rate)
        x = self.fc3(x)

        return x

model = CNNModel1(target_id="AB", fully_layer_1=512, fully_layer_2=256, learning_rate=0.001, batch_size=32, drop_rate=0.5, n_epoch=50, experiment_name="VA2")
model.load_state_dict(torch.load('VA2.pth'))

# Initialize model

#model = torch.load("VA2.pth")
#print(model)

Dwight_Foster · March 16, 2021, 5:07pm

Why are you passing so many things into CNNModel1? It only needs 3 parameters. You don't need the target id, learning rate, batch size, n epoch, or experiment name.
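
As an illustration of why the extra arguments matter, here is a small sketch that does not need PyTorch. `ThreeParamModel` is a hypothetical stand-in whose constructor takes the same three parameters as CNNModel1. Python rejects any keyword the constructor does not declare.

```python
class ThreeParamModel:
    """Hypothetical stand-in with the same constructor parameters as CNNModel1."""

    def __init__(self, fully_layer_1, fully_layer_2, drop_rate):
        self.sizes = (fully_layer_1, fully_layer_2)
        self.drop_rate = drop_rate


# Passing training-only settings fails before any layer is built.
try:
    ThreeParamModel(target_id="AB", fully_layer_1=512, fully_layer_2=256,
                    learning_rate=0.001, batch_size=32, drop_rate=0.5,
                    n_epoch=50, experiment_name="VA2")
except TypeError as exc:
    message = str(exc)

# Only the three architecture parameters are accepted.
model = ThreeParamModel(fully_layer_1=512, fully_layer_2=256, drop_rate=0.5)
```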

juiocollm · March 16, 2021, 5:03pm

Sure, I attach the “pthModels.py” I am trying to develop
I hope this way it will get to you…
please let me know…


juiocollm · March 17, 2021, 7:49am

As suggested, I removed the unneeded parameters. Those that remain are fully_layer_1=512, fully_layer_2=256 and drop_rate=0.5.

Nevertheless, the error log is still singing the same "song":

  File "C:\- DOCKING\Macros\py\inouts\pthModels\pthModels.py", line 15, in <module>
    class CNNModel1(nn.Module):

TypeError: str() argument 2 must be str, not tuple

What is argument 2?

Dwight_Foster · March 17, 2021, 12:06pm

Is that the full error? I am not sure what argument 2 is supposed to be.

juiocollm · March 17, 2021, 6:27pm

Yes, it is the only error detected by Spyder.

I tried:
class CNNModel1(str(nn.Module)):
but it gives the same error.

I read many different things about "TypeError: str() argument 2 must be str, not tuple", but none of them referred to "nn.Module" or "CNN"…
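
For reference, one situation that reliably produces this message, independent of PyTorch, is a class statement whose base is a string. Python then treats `str` as the metaclass and calls it with the class name, the tuple of bases and the class namespace, so the tuple lands in the second argument of `str()`. The sketch below reproduces that with a hypothetical class name. It explains why the `str(nn.Module)` attempt fails, but it does not establish what caused the original error in the poster's session.

```python
message = ""

try:
    # str(object) is the text "<class 'object'>", so the base here is a string.
    class Broken(str(object)):
        pass
except TypeError as exc:
    # Python called str("Broken", ("<class 'object'>",), {...}) to build the class.
    message = str(exc)
```

The exact wording of the message varies between Python versions, but it always complains that a tuple was passed where `str()` expected a string.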

juiocollm · March 17, 2021, 6:34pm

I also tried
class CNNModel1():

Now the same error no longer appears, but I get this instead:

runfile('C:/- DOCKING/Macros/py/inouts/pthModels/pthModels.py', wdir='C:/- DOCKING/Macros/py/inouts/pthModels')
Traceback (most recent call last):

  File "C:\- DOCKING\Macros\py\inouts\pthModels\pthModels.py", line 55, in <module>
    x=CNNModel1.forward()

TypeError: forward() missing 2 required positional arguments: 'self' and 'x'

Does that clarify the problem?
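
This last message is what Python reports when a method is called on the class itself rather than on an instance. A minimal sketch with a hypothetical `Doubler` class, no PyTorch required:

```python
class Doubler:
    """Hypothetical class with a forward method, for illustration only."""

    def forward(self, x):
        return 2 * x


# Calling the method on the class supplies neither self nor x.
try:
    Doubler.forward()
except TypeError as exc:
    message = str(exc)

# Calling it on an instance supplies self automatically; only x is passed.
result = Doubler().forward(21)
```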

Dwight_Foster · March 17, 2021, 9:04pm

It does, actually. You need to create the model first and then do the forward pass, like this:

model = CNNModel1(fully_layer_1, fully_layer_2, drop_rate)
model.load_state_dict(torch.load('VA2.pth'))
# and then the forward pass
output = model(input)

where you pass into CNNModel1 the same parameters you used for training.
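
In that snippet, `input` has to be a float tensor holding a batch of images. The following is a minimal sketch of the forward pass under several assumptions that are not stated in the thread: the class is written with `__init__`; the images are 200x200 RGB (one size consistent with the `32*5*5` input of `fc1`); and a random tensor stands in for real images, because the preprocessing used during DEEPScreen training is not shown here and would have to be matched. `predict` is a hypothetical helper name.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CNNModel1(nn.Module):
    """Same layers as the class posted in this thread, with __init__ spelled out."""

    def __init__(self, fully_layer_1, fully_layer_2, drop_rate):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, 2)
        self.bn1 = nn.BatchNorm2d(32)
        self.conv2 = nn.Conv2d(32, 64, 2)
        self.bn2 = nn.BatchNorm2d(64)
        self.conv3 = nn.Conv2d(64, 128, 2)
        self.bn3 = nn.BatchNorm2d(128)
        self.conv4 = nn.Conv2d(128, 64, 2)
        self.bn4 = nn.BatchNorm2d(64)
        self.conv5 = nn.Conv2d(64, 32, 2)
        self.bn5 = nn.BatchNorm2d(32)
        self.pool = nn.MaxPool2d(2, 2)
        self.drop_rate = drop_rate
        self.fc1 = nn.Linear(32 * 5 * 5, fully_layer_1)
        self.fc2 = nn.Linear(fully_layer_1, fully_layer_2)
        self.fc3 = nn.Linear(fully_layer_2, 2)

    def forward(self, x):
        x = self.pool(F.relu(self.bn1(self.conv1(x))))
        x = self.pool(F.relu(self.bn2(self.conv2(x))))
        x = self.pool(F.relu(self.bn3(self.conv3(x))))
        x = self.pool(F.relu(self.bn4(self.conv4(x))))
        x = self.pool(F.relu(self.bn5(self.conv5(x))))
        x = x.view(-1, 32 * 5 * 5)
        # F.dropout defaults to training=True, so dropout stays active here
        # even after model.eval(), exactly as in the posted code.
        x = F.dropout(F.relu(self.fc1(x)), self.drop_rate)
        x = F.dropout(F.relu(self.fc2(x)), self.drop_rate)
        return self.fc3(x)


def predict(model, batch):
    """Return one class index (0 or 1) per image in an (N, 3, 200, 200) float tensor."""
    model.eval()  # BatchNorm layers switch to their stored running statistics
    with torch.no_grad():  # gradients are not needed for inference
        logits = model(batch)  # shape (N, 2)
    return logits.argmax(dim=1)


model = CNNModel1(fully_layer_1=512, fully_layer_2=256, drop_rate=0.5)
# With the real file available, the trained weights would be loaded here:
# model.load_state_dict(torch.load("VA2.pth", map_location="cpu"))

# Random tensor standing in for four preprocessed 200x200 RGB images.
batch = torch.rand(4, 3, 200, 200)
predictions = predict(model, batch)
```

Because the posted `forward` calls `F.dropout` without a `training` argument, dropout remains active even after `model.eval()`, so repeated calls on the same image can give different outputs. Passing `training=self.training` to `F.dropout` is one way to change that, provided it matches how the model was validated.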

juiocollm · March 18, 2021, 8:23am

I did as you suggested, but I had to modify the head of the class to:
class CNNModel1():
    def init(self, fully_layer_1=512, fully_layer_2=256, drop_rate=0.5):

Then I added your lines of code at the end and obtained:
runfile('C:/- DOCKING/Macros/py/inouts/pthModels/pthModels.py', wdir='C:/- DOCKING/Macros/py/inouts/pthModels')
<__main__.CNNModel1 object at 0x000000522AA4D888>
Traceback (most recent call last):

  File "C:\- DOCKING\Macros\py\inouts\pthModels\pthModels.py", line 60, in <module>
    model.load_state_dict(torch.load('VA2.pth'))

AttributeError: 'CNNModel1' object has no attribute 'load_state_dict'

The more I read, the less I understand what I am doing, or why!

Dwight_Foster · March 18, 2021, 1:01pm

Can you send me the code you use to load it? You need to define the model first, like this:

model = CNNModel1(PARAMETERS)

where PARAMETERS are the ones needed for the model.

juiocollm · March 18, 2021, 4:18pm

Thank you, Dwight, for still being there.
This is my whole "best" code right now:

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset
from torch.utils.data.sampler import SubsetRandomSampler, BatchSampler, SequentialSampler
import torch.optim as optim
from operator import itemgetter

class CNNModel1():  # (nn.Module):
    def init(self, fully_layer_1=512, fully_layer_2=256, drop_rate=0.5):
        super(CNNModel1, self).init()

        self.conv1 = nn.Conv2d(3, 32, 2)
        self.bn1 = nn.BatchNorm2d(32)
        self.conv2 = nn.Conv2d(32, 64, 2)
        self.bn2 = nn.BatchNorm2d(64)
        self.conv3 = nn.Conv2d(64, 128, 2)
        self.bn3 = nn.BatchNorm2d(128)
        self.conv4 = nn.Conv2d(128, 64, 2)
        self.bn4 = nn.BatchNorm2d(64)
        self.conv5 = nn.Conv2d(64, 32, 2)
        self.bn5 = nn.BatchNorm2d(32)

        self.pool = nn.MaxPool2d(2, 2)
        self.drop_rate = drop_rate
        self.fc1 = nn.Linear(32*5*5, fully_layer_1)
        self.fc2 = nn.Linear(fully_layer_1, fully_layer_2)
        self.fc3 = nn.Linear(fully_layer_2, 2)

    def forward(self, x):
        # print(x.shape)
        x = self.pool(F.relu(self.bn1(self.conv1(x))))
        # print(x.shape)
        x = self.pool(F.relu(self.bn2(self.conv2(x))))
        # print(x.shape)
        x = self.pool(F.relu(self.bn3(self.conv3(x))))
        # print(x.shape)
        x = self.pool(F.relu(self.bn4(self.conv4(x))))
        # print(x.shape)
        x = self.pool(F.relu(self.bn5(self.conv5(x))))
        # print(x.shape)

        x = x.view(-1, 32*5*5)
        x = F.dropout(F.relu(self.fc1(x)), self.drop_rate)
        x = F.dropout(F.relu(self.fc2(x)), self.drop_rate)
        x = self.fc3(x)

        return x

model = CNNModel1(fully_layer_1=512, fully_layer_2=256, drop_rate=0.5)
model.load_state_dict(torch.load('VA2.pth'))
output = model(input)

Terminal answer:
runfile('C:/- DOCKING/Macros/py/inouts/pthModels/pthModels.py', wdir='C:/- DOCKING/Macros/py/inouts/pthModels')
Traceback (most recent call last):

  File "C:\- DOCKING\Macros\py\inouts\pthModels\pthModels.py", line 57, in <module>
    model.load_state_dict(torch.load('VA2.pth'))

AttributeError: 'CNNModel1' object has no attribute 'load_state_dict'

Dwight_Foster · March 18, 2021, 4:44pm

Oh, you need to include nn.Module in the class definition, like this:

class CNNModel1(nn.Module):

hopefully that won’t give the error again.
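
The reason is that `load_state_dict` is a method inherited from `nn.Module`; a class without that base simply does not have it. A small sketch with two hypothetical, otherwise empty classes:

```python
import torch.nn as nn


class PlainModel:
    """No base class, so none of nn.Module's methods are available."""


class TorchModel(nn.Module):
    """Inherits load_state_dict, state_dict, eval and others from nn.Module."""


plain_has_loader = hasattr(PlainModel(), "load_state_dict")
torch_has_loader = hasattr(TorchModel(), "load_state_dict")
```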

juiocollm · March 18, 2021, 7:01pm

Thank you!
However, when I did that, the old song came back again, sorry:

runfile('C:/- DOCKING/Macros/py/inouts/pthModels/pthModels.py', wdir='C:/- DOCKING/Macros/py/inouts/pthModels')
Traceback (most recent call last):

  File "C:\- DOCKING\Macros\py\inouts\pthModels\pthModels.py", line 15, in <module>
    class CNNModel1(nn.Module):

TypeError: str() argument 2 must be str, not tuple

Dwight_Foster · March 18, 2021, 7:07pm

OK, I think I see your problem. I kept overlooking it, but it is so simple. When you defined the model's init function, you forgot to add the two underscores on each side, like this:

class CNNModel1(nn.Module):

    def __init__(self, fully_layer_1=512, fully_layer_2=256, drop_rate=0.5):
        super(CNNModel1, self).__init__()

Hopefully that will fix it.
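
To illustrate the difference between `init` and `__init__`, here is a sketch with hypothetical class names that does not depend on PyTorch. Python only calls the double-underscore form automatically when an object is created; a method named `init` is an ordinary method and is never invoked by the constructor call.

```python
class WithoutUnderscores:
    def init(self, size):
        # Ordinary method: Python does not call it when the object is created.
        self.size = size


class WithUnderscores:
    def __init__(self, size):
        # Constructor hook: called automatically with the arguments given.
        self.size = size


# Without __init__, the class accepts no constructor arguments at all.
try:
    WithoutUnderscores(size=512)
except TypeError as exc:
    message = str(exc)

# With __init__, the argument reaches the method and is stored.
configured = WithUnderscores(size=512)
```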