How to load part of a pre-trained model?

I used a pre-trained BERT model, added some linear layers for classification, and trained it. I have saved the model weights. Now I want to use only the BERT weights, not those of the classification layers, because I want the BERT output, which is 768-dimensional, computed with the trained weights. If I keep the classification layers, the output is reduced to 2 dimensions.

This is the model architecture that I used for training:

import torch.nn as nn
from transformers import BertModel

class BertClassifier(nn.Module):

    def __init__(self, dropout=0.1):

        super(BertClassifier, self).__init__()

        self.bert = BertModel.from_pretrained('bert-base-uncased')
        self.dropout = nn.Dropout(dropout)
        self.linear = nn.Linear(768, 1)
        self.relu = nn.Sigmoid()  # named relu, but this is a sigmoid activation

    def forward(self, b_input_ids, b_input_mask):

        # return_dict=False yields a tuple (sequence_output, pooled_output)
        _, pooled_output = self.bert(input_ids=b_input_ids, attention_mask=b_input_mask, return_dict=False)
        dropout_output = self.dropout(pooled_output)
        linear_output = self.linear(dropout_output)
        final_layer = self.relu(linear_output)

        return final_layer

I used the state_dict method to save the weights; before saving, I deleted the weights of the classification layers.

Then I created a new model:
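A minimal sketch of one possible way to save only the encoder weights, which is not necessarily how the deletion described above was done. It calls state_dict() on the bert submodule, so the classifier entries are never included and the keys carry no 'bert.' prefix. The tiny random config, the TinyClassifier class, and the file name are assumptions made so the sketch runs quickly without downloading 'bert-base-uncased'.

```python
import os
import tempfile

import torch
import torch.nn as nn
from transformers import BertConfig, BertModel

# Assumption: a tiny, randomly initialised config stands in for 'bert-base-uncased'.
tiny_config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=1,
                         num_attention_heads=2, intermediate_size=64)

class TinyClassifier(nn.Module):  # hypothetical stand-in for BertClassifier
    def __init__(self):
        super().__init__()
        self.bert = BertModel(tiny_config)
        self.linear = nn.Linear(tiny_config.hidden_size, 1)

trained = TinyClassifier()

# state_dict() of the submodule holds only encoder weights, with keys such as
# 'embeddings.word_embeddings.weight' rather than 'bert.embeddings...'.
bert_weights = trained.bert.state_dict()

path = os.path.join(tempfile.mkdtemp(), "bert_only.pt")  # hypothetical file name
torch.save(bert_weights, path)

# A bare BertModel built from the same config accepts these keys under the
# default strict loading.
new_model = BertModel(tiny_config)
new_model.load_state_dict(torch.load(path))
```

With the real model, the same pattern would be applied to the bert attribute of the trained BertClassifier instance.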

from transformers import AutoConfig, AutoModel

# Build the architecture from the config only; no pre-trained weights are loaded
config = AutoConfig.from_pretrained('bert-base-uncased')
model = AutoModel.from_config(config)

I used AutoConfig so that the model is created without any pre-trained weights, and I can then load my own trained weights into it.

When I try to load the weights into this new model, I get a RuntimeError about missing keys, followed by a long list of keys. I also printed the keys of the trained model and the new model, and they are not the same.

I tried several approaches, but none of them worked. I think the problem is that the keys of the new model do not match those of my saved model. Should I define the new model in another way? I don't know what to do.
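A sketch of how the key mismatch could be diagnosed and repaired at load time. It assumes the saved file still has its keys prefixed with 'bert.' (the attribute name in the wrapper class), which is a likely cause but not confirmed here. The tiny config and TinyClassifier are hypothetical stand-ins so that the example runs without a download.

```python
import torch.nn as nn
from transformers import AutoModel, BertConfig, BertModel

# Assumption: a tiny, randomly initialised config stands in for 'bert-base-uncased'.
tiny_config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=1,
                         num_attention_heads=2, intermediate_size=64)

class TinyClassifier(nn.Module):  # hypothetical stand-in for BertClassifier
    def __init__(self):
        super().__init__()
        self.bert = BertModel(tiny_config)
        self.linear = nn.Linear(tiny_config.hidden_size, 1)

# Assumed shape of the saved file: classifier entries deleted, 'bert.' prefix kept.
saved = {k: v for k, v in TinyClassifier().state_dict().items()
         if not k.startswith("linear.")}

new_model = AutoModel.from_config(tiny_config)

# strict=False reports mismatched keys instead of raising, which helps diagnosis.
report = new_model.load_state_dict(saved, strict=False)
print("missing:", len(report.missing_keys), "unexpected:", len(report.unexpected_keys))

# Strip the wrapper attribute name so the keys match the bare model.
prefix = "bert."
renamed = {(k[len(prefix):] if k.startswith(prefix) else k): v
           for k, v in saved.items()}
fixed = new_model.load_state_dict(renamed, strict=False)
print("missing:", len(fixed.missing_keys), "unexpected:", len(fixed.unexpected_keys))
```

Once both lists come back empty, the default strict loading should succeed with the renamed dictionary as well.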

Or should I save the whole model, load it, and remove the classification layers, so that I can pass my input through it under torch.no_grad and perhaps get the 768-dimensional BERT output computed with the trained weights? I am not sure which approach to use or how to do it.
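A sketch of this second option, in which the classification layers do not need to be removed: the full classifier is restored and only its bert submodule is called under torch.no_grad. The tiny config, TinyClassifier, and the random inputs are assumptions for illustration; with 'bert-base-uncased' the hidden size would be 768.

```python
import torch
import torch.nn as nn
from transformers import BertConfig, BertModel

# Assumption: a tiny, randomly initialised config stands in for 'bert-base-uncased'.
tiny_config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=1,
                         num_attention_heads=2, intermediate_size=64)

class TinyClassifier(nn.Module):  # hypothetical stand-in for BertClassifier
    def __init__(self):
        super().__init__()
        self.bert = BertModel(tiny_config)
        self.linear = nn.Linear(tiny_config.hidden_size, 1)

    def forward(self, input_ids, attention_mask):
        _, pooled = self.bert(input_ids=input_ids, attention_mask=attention_mask,
                              return_dict=False)
        return torch.sigmoid(self.linear(pooled))

full_model = TinyClassifier()
# In practice the full trained state_dict would be loaded into full_model here.
full_model.eval()  # disable dropout for inference

# Hypothetical batch: 2 sequences of 8 random token ids, no padding.
input_ids = torch.randint(0, tiny_config.vocab_size, (2, 8))
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    # Call the encoder directly and skip the classification head.
    _, pooled_output = full_model.bert(input_ids=input_ids,
                                       attention_mask=attention_mask,
                                       return_dict=False)
    class_output = full_model(input_ids, attention_mask)

print(pooled_output.shape, class_output.shape)
```

The encoder output keeps the hidden size as its last dimension, while the full forward pass reduces it to the classifier's output size.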

I would appreciate any response or help. Thank you!