Wrap model with extra hidden layer

I have a pretrained transformer model that gets called like so:

outputs = model(input_ids, attention_mask=mask, token_type_ids=tokens)

I then take the outputs of that model

last_hidden_state = outputs.last_hidden_state

and perform some mathematical manipulations, such as the ones below

last_hidden_state_zero_layer = torch.mul(last_hidden_state, some_parameter)
logits = logit_layer(last_hidden_state_zero_layer)

My question is: is there a way to wrap the model plus the extra operations into a single PyTorch model? Pseudocode below:

model = pytorch_wrapper_to_create_one_model(
    outputs = model(input_ids, attention_mask=mask, token_type_ids=tokens),
    last_hidden_state_zero_layer = torch.mul(outputs.last_hidden_state, some_parameter),
    logits = logit_layer(last_hidden_state_zero_layer)
)

Here’s a code snippet:

import torch.nn as nn

class yourModel(nn.Module):
    def __init__(self, ...):
        super().__init__()  # must run before submodules are assigned
        self.backbone = previous_model(...)
        self.last_hidden_state_zero_layer = nn.Linear(...)
        self.logit_layer = nn.Linear(...)

    def forward(self, input):
        outputs = self.backbone(input, ...)
        # nn.Linear takes a single input tensor
        last_hidden = self.last_hidden_state_zero_layer(outputs.last_hidden_state)
        logit = self.logit_layer(last_hidden)
        return logit

Defining a class that holds the components as submodules is one possible approach.

Thanks, works like a charm.

So this did work, but now when I try to deploy it on SageMaker, it is unable to load the model. When I look at the type of the above model, it reads
type(yourModel) = __main__.Model
so I’m assuming this is because SageMaker doesn’t recognize it as a torch object. Do you know of a way to fix this?

Could you share the code? SageMaker isn’t anything special; it’s just an instance (a computer).

This is how I instantiate the model

val_to_col_elec_model = PyTorchModel(#model_data='s3://sagemaker-nl-filter-suggest-two/fields/pytorchmodel.tar.gz',

and this is how I deploy

nl_detector = val_to_col_elec_model.deploy(
                     initial_instance_count = 1,
                     instance_type = 'ml.t2.medium', endpoint_name = endpoint_name)

When I upload the model created with this code

class Model(torch.nn.Module):
    def __init__(self, model, logit_layer):
        super(Model, self).__init__()
        self.model = model
        #self.last_hidden_state_zero_layer = nn.Linear(...)
        self.logit_layer = logit_layer

    def forward(self, input_ids, mask, tokens):
        pre_logits_mask = torch.reshape(mask, (mask.shape[0], mask.shape[1], 1) )
        outputs = self.model(input_ids, attention_mask=mask, token_type_ids=tokens)
        last_hidden_state = outputs.last_hidden_state
        last_hidden_state_zero_layer = torch.mul(last_hidden_state, pre_logits_mask)
        summed_final_hidden_state = torch.sum(last_hidden_state_zero_layer, 1)
        logits = self.logit_layer(summed_final_hidden_state)
        probs = torch.sigmoid(logits)
        return probs
entire_model = Model(model, logit_layer)

It fails with a model-loading error. Again, I believe this is because it is of type __main__.Model, while the SageMaker container is expecting a torch object.

I tested this by uploading only the sub model that I wrapped in the above Model class, which is of type transformers.models.electra.modeling_electra.ElectraModel and it worked.

Also, the above models save and load perfectly fine when run locally. Let me know if anything needs clarification.

Doesn’t SageMaker need four functions for deployment?
I wrote input_fn, output_fn, model_fn and predict_fn when I built a service.
I don’t understand how ElectraModel ran successfully, but could you change the class name from Model to something different? I’m not at all sure it will help; it’s only a guess that the name Model conflicts with some SageMaker class.

Separately from the SageMaker issue, you don’t need to call .eval() on each submodule.
Calling entire_model.eval() is enough, since it also switches every submodule to evaluation mode.
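
To illustrate the point about .eval(), here is a minimal sketch. The Wrapper class and the layer sizes are made up for the example and are not the model from this thread; it only shows that one call on the outer module reaches the nested ones.

import torch.nn as nn


class Wrapper(nn.Module):
    """Hypothetical wrapper holding two submodules."""

    def __init__(self, backbone, head):
        super().__init__()
        self.backbone = backbone
        self.head = head

    def forward(self, x):
        return self.head(self.backbone(x))


# Stand-in backbone with a Dropout layer, which behaves differently in eval mode.
backbone = nn.Sequential(nn.Linear(4, 4), nn.Dropout(0.5))
wrapper = Wrapper(backbone, nn.Linear(4, 1))

# modules() walks the wrapper itself and every nested submodule.
training_before = [m.training for m in wrapper.modules()]
wrapper.eval()  # one call on the outer module
training_after = [m.training for m in wrapper.modules()]

After the call, the training flag is off for the wrapper, the backbone, and the Dropout layer inside it.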

Yes, I have all of the logic for the input_fn, output_fn, model_fn and predict_fn functions in the source directory. source_dir='value_to_column_source_dir' points to that logic.

I changed the name of the class to something other than Model and it still doesn’t work. I wonder if anyone else has tried deploying a PyTorch model created this way using SageMaker.

Please share the error messages from the logs.

So I’m using torch.load in my model_fn function, and this is the error I’m getting in CloudWatch. Also, I renamed class Model(torch.nn.Module) to class transformer_Model(torch.nn.Module)

2022-07-18 01:58:04,499 [INFO ] W-9000-model_1-stdout org.pytorch.serve.wlm.WorkerLifeCycle - AttributeError: Can't get attribute 'transformer_Model' on <module '__main__' from '/opt/conda/lib/python3.6/site-packages/ts/model_service_worker.py'>
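
An error like this is consistent with how pickle (which torch.save uses when given a whole model object) stores custom classes: it records the defining module and the class name, not the class body. The sketch below uses only the standard library, and TransformerModelStandIn is a hypothetical placeholder, not the class from this thread.

import pickle


class TransformerModelStandIn:
    """Hypothetical stand-in for a custom wrapper class."""

    def __init__(self):
        self.weights = [0.1, 0.2]


payload = pickle.dumps(TransformerModelStandIn())

# The pickle holds a reference (module name + class name), not the class source.
module_name = TransformerModelStandIn.__module__.encode()
class_name = TransformerModelStandIn.__qualname__.encode()
has_module_reference = module_name in payload
has_class_reference = class_name in payload
has_source_code = b"def __init__" in payload

When such a payload is loaded in another process, that process has to be able to find the same class under the same module name. A serving worker whose __main__ is a different script would not be able to, which matches the AttributeError in the log.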

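It turns out you have to save the state_dict instead of the entire model. Saving the entire model pickles a reference to the class under the module where it was defined (__main__ here), and the serving worker cannot resolve that reference. To use the custom model, import the class Model(torch.nn.Module), create an instance of it, and then call load_state_dict(torch.load('path')) on that instance to fill in the fine-tuned weights.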
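
A minimal sketch of that state_dict round trip follows. Everything named here is an assumption for illustration: WrappedModel is a small stand-in (an nn.Linear replaces the pretrained backbone), the file name model.pth is arbitrary, and model_fn is written with a single model_dir argument as I understand the SageMaker PyTorch convention to be. The class definition would need to be importable from the inference script.

import os
import tempfile

import torch
import torch.nn as nn


class WrappedModel(nn.Module):
    """Hypothetical stand-in for the custom wrapper class."""

    def __init__(self, hidden_size=8):
        super().__init__()
        # nn.Linear stands in for the pretrained backbone to keep the sketch small.
        self.backbone = nn.Linear(hidden_size, hidden_size)
        self.logit_layer = nn.Linear(hidden_size, 1)

    def forward(self, x):
        return torch.sigmoid(self.logit_layer(self.backbone(x)))


def model_fn(model_dir):
    """Rebuild the architecture in code, then load the saved weights into it."""
    model = WrappedModel()
    weights_path = os.path.join(model_dir, "model.pth")  # assumed file name
    model.load_state_dict(torch.load(weights_path, map_location="cpu"))
    model.eval()
    return model


trained = WrappedModel()
model_dir = tempfile.mkdtemp()
# Save only the parameter tensors, not the pickled class instance.
torch.save(trained.state_dict(), os.path.join(model_dir, "model.pth"))

restored = model_fn(model_dir)
example = torch.ones(2, 8)
with torch.no_grad():
    same_output = torch.allclose(trained.eval()(example), restored(example))

Because the class is constructed in code and only tensors are read from disk, loading no longer depends on which script happens to be __main__ in the serving process.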



Nicely done :)
That was a hard one to spot.