This is how I instantiate the model:
val_to_col_elec_model = PyTorchModel(
    # model_data='s3://sagemaker-nl-filter-suggest-two/fields/pytorchmodel.tar.gz',
    model_data='s3://natty-language-query/value_to_cols_model/model_.tar.gz',
    role=role,
    entry_point='torchserve.py',
    source_dir='value_to_column_source_dir',
    framework_version='1.6.0',
    py_version='py36',
)
and this is how I deploy it:
nl_detector = val_to_col_elec_model.deploy(
    serverless_inference_config=serverless_config,
    initial_instance_count=1,
    serializer=JSONSerializer(),
    deserializer=JSONDeserializer(),
    instance_type='ml.t2.medium',
    endpoint_name=endpoint_name,
)
When I upload the model created with this code:
class Model(torch.nn.Module):
    def __init__(self, model, logit_layer):
        super(Model, self).__init__()
        self.model = model
        self.model.eval()
        # self.last_hidden_state_zero_layer = nn.Linear(...)
        self.logit_layer = logit_layer
        self.logit_layer.eval()

    def forward(self, input_ids, mask, tokens):
        outputs = self.model(input_ids, attention_mask=mask, token_type_ids=tokens)
        last_hidden_state = outputs.last_hidden_state
        # Zero out hidden states at padded positions, then sum-pool over tokens
        pre_logits_mask = torch.reshape(mask, (mask.shape[0], mask.shape[1], 1))
        last_hidden_state_zero_layer = torch.mul(last_hidden_state, pre_logits_mask)
        summed_final_hidden_state = torch.sum(last_hidden_state_zero_layer, 1)
        logits = self.logit_layer(summed_final_hidden_state)
        probs = torch.sigmoid(logits)
        return probs


entire_model = Model(model, logit_layer)
it runs into a model-not-loading error. Again, I believe this is because the saved object is of type `__main__.Model`, a custom class the SageMaker container can't resolve when it unpickles the model, whereas the container is expecting a standard torch object.
I tested this by uploading only the sub-model that I wrapped in the above Model class, which is of type transformers.models.electra.modeling_electra.ElectraModel, and that worked.
Also, the above models save and load perfectly fine when run locally. Let me know if anything needs clarification.
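For context on why I suspect the class type: if I understand pickle's behavior correctly, saving a whole module with torch.save pickles the custom class by reference (its defining module and qualified name), not its definition, so the container must be able to import `__main__.Model` at load time. A minimal sketch of that behavior using plain pickle (no torch; the `Wrapper` class is just an illustrative stand-in for my Model wrapper):

```python
import pickle

class Wrapper:  # stand-in for the custom Model(torch.nn.Module) wrapper
    def __init__(self, x):
        self.x = x

obj = Wrapper(42)
data = pickle.dumps(obj)

# The pickle stream records only a reference to the class
# (module path + class name), not the class body, so unpickling
# fails anywhere Wrapper cannot be imported.
assert b"Wrapper" in data

# Round-tripping works here only because Wrapper is defined
# in the current process.
restored = pickle.loads(data)
print(restored.x)
```

If that's right, I assume the usual workaround would be saving `entire_model.state_dict()` and reconstructing the Model class inside torchserve.py's model_fn, so the class definition travels with the entry point rather than inside the pickle, but I'd like confirmation.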