Hi, I am exploring how to train the MNIST model in C++, save it, and then have a second C++ program load the file and use it for inference.
I changed Net to NetImpl as suggested, and saved the model with:
torch::save(model, "model.pt");
With both changes, I am able to compile and run the code and save the model. However, when I try to load the model from another C++ file, I get the error "'ScriptModule' object has no attribute 'forward'".
To simplify testing, I tried to load the model in Python:
Load with torch.load
import torch
# load by torch.load
model = torch.load('model.pt')
## Error loading
## RuntimeError: model_impl.pt is a zip archive (did you mean to use torch.jit.load()?)
Load by torch.jit.load
# Loaded successfully, but...
model = torch.jit.load('model.pt')
model.eval()
# Python output
ScriptModule(
  (conv1): ScriptModule()
  (conv2): ScriptModule()
  (conv2_drop): ScriptModule()
  (fc1): ScriptModule()
  (fc2): ScriptModule()
)
# When I try to run a forward pass, I get the following error.
output = model(torch.ones(1, 1, 28, 28))
## Error : AttributeError: 'ScriptModule' object has no attribute 'forward'
Hi, thanks for pointing this out. I thought the behavior was identical, but it seems it is not.
I tried the following simple lines:
#include <torch/torch.h>
struct NetImpl : torch::nn::Module {};
TORCH_MODULE(Net);
int main() {
  Net model;
  torch::load(model, "net.pt");
  auto in = torch::rand({1, 1, 28, 28});
  auto out = model->forward(in);
  std::cout << in << std::endl;
  std::cout << out << std::endl;
  return 0;
}
Compilation gives the following error:
error: 'struct NetImpl' has no member named 'forward'
auto out = model->forward(in);
I have to put the NetImpl definition that I used for training at the top of the code to make it work. It seems that torch::save only saves the parameters and not the network structure. Am I correct?
If so, is there any way to save everything in C++ so that I can call it directly?
Hi everyone,
I am facing the same problem. I created a simple CNN using the sequential implementation (torch::nn::SequentialImpl). I can call model->forward(Sometensor), but it crashes when I save the model and load it again with jit.
The root of the problem is unclear, since all I get is an unhandled exception.
Hi, I saw that you solved this problem with torch::save and torch::load. Could you share your example?
I was using the MNIST example. After saving the model with torch::save, I have to define the same model at the top of the other C++ file before I can use torch::load.
@tancl There are a few scenarios that C++/TorchScript serialization supports:
Save as C++ model, load using torch::load() in C++
Requirement: You need to have the same C++ model class definition available when you use torch::save and torch::load. The easiest way to achieve this is to put the model class definition in a common header file.
If you want to be able to debug the model in Python, the suggested way is to define the model in Python (and perform debugging), convert it to TorchScript, and then load the model using torch::jit::load in C++ (for details on this process, see https://pytorch.org/tutorials/advanced/cpp_export.html).
These are scenarios that C++/TorchScript serialization doesn't support:
Save as C++ model using torch::save, load using torch::jit::load in C++
Save as C++ model using torch::save, load using torch.load in Python
Save as C++ model using torch::save, load using torch.jit.load in Python
My intention is to integrate libtorch with Scilab, so that the user can define their own model (in any form, with different conv layers, etc.), pass it to a Scilab gateway (C++), and save it as a model file that carries the network architecture. The model would then be trained in a second C++ gateway and saved again with the trained parameters. Finally, the trained model would be loaded in a third C++ gateway for inference. I am not sure whether this can be done. Your explanation is consistent with the documentation, so I was trying my luck in case there is an undocumented way or idea for doing this.
@tancl Thanks a lot for the use case information, it's really helpful. I am thinking of two possible options:
Option 1 - Ask the user to define their C++ model outside of Scilab and in a common header file, and then all Scilab gateways can use this header file to find the definition of the C++ model.
Option 2 - Ask the user to define their TorchScript model outside of Scilab and serialize the model, and then the Scilab gateways can load this TorchScript model and run training / inference with it.
Please let me know if either of these two options would work.
I will have C++ gateways that call libtorch, compiled and linked so that they become native functions in Scilab. On the user's end, they will just call something like: trained_model = torch_train(data, target, model_arch, configs).
Option 1 - Correct me if I am wrong, but this would require recompiling the code every time a model is defined in the header, so it might not be flexible, unless the model could be passed as an input to the function.
Option 2 - I will explore this option further. I remember there were some changes in 1.1 and 1.2; I will try this on 1.2 and get back to you. My previous test on the MNIST example gave an error like "forward method not defined" at runtime; I will confirm this again.
Again, thanks for your detailed reply and useful suggestions.
I tried option 2 with the MNIST example. My steps are:
Building and training a model in Python (PyTorch) using the Python MNIST example, and saving it as TorchScript using the script compiler method.
# Using the example from https://github.com/pytorch/examples/tree/master/mnist/main.py with the following modification
if (args.save_model):
    my_model = torch.jit.script(model)
    my_model.save("mymodel.pt")
Using the model for inference works fine in C++:
#include <torch/torch.h>
#include <torch/script.h>
#include <iostream>
int main() {
  //Net model;
  torch::jit::script::Module model;
  std::string module_path = "mymodel.pt";
  model = torch::jit::load(module_path);
  // Create a vector of inputs.
  std::vector<torch::jit::IValue> inputs;
  inputs.push_back(torch::ones({1, 1, 28, 28}));
  // Execute the model and turn its output into a tensor.
  at::Tensor output = model.forward(inputs).toTensor();
  std::cout << output << std::endl;
  return 0;
}
However, I ran into difficulty when I tried to use the model for training in C++:
// Using the example from https://github.com/pytorch/examples/blob/master/cpp/mnist/mnist.cpp, with the Net definition block at the top of the code removed, and loading the model previously trained in Python with torch::jit::load:
//Net model;
//model.to(device);
torch::jit::script::Module model;
std::string module_path = "mymodel.pt";
model = torch::jit::load(module_path);
model.to(device);
auto train_dataset = torch::data::datasets::MNIST(kDataRoot)
.map(torch::data::transforms::Normalize<>(0.1307, 0.3081))
.map(torch::data::transforms::Stack<>());
const size_t train_dataset_size = train_dataset.size().value();
auto train_loader =
torch::data::make_data_loader<torch::data::samplers::SequentialSampler>(
std::move(train_dataset), kTrainBatchSize);
auto test_dataset = torch::data::datasets::MNIST(
kDataRoot, torch::data::datasets::MNIST::Mode::kTest)
.map(torch::data::transforms::Normalize<>(0.1307, 0.3081))
.map(torch::data::transforms::Stack<>());
const size_t test_dataset_size = test_dataset.size().value();
auto test_loader = torch::data::make_data_loader(std::move(test_dataset), kTestBatchSize);
torch::optim::SGD optimizer(model.parameter, torch::optim::SGDOptions(0.01).momentum(0.5));
for (size_t epoch = 1; epoch <= kNumberOfEpochs; ++epoch) {
train(epoch, model, device, *train_loader, optimizer, train_dataset_size);
test(model, device, *test_loader, test_dataset_size);
}
I get the error below:
error: 'struct torch::jit::script::Module' has no member named 'parameter'; did you mean 'set_parameter'?
torch::optim::SGD optimizer(model.parameter, torch::optim::SGDOptions(0.01).momentum(0.5));
I read the torch::jit documentation, and it did mention that defining nn.parameters could save the attributes. However, how do I make this possible in the model definition, replacing nn.Conv2d and nn.Linear?
But I got the error:
terminate called after throwing an instance of 'c10::Error'
what(): torch::jit::load() received a file from torch.save(), but torch::jit::load() can only load files produced by torch.jit.save() (load at …/torch/csrc/jit/serialization/import.cpp:285)
Why is that? And what should I do to solve the issue? Thanks in advance.
@yf225 I have a similar question: how can I take a model trained in PyTorch and then fine-tune it with libtorch on a C++-based device, given that I have already defined the same model in C++?