(libtorch) How to save model in MNIST cpp example?

Hengd · January 9, 2019, 9:35am

I’m running the MNIST example and trying to save the trained model to disk:

torch::save(model, "model.pt");   // save model using torch::save

Then I got this error:

In file included from /home/christding/env/libtorch/include/torch/csrc/api/include/torch/all.h:8:0,
                 from /home/christding/env/libtorch/include/torch/csrc/api/include/torch/torch.h:3,
                 from /home/christding/Desktop/workspace/TorchLearning/C++/mnist-custom/src/mnist.cpp:1:
/home/christding/env/libtorch/include/torch/csrc/api/include/torch/serialize.h: In instantiation of ‘void torch::save(const Value&, SaveToArgs&& ...) [with Value = Net; SaveToArgs = {const char (&)[9]}]’:
/home/christding/Desktop/workspace/TorchLearning/C++/mnist-custom/src/mnist.cpp:171:32:   required from here
/home/christding/env/libtorch/include/torch/csrc/api/include/torch/serialize.h:39:11: error: no match for ‘operator<<’ (operand types are ‘torch::serialize::OutputArchive’ and ‘const Net’)
   archive << value;
   ~~~~~~~~^~~~~~~~
... 
...

make[2]: *** [CMakeFiles/mnist.dir/src/mnist.cpp.o] Error 1
make[1]: *** [CMakeFiles/mnist.dir/all] Error 2
make: *** [all] Error 2

Does anyone know how to serialize the trained model?

Hengd · January 11, 2019, 4:56am

Solved it myself. To save the trained model, we need to create a torch::serialize::OutputArchive object as follows:

std::string model_path = "model.pt";
torch::serialize::OutputArchive output_archive;
model.save(output_archive);
output_archive.save_to(model_path);

Carsten_Ditzel · January 11, 2019, 9:03am

that seems like an overcomplicated way oO

SantMan · April 5, 2019, 7:29am

Hi @Hengd,

I was trying this suggested piece of code.

Save seems to work fine. But are you able to load the saved model? I’m trying something like this:

  // Trying to save the model.
  std::string model_path = "test_model.pt";
  torch::serialize::OutputArchive output_archive;
  seqConvLayer->save(output_archive);
  output_archive.save_to(model_path);

  // Trying to load the previously saved model
  torch::serialize::InputArchive archive;
  std::string file("test_model.pt");
  archive.load_from(file);
  torch::nn::Sequential savedSeq;
  savedSeq->load(archive);

  auto parameters = savedSeq->named_parameters();
  auto keys = parameters.keys();
  auto vals = parameters.values();
  
  for(auto v: keys) {
    std::cout << v << "\n"; 
  }         

  std::cout << "Saved Model:\n\n";
  std::cout << c10::str(savedSeq) << "\n\n";

Here is the output I’m getting.

Saved Model:

torch::nn::Sequential

Whereas I’m expecting output similar to the one shown below.

Model:

torch::nn::Sequential(
  (0): torch::nn::Conv2d(input_channels=3, output_channels=16, kernel_size=[3, 3], stride=[1, 1])
  (1): ReLu
  (2): torch::max_pool2d(x, {2, 2})
  (3): torch::nn::Conv2d(input_channels=16, output_channels=32, kernel_size=[3, 3], stride=[1, 1])
  (4): ReLu
  (5): torch::max_pool2d(x, {2, 2})
  (6): torch::nn::Conv2d(input_channels=32, output_channels=64, kernel_size=[3, 3], stride=[1, 1])
  (7): ReLu
  (8): torch::max_pool2d(x, {2, 2})
  (9): Flatten
  (10): torch::nn::Dropout(rate=0.25)
  (11): torch::nn::Linear(in=1024, out=500, with_bias=true)
  (12): ReLu
  (13): torch::nn::Dropout(rate=0.25)
  (14): torch::nn::Linear(in=500, out=10, with_bias=true)
  (15): torch::log_softmax(x, dim=1)
)

Can you please let me know what I’m doing wrong while loading the model?

Thanks and Regards,
Santhosh

mhubii · April 5, 2019, 8:13am

I think the example was written prior to the stable release of libtorch. The way you would implement the torch::nn::Module now is as follows:

struct NetImpl : torch::nn::Module {       // replaced Net by NetImpl
  NetImpl()                                // replaced Net by NetImpl
      : conv1(torch::nn::Conv2dOptions(1, 10, /*kernel_size=*/5)),
        conv2(torch::nn::Conv2dOptions(10, 20, /*kernel_size=*/5)),
        fc1(320, 50),
        fc2(50, 10) {
    register_module("conv1", conv1);
    register_module("conv2", conv2);
    register_module("conv2_drop", conv2_drop);
    register_module("fc1", fc1);
    register_module("fc2", fc2);
  }

  torch::Tensor forward(torch::Tensor x) {
    x = torch::relu(torch::max_pool2d(conv1->forward(x), 2));
    x = torch::relu(
        torch::max_pool2d(conv2_drop->forward(conv2->forward(x)), 2));
    x = x.view({-1, 320});
    x = torch::relu(fc1->forward(x));
    x = torch::dropout(x, /*p=*/0.5, /*training=*/is_training());
    x = fc2->forward(x);
    return torch::log_softmax(x, /*dim=*/1);
  }

  torch::nn::Conv2d conv1;
  torch::nn::Conv2d conv2;
  torch::nn::FeatureDropout conv2_drop;
  torch::nn::Linear fc1;
  torch::nn::Linear fc2;
};

TORCH_MODULE(Net); // creates module holder for NetImpl

TORCH_MODULE(Net) creates a module holder named Net, which wraps a std::shared_ptr<NetImpl>. This will enable you to call

torch::save(model, "model.pt");

and

torch::load(model, "model.pt");

You then need to replace calls of the form model.function() with model->function(). This should also enable you to call model(input) instead of model.forward(input).

As can be read in the DCGAN Tutorial:

For example, the serialization API ( torch::save and torch::load ) only supports module holders (or plain shared_ptr )
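
To illustrate why the calls change from model.function() to model->function(), here is a simplified sketch of the holder idea in plain C++. It does not use libtorch and is not the real ModuleHolder implementation; ToyImpl, Toy and shared_state_demo are made-up names for illustration only:

```cpp
#include <cassert>
#include <memory>
#include <string>

// Stand-in for a module implementation such as NetImpl.
struct ToyImpl {
  int forward(int x) const { return x + bias; }
  std::string name() const { return "ToyImpl"; }
  int bias = 1;
};

// Simplified stand-in for what TORCH_MODULE(Toy) provides: a small class
// that owns a std::shared_ptr<ToyImpl> and forwards to it.
class Toy {
 public:
  Toy() : impl_(std::make_shared<ToyImpl>()) {}

  // Members of the implementation are reached through operator->.
  ToyImpl* operator->() const {
    assert(impl_ != nullptr);
    return impl_.get();
  }

  // operator() forwards to forward(), so toy(x) behaves like toy->forward(x).
  int operator()(int x) const { return impl_->forward(x); }

 private:
  std::shared_ptr<ToyImpl> impl_;
};

// Copies of the holder share one implementation object.
int shared_state_demo() {
  Toy a;
  Toy b = a;     // copies the pointer, not the ToyImpl
  b->bias = 10;  // the change is visible through `a` as well
  return a(5);   // 5 + 10
}
```

Because the holder only stores a pointer, copying it is cheap and both copies refer to the same module, which is the behaviour the serialization functions rely on.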

SantMan · April 5, 2019, 10:36am

@mhubii,

Thanks for the response. Is it really required to have a struct/class and TORCH_MODULE(Net);? Is it not possible to store the torch::nn::Sequential directly as below?

 torch::nn::Sequential seqConvLayer(torch::nn::Conv2d(torch::nn::Conv2dOptions(3, 16, 3).padding(1)),
     ReLu(),
     MaxPool2d(),
     torch::nn::Conv2d(torch::nn::Conv2dOptions(16, 32, 3).padding(1)),
     ReLu(),
     MaxPool2d(),
     torch::nn::Conv2d(torch::nn::Conv2dOptions(32, 64, 3).padding(1)),
     ReLu(),
     MaxPool2d(),
     Flatten(),
     DropOut(0.25, true),
     torch::nn::Linear(64 * 4 * 4, 500),
     ReLu(),
     DropOut(0.25, true),
     torch::nn::Linear(500, 10),
     LogSoftMax()
     );
 TORCH_MODULE(seqConvLayer);

It throws an error during compilation at TORCH_MODULE(seqConvLayer);

Is there a way to save a Sequential model and load it again?

mhubii · April 5, 2019, 1:39pm

Now I understand your confusion. A torch::nn::Sequential already implements this for you. Go ahead and check out its implementation. There you will find the lines

/// A `ModuleHolder` subclass for `SequentialImpl`.
/// See the documentation for `SequentialImpl` class to learn what methods it
/// provides, or the documentation for `ModuleHolder` to learn about PyTorch's
/// module storage semantics.
TORCH_MODULE(Sequential);

In other words, to save or load a Sequential, just do

// Just a simple example Sequential
auto MySequential = torch::nn::Sequential(
    torch::nn::Conv2d(1 /*input channels*/, 1 /*output channels*/, 2 /*kernel size*/),
    torch::nn::Conv2d(1, 1, 2));

// Save the model
torch::save(MySequential, "model.pt");

// Load the model
torch::load(MySequential, "model.pt");

SantMan · April 5, 2019, 4:53pm

@mhubii,

Thanks again for the response. I did try as you suggested before posting my problem here. Here it is again.

  // Saving the model.
  std::string model_path = "new_test.pt";
  torch::save(seqConvLayer, model_path);

  // Loading the model.
  std::string file = "new_test.pt";
  torch::nn::Sequential savedSeq;
  torch::load(savedSeq, file);
  std::cout << "Saved Model:\n\n";
  std::cout << c10::str(savedSeq) << "\n\n";
  return 0;

With this piece of code I’m expecting the Sequential model to be loaded appropriately, and I’m trying to verify this by printing its contents as shown in the code below.

  std::cout << "Saved Model:\n\n";
  std::cout << c10::str(savedSeq) << "\n\n";

But this doesn’t print the model at all! Here is the output I’m getting after loading the stored model.

Saved Model:

torch::nn::Sequential

Whereas I’m expecting the output to look something like what is shown below.

torch::nn::Sequential(
  (0): torch::nn::Conv2d(input_channels=3, output_channels=16, kernel_size=[3, 3], stride=[1, 1])
  (1): ReLu
  (2): torch::max_pool2d(x, {2, 2})
  (3): torch::nn::Conv2d(input_channels=16, output_channels=32, kernel_size=[3, 3], stride=[1, 1])
  (4): ReLu
  (5): torch::max_pool2d(x, {2, 2})
  (6): torch::nn::Conv2d(input_channels=32, output_channels=64, kernel_size=[3, 3], stride=[1, 1])
  (7): ReLu
  (8): torch::max_pool2d(x, {2, 2})
  (9): Flatten
  (10): torch::nn::Dropout(rate=0.25)
  (11): torch::nn::Linear(in=1024, out=500, with_bias=true)
  (12): ReLu
  (13): torch::nn::Dropout(rate=0.25)
  (14): torch::nn::Linear(in=500, out=10, with_bias=true)
  (15): torch::log_softmax(x, dim=1)
)

I am not sure whether the model is loaded properly or not. And why is the loaded model not printed as expected?

mhubii · April 5, 2019, 5:12pm

Why would you expect that output? This is how you access the parameters of a network:

auto MySequential = torch::nn::Sequential(
    torch::nn::Conv2d(1 /*input channels*/, 1 /*output channels*/, 2 /*kernel size*/),
    torch::nn::Conv2d(1, 1, 2));

// Save the model
torch::save(MySequential, "model.pt");

// Load the model
torch::load(MySequential, "model.pt");

for (auto& p : MySequential->named_parameters()) {

	// Access key.
	std::cout << p.key() << std::endl;

	// Access value.
	std::cout << p.value() << std::endl;
}

SantMan · April 5, 2019, 5:23pm

Hi @mhubii

I’m expecting some output to be displayed on the console so that I can be sure the model is loaded fine.

But I’m not getting any output for p.key() and p.value(), which makes me think that loading the model with torch::load() didn’t work as expected.

If I’m not asking too much, could you please try out the MNIST example and try to store and load the model we are discussing here? Storing works fine: I can see that the file is generated, and it is around 2MB in size. But I’m having difficulty with torch::load(). I am not able to verify whether the model is loaded from the stored file or not.

mhubii · April 5, 2019, 7:27pm

Most likely your problem is that you are trying to load a model into something that has not been defined correctly. You can’t just declare torch::nn::Sequential savedSeq; and then try to load a model into it. You need to add the same layers that your saved model had. Your model simply has no parameters; that’s why savedSeq->named_parameters() is empty and nothing is printed to your console.

The code snippet that I posted above saves the Sequential and then loads it again. As you can see, it outputs the values correctly.

SantMan · April 6, 2019, 1:42pm

@mhubii,

I got my mistake now! Thanks a ton for helping me figure it out.

Cheers,
SantMan

SantMan · April 7, 2019, 5:21am

mhubii:

struct NetImpl : torch::nn::Module {       // replaced Net by NetImpl
  NetImpl()                                // replaced Net by NetImpl
      : conv1(torch::nn::Conv2dOptions(1, 10, /*kernel_size=*/5)),
        conv2(torch::nn::Conv2dOptions(10, 20, /*kernel_size=*/5)),
        fc1(320, 50),
        fc2(50, 10) {
    register_module("conv1", conv1);
    register_module("conv2", conv2);
    register_module("conv2_drop", conv2_drop);
    register_module("fc1", fc1);
    register_module("fc2", fc2);
  }

  torch::Tensor forward(torch::Tensor x) {
    x = torch::relu(torch::max_pool2d(conv1->forward(x), 2));
    x = torch::relu(
        torch::max_pool2d(conv2_drop->forward(conv2->forward(x)), 2));
    x = x.view({-1, 320});
    x = torch::relu(fc1->forward(x));
    x = torch::dropout(x, /*p=*/0.5, /*training=*/is_training());
    x = fc2->forward(x);
    return torch::log_softmax(x, /*dim=*/1);
  }

  torch::nn::Conv2d conv1;
  torch::nn::Conv2d conv2;
  torch::nn::FeatureDropout conv2_drop;
  torch::nn::Linear fc1;
  torch::nn::Linear fc2;
};

TORCH_MODULE(Net);

@mhubii,

I’m trying to implement a simplified version of the code you posted above. I’m facing some issues. Can you please have a look at this post and let me know what I’m missing?