Upsampling EfficientNet (in Encoder-decoder ⌛ arch)

I’m trying to re-implement SANET, an encoder-decoder style network for arbitrary image stylization.

The problem is, these guys use 2 freakin’ VGG19s for inference. I’m trying to replace them with much more efficient nets (pun intended) so it can run on mobile devices. I’ve already replaced the encoder VGG with no problems, but what about the decoder?

Is there an equivalent, high-performance, upsampling CNN architecture that performs just as well as the inverted VGG19 decoder, but is a lot faster and smaller? (Sadly, U-Nets are out of the question for this particular project.)

I just need some pointers — related papers, or anything showing an inverted-EfficientNet-style decoder. :confused: The framework used in the paper doesn’t matter either.
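In case it helps frame the question: what I’m imagining is roughly an MBConv block run “in reverse”, i.e. upsample first, then an inverted residual (pointwise expand → depthwise conv → pointwise project), mirroring EfficientNet’s building block. This is just my own illustrative sketch in PyTorch, not anything from the SANET paper — the class name, expansion ratio, and channel counts are all made up:

```python
import torch
import torch.nn as nn


class UpsampleMBConv(nn.Module):
    """Hypothetical decoder block: nearest-neighbor upsample followed by an
    MBConv-style inverted residual (expand -> depthwise -> project)."""

    def __init__(self, in_ch, out_ch, expand=4):
        super().__init__()
        mid = in_ch * expand
        self.up = nn.Upsample(scale_factor=2, mode="nearest")
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, mid, 1, bias=False),  # pointwise expansion
            nn.BatchNorm2d(mid),
            nn.SiLU(),
            # depthwise 3x3 (groups == channels)
            nn.Conv2d(mid, mid, 3, padding=1, groups=mid, bias=False),
            nn.BatchNorm2d(mid),
            nn.SiLU(),
            nn.Conv2d(mid, out_ch, 1, bias=False),  # pointwise projection
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        return self.block(self.up(x))


# e.g. a 512-channel relu4_1-sized feature map -> 256 channels at 2x resolution
x = torch.randn(1, 512, 32, 32)
y = UpsampleMBConv(512, 256)(x)
print(tuple(y.shape))  # (1, 256, 64, 64)
```

Basically: does something like this exist in the literature with properly tuned widths/depths, or would I have to design and ablate the decoder myself?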

Thank you so much. :heart: