Verilog / Hardware acceleration for nn.Conv2d

I was wondering whether it is possible to replace nn.Conv2d with a function implemented in Verilog for hardware acceleration. I’m trying to run this on a Windows PC via simulation, and I already have a Verilog file, convolver.v, written. Also, if Verilog is not an option, is there another FPGA or hardware-accelerated implementation that could replace nn.Conv2d or nn.ReLU to reduce inference time on Windows 10, without an FPGA or any other boards/hardware?