In the Neural Style Transfer tutorial, what do you change to use data parallelism?

In the tutorial itself the author uses

model = model.cuda()

which only uses one GPU. When I change that to

model = nn.DataParallel(model)

I get the error:

AttributeError: 'StyleLoss' object has no attribute 'loss'
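
My guess at the cause, which I have not confirmed: with more than one GPU, nn.DataParallel runs forward() on per-device replicas of the model, so an attribute that StyleLoss sets on itself inside forward() ends up on a replica and never on the module I hold a reference to. Below is a minimal sketch of that mechanism in plain Python. StyleLossSketch and replicated_forward are hypothetical stand-ins I made up for illustration, and copy.copy only imitates replication; this is not what PyTorch actually runs.

```python
import copy


class StyleLossSketch:
    """Hypothetical stand-in for the tutorial's StyleLoss; not the real class."""

    def forward(self, x):
        # The loss is stored on the object as a side effect of forward().
        self.loss = sum(v * v for v in x)
        return x


def replicated_forward(module, x):
    # Assumption: imitates DataParallel calling forward() on a replica
    # rather than on the original module object.
    replica = copy.copy(module)
    return replica.forward(x)


original = StyleLossSketch()
replicated_forward(original, [1.0, 2.0])

# Only the short-lived replica received .loss, so the original lacks it.
has_loss = hasattr(original, "loss")
```

If that is what happens, reading .loss from the original module afterwards would fail in the way shown above, and returning the loss from forward() instead of storing it as an attribute might avoid the problem.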

The data parallelism tutorial shows that you wrap the model in nn.DataParallel like so:

model = Model(input_size, output_size)
if torch.cuda.device_count() > 1:
    print("Let's use", torch.cuda.device_count(), "GPUs!")
    model = nn.DataParallel(model)
if torch.cuda.is_available():