I want to optimize two variables, one of which is binary. It is not a probability vector for binary classification, but an input to my network that must contain only the values 0 and 1. How can I achieve this? This is what I am doing at the moment, but I get the error `RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.` The binary variable is `mask`.

```
mask = torch.ones(32, 32)
mask.requires_grad = True
for epoch in range(epochs):
    for d, t in dataloader:
        optimizer.zero_grad()
        output = net(mask*d)
        loss = criterion(output, t)
        loss.backward()
        optimizer.step()
        mask.data[mask >= mask.float().mean()] = torch.Tensor([1])
        mask.data[mask < mask.float().mean()] = torch.Tensor([0])
```

I would like `mask` to be binary (only 0 and 1 values) before each forward pass. What is the best way to do this?
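To make the goal concrete, here is a minimal sketch of the behavior I am after. It is only an illustration under assumptions: the `nn.Linear` model, the random data, the SGD settings, and the helper name `binarize_` are all hypothetical stand-ins, not my real setup. The idea is to compute the threshold once and overwrite the leaf tensor inside `torch.no_grad()`, so the in-place write is not tracked by autograd.

```python
import torch
import torch.nn as nn

def binarize_(mask: torch.Tensor) -> torch.Tensor:
    """Threshold `mask` at its mean, in place (hypothetical helper)."""
    with torch.no_grad():
        # Compute the threshold once, before any entry is overwritten.
        threshold = mask.mean()
        mask.copy_((mask >= threshold).to(mask.dtype))
    return mask

torch.manual_seed(0)
net = nn.Linear(32, 4)        # stand-in for the real network (assumption)
criterion = nn.MSELoss()      # stand-in loss (assumption)
mask = torch.ones(32, 32, requires_grad=True)
optimizer = torch.optim.SGD([mask, *net.parameters()], lr=0.1)

# Random batches standing in for the real dataloader (assumption).
data = [(torch.randn(32, 32), torch.randn(32, 4)) for _ in range(3)]

for d, t in data:
    optimizer.zero_grad()
    loss = criterion(net(mask * d), t)
    loss.backward()
    optimizer.step()
    binarize_(mask)           # mask holds only 0s and 1s before the next forward pass
```

I am not sure this optimizes well, since the continuous values are thrown away after every step; if there is a better-suited technique (I have seen straight-through estimators mentioned for this kind of problem), I would be glad to hear about it.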

Thanks!