Please note that **grad_output.zero_()** is in-place, and so is **grad_output[:, i-1] = 0**. In-place means "modify a tensor instead of returning a new one that has the modifications applied". An out-of-place example that zeroes out the column at index 1 (the second column) is as follows:

    >>> import torch
    >>> t = torch.randn(3, 3)
    >>> ixs = torch.arange(3, dtype=torch.int64)
    >>> zeroed = torch.where(ixs[None, :] == 1, torch.tensor(0.), t)
    >>> zeroed
    tensor([[-0.6616,  0.0000,  0.7329],
            [ 0.8961,  0.0000, -0.1978],
            [ 0.0798,  0.0000, -1.2041]])
    >>> t
    tensor([[-0.6616, -1.6422,  0.7329],
            [ 0.8961, -0.9623, -0.1978],
            [ 0.0798, -0.7733, -1.2041]])

Notice that **t** retains the values it had before, while **zeroed** holds the values you want.
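To make the difference easy to check, here is a minimal sketch that uses a fixed tensor instead of random values. The names `original`, `col_ids`, `keep`, `zeroed_where`, `zeroed_mul` and `u` are only illustrative, and the mask multiplication is an alternative I am adding for comparison, not something from the code above.

```python
import torch

# Deterministic input so the result is easy to verify by eye.
t = torch.arange(9, dtype=torch.float32).reshape(3, 3)
original = t.clone()  # snapshot used to check that t is never modified

# Out-of-place, variant 1: torch.where with a broadcast column mask.
col_ids = torch.arange(t.shape[1])
zeroed_where = torch.where(col_ids[None, :] == 1, torch.zeros_like(t), t)

# Out-of-place, variant 2: multiply by a 0/1 mask over the columns.
keep = (col_ids != 1).to(t.dtype)
zeroed_mul = t * keep

# Both variants return new tensors; t still holds its original values.
assert torch.equal(t, original)
assert zeroed_where.data_ptr() != t.data_ptr()

# In-place, for contrast: index assignment writes into the tensor itself.
u = t.clone()
u[:, 1] = 0
assert torch.equal(u, zeroed_where)
assert torch.equal(u, zeroed_mul)

print(t[:, 1])             # tensor([1., 4., 7.])
print(zeroed_where[:, 1])  # tensor([0., 0., 0.])
```

One difference between the two out-of-place variants: multiplying by zero leaves `nan` and `inf` entries as `nan`, while `torch.where` replaces them with a real zero, so `torch.where` is probably the safer choice if the tensor may contain such values.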