How to find NaN/Inf Tensors?

I’m currently getting the message "Warning: NaN or Inf found in input", but I can’t find where it comes from.

I tried the following:

    if self.args.debug_nan:
        torch.autograd.set_detect_anomaly(True)
        np.seterr(all="raise")
    if self.args.warningonlycheck:
        import warnings
        warnings.simplefilter("error")
        warnings.simplefilter("ignore", DeprecationWarning)

I also tried isnan(tensor).any(), but still couldn’t find where the NaN/Inf occurs.
The loss doesn’t seem to explode.

Is there another way to detect them?
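
For reference, one thing worth noting about the isnan(tensor).any() attempt: it only catches NaN, while the warning also covers Inf. A minimal sketch of a check that reports both is below. The helper name `find_nonfinite` and the toy `nn.Linear` model are made up for illustration and are not part of the original code; only `torch.isnan`, `torch.isinf`, and `named_parameters()` are standard PyTorch calls.

```python
import torch
from torch import nn


def find_nonfinite(named_tensors):
    """Return (name, kind) pairs for floating-point tensors holding NaN or Inf.

    `named_tensors` is any iterable of (name, tensor) pairs, for example
    `model.named_parameters()`. This helper is hypothetical, not a library API.
    """
    bad = []
    for name, tensor in named_tensors:
        # Skip missing entries (e.g. gradients not yet computed) and integer tensors.
        if tensor is None or not torch.is_floating_point(tensor):
            continue
        if torch.isnan(tensor).any():
            bad.append((name, "nan"))
        elif torch.isinf(tensor).any():
            bad.append((name, "inf"))
    return bad


# Toy model with values corrupted on purpose, to show what the report looks like.
model = nn.Linear(3, 2)
with torch.no_grad():
    model.weight[0, 0] = float("nan")
    model.bias[0] = float("inf")

report = find_nonfinite(model.named_parameters())
print(report)
```

The same helper could be pointed at gradients after `backward()` by passing `((name, p.grad) for name, p in model.named_parameters())`, or at any other named values you log.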

We’ve often seen this warning when logging things to TensorBoard: it shows up when the variance of a single scalar (such as a one-dimensional bias weight) is computed. My guess is that this is what’s happening here, though I don’t know off the top of my head how to confirm it. Maybe disable TensorBoard for a run and see whether the warning goes away? We fixed this a few months ago (https://github.com/allenai/allennlp/issues/3116), though I don’t know whether there has been a new release of the code since then.
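
A rough way to check this guess, sketched under assumptions: the default (sample) variance of a one-element tensor in PyTorch is NaN, so any parameter with a single element is a candidate source when its statistics are logged. The helper `single_element_parameters` and the toy model below are hypothetical; this does not reproduce the actual logging code, only the arithmetic behind the guess.

```python
import math
import warnings

import torch
from torch import nn


def single_element_parameters(model):
    """List names of parameters with exactly one element (hypothetical helper)."""
    return [name for name, p in model.named_parameters() if p.numel() == 1]


# Toy model: a linear layer with one output unit has a one-element bias.
model = nn.Linear(4, 1)
candidates = single_element_parameters(model)
print(candidates)

# The sample variance divides by (n - 1), which is 0 for a single element,
# so the result is NaN. Newer PyTorch versions also emit a warning here,
# which is silenced to keep the output clean.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    variance_of_one = torch.tensor([0.5]).var().item()

print(math.isnan(variance_of_one))
```

If the names printed for your own model match the values being logged when the warning appears, that would support the explanation above.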


Thanks for the information. I’ll check that.