pytorch_pfn_extras.training.extensions.FailOnNonNumber

class pytorch_pfn_extras.training.extensions.FailOnNonNumber(*, check_grad=True)

Bases: Extension

An extension to raise RuntimeError if the parameters or their gradients contain NaN or Inf.

Although parameters containing non-numbers such as NaN or Inf are useless in most cases, the training loop will continue to compute even if the parameters in a given optimizer diverge. This extension aims to cut such unnecessary computation short by raising RuntimeError as soon as the parameters contain NaN or Inf.

Parameters:

check_grad (bool) – Set to False to skip checking gradients.
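
A minimal usage sketch follows. The toy model, optimizer, and loop body are placeholder assumptions; only ExtensionsManager, extend, and FailOnNonNumber come from the library:

    import torch
    import pytorch_pfn_extras as ppe
    from pytorch_pfn_extras.training import extensions

    # Toy model and optimizer, for illustration only.
    model = torch.nn.Linear(4, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    manager = ppe.training.ExtensionsManager(
        model, optimizer, max_epochs=2, iters_per_epoch=100)
    # Gradients are checked too, since check_grad defaults to True.
    manager.extend(extensions.FailOnNonNumber())

    while not manager.stop_trigger:
        # The extension runs when the iteration context exits and
        # raises RuntimeError if a parameter or gradient is NaN/Inf.
        with manager.run_iteration():
            x = torch.randn(8, 4)
            loss = model(x).sum()
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()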

Methods

__init__(*[, check_grad])

finalize(manager)

Finalizes the extension.

initialize(manager)

Initializes the manager state.

load_state_dict(to_load)

on_error(manager, exc, tb)

Handles the error raised during training before finalization.

state_dict()

Serializes the extension state.

Attributes

default_name

Default name of the extension.

is_async

name

needs_model_state

priority

trigger

__call__(manager)

Invokes the extension.

Implementations should override this operator. This method is called at each iteration accepted by the corresponding trigger.

Parameters:

manager (ExtensionsManager) – Manager object to call this operator.

Return type:

None
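
Conceptually, each invocation performs a check equivalent to the sketch below (an illustrative assumption, not the library's actual implementation): every parameter tracked by an optimizer, and optionally its gradient, is scanned for non-finite values.

    import torch

    def has_non_number(optimizer: torch.optim.Optimizer,
                       check_grad: bool = True) -> bool:
        # A single NaN or Inf anywhere in a tensor makes
        # torch.isfinite(...).all() evaluate to False.
        for group in optimizer.param_groups:
            for param in group['params']:
                if not torch.isfinite(param).all():
                    return True
                if check_grad and param.grad is not None:
                    if not torch.isfinite(param.grad).all():
                        return True
        return False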

__init__(*, check_grad=True)

Parameters:

check_grad (bool) –

needs_model_state = True