pytorch_pfn_extras.nn.modules.lazy_conv.LazyInitializationMixin#
- class pytorch_pfn_extras.nn.modules.lazy_conv.LazyInitializationMixin(*args, **kwargs)#
Bases: ``object``
A mixin for modules that lazily initialize buffers and parameters.
Unlike regular modules, subclasses of this module can initialize buffers and parameters outside of the constructor (``__init__``). This allows you to, for example, initialize parameters in the ``forward`` method to determine the shape of the weight based on the initial input.

Be sure to run a "dummy" forward once, before passing ``module.parameters()`` to an optimizer, to initialize all parameters that should be trained; otherwise, weights initialized after ``module.parameters()`` is called (e.g., in the ``forward`` function) will never be trained.

Note that lazy modules cannot validate whether the shape is correct during deserialization. Also note that the initial weights may differ from those of the original (non-lazy) module even if the random seed is manually configured, because the order of initialization differs from the original one; in particular, ``module.cuda()`` may cause the initialization to run on a GPU.

The default values of lazy buffers and parameters are ``torch.Tensor([])`` and ``UninitializedParameter()``, respectively.

Methods
``__init__(*args, **kwargs)``

``state_dict(*args, **kwargs)``
    Returns a dictionary containing a whole state of the module.
Attributes

``lazy_parmeters_determined``
    Returns whether all lazy parameters are determined.
- Parameters:
args (Any) –
kwargs (Any) –
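The "dummy forward" requirement above can be illustrated with a minimal, torch-free sketch; ``LazyLinearSketch`` below is a hypothetical stand-in for a lazy module, not the actual pytorch_pfn_extras class:

```python
# Hypothetical sketch of the lazy-initialization pattern the mixin enables:
# parameters created inside forward() are invisible to any parameters()
# snapshot taken before the first ("dummy") forward pass.

class LazyLinearSketch:
    """Illustrative stand-in; not the real pytorch_pfn_extras module."""

    def __init__(self, out_features):
        self.out_features = out_features
        self.weight = None  # lazy: shape unknown until the first input arrives

    def forward(self, x):
        if self.weight is None:
            # Infer the weight shape from the initial input, as a lazy
            # module's forward() would.
            in_features = len(x)
            self.weight = [
                [0.0] * in_features for _ in range(self.out_features)
            ]
        return [sum(w * v for w, v in zip(row, x)) for row in self.weight]

    def parameters(self):
        # Only initialized parameters are visible.
        return [] if self.weight is None else [self.weight]


module = LazyLinearSketch(out_features=2)

too_early = module.parameters()   # snapshot before the dummy forward: empty
module.forward([1.0, 2.0, 3.0])   # "dummy" forward initializes the weight
after = module.parameters()       # now contains the (2, 3) weight

assert too_early == []
assert len(after) == 1 and len(after[0]) == 2 and len(after[0][0]) == 3
```

An optimizer built from ``too_early`` would train nothing, which is exactly the failure mode the note above warns about.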
- __init__(*args, **kwargs)#
- Parameters:
self (Any) –
args (Any) –
kwargs (Any) –
- Return type:
None
- lazy_buffer_names: Tuple[str, ...] = ()#
- lazy_parameter_names: Tuple[str, ...] = ()#
- property lazy_parmeters_determined: bool#
Returns whether all lazy parameters are determined.
Subclasses can perform parameter initialization after all lazy parameters are determined. Note that this may be called during ``__init__``.
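A minimal sketch of how such a property can be implemented and used, assuming ``None`` stands in for ``UninitializedParameter()``; ``MixinSketch`` is hypothetical and only mirrors the documented behavior (the ``lazy_parmeters_determined`` spelling matches the real API):

```python
# Hypothetical sketch: a subclass checks the property to defer extra setup
# until every lazy parameter has been given a concrete value.

class MixinSketch:
    lazy_parameter_names = ("weight", "bias")

    def __init__(self):
        # None stands in for UninitializedParameter() in this sketch.
        self.weight = None
        self.bias = None

    @property
    def lazy_parmeters_determined(self):
        # True once every name in lazy_parameter_names is initialized.
        return all(
            getattr(self, name) is not None
            for name in self.lazy_parameter_names
        )


m = MixinSketch()
assert not m.lazy_parmeters_determined  # nothing initialized yet
m.weight = [0.0]
m.bias = [0.0]
assert m.lazy_parmeters_determined      # all lazy parameters determined
```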
- state_dict(*args, **kwargs)#
Returns a dictionary containing a whole state of the module.
This function overrides the default behavior to exclude uninitialized parameters from serialization. This is needed because we need to discriminate between lazy parameters (``UninitializedParameter()``) and initialized empty parameters (``torch.nn.Parameter(torch.Tensor())``) during deserialization.
See the comments of ``_lazy_load_hook`` for details.
- Parameters:
self (Any) –
args (Any) –
kwargs (Any) –
- Return type:
Dict[str, Any]
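The reason for the override can be sketched without torch; both ``UninitializedParameterSketch`` and ``state_dict_sketch`` below are hypothetical illustrations of the filtering idea, not the real implementation:

```python
# Hypothetical sketch of why state_dict() must skip uninitialized entries:
# if a lazy placeholder were serialized, on load it would be
# indistinguishable from a genuinely empty, initialized parameter.

class UninitializedParameterSketch:
    """Stand-in sentinel for a not-yet-determined lazy parameter."""


def state_dict_sketch(params):
    # Exclude lazy placeholders so deserialization can tell "still lazy"
    # (key absent) apart from "initialized but empty" (key present).
    return {
        name: value
        for name, value in params.items()
        if not isinstance(value, UninitializedParameterSketch)
    }


params = {
    "weight": UninitializedParameterSketch(),  # lazy, shape not yet known
    "bias": [],                                # initialized, genuinely empty
}
saved = state_dict_sketch(params)

assert "weight" not in saved   # lazy placeholder omitted from the state
assert saved["bias"] == []     # empty-but-initialized parameter is kept
```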