NeuroTorch

neurotorch.modules package

Submodules

neurotorch.modules.base module

class neurotorch.modules.base.BaseModel(input_sizes: Dict[str, int | Dimension] | int | Dimension | Iterable[int | Dimension] | Size | None = None, output_size: Dict[str, int | Dimension] | int | Dimension | Iterable[int | Dimension] | Size | None = None, name: str = 'BaseModel', checkpoint_folder: str = 'checkpoints', device: device | None = None, input_transform: Dict[str, Callable] | List[Callable] | None = None, output_transform: Dict[str, Callable] | List[Callable] | None = None, **kwargs)

Bases: NamedModule

This class is the base class of all models.

Attributes:
  • input_sizes: The input sizes of the model.

  • input_transform (torch.nn.ModuleDict): The transforms to apply to the inputs.

  • output_sizes: The output sizes of the model.

  • output_transform (torch.nn.ModuleDict): The transforms to apply to the outputs.

  • name: The name of the model.

  • checkpoint_folder: The folder where the checkpoints are saved.

  • kwargs: Additional arguments.

__call__(inputs: Dict[str, Any] | Tensor, *args, **kwargs)

Call self as a function.

__init__(input_sizes: Dict[str, int | Dimension] | int | Dimension | Iterable[int | Dimension] | Size | None = None, output_size: Dict[str, int | Dimension] | int | Dimension | Iterable[int | Dimension] | Size | None = None, name: str = 'BaseModel', checkpoint_folder: str = 'checkpoints', device: device | None = None, input_transform: Dict[str, Callable] | List[Callable] | None = None, output_transform: Dict[str, Callable] | List[Callable] | None = None, **kwargs)

Constructor of the BaseModel class. This class is the base class of all models.

Parameters:
  • input_sizes (Union[Dict[str, DimensionLike], SizeTypes]) – The input sizes of the model.

  • output_size (Union[Dict[str, DimensionLike], SizeTypes]) – The output size of the model.

  • name (str) – The name of the model.

  • checkpoint_folder (str) – The folder where the checkpoints are saved.

  • device (torch.device) – The device of the model. If None, the default device is used.

  • input_transform (Union[Dict[str, Callable], List[Callable]]) – The transforms to apply to the inputs. The input_transform must work batch-wise.

  • output_transform (Union[Dict[str, Callable], List[Callable]]) – The transforms to apply to the outputs. The output_transform must work batch-wise.

Keyword Arguments:

kwargs – Additional arguments.
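
Example (a minimal, hypothetical sketch of a BaseModel subclass; the dict-based input convention follows the forward and apply_*_transform entries documented below, while the placeholder key "0" and the single linear layer are illustrative assumptions, not part of the NeuroTorch API):

import torch
import torch.nn as nn
from neurotorch.modules.base import BaseModel

class TinyModel(BaseModel):
    def __init__(self, **kwargs):
        super().__init__(input_sizes=4, output_size=2, name="TinyModel", **kwargs)
        self.linear = nn.Linear(4, 2)

    def forward(self, inputs, **kwargs):
        # Accept either a tensor or a dict of tensors, as in the documented signature.
        if isinstance(inputs, torch.Tensor):
            inputs = {"0": inputs}  # "0" is an assumed placeholder key
        return {name: self.linear(x) for name, x in inputs.items()}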

apply_input_transform(inputs: Dict[str, Any]) Dict[str, Tensor]

Apply the input transform to the inputs.

Parameters:

inputs (Dict[str, Any]) – dict of inputs of shape (batch_size, *input_size)

Returns:

The transformed inputs, with the same shape as the original inputs.

Return type:

Dict[str, torch.Tensor]

apply_output_transform(outputs: Dict[str, Any]) Dict[str, Tensor]

Apply the output transform to the outputs.

Parameters:

outputs (Dict[str, Any]) – dict of outputs of shape (batch_size, *output_size).

Returns:

The transformed outputs of the network.

Return type:

Dict[str, torch.Tensor]

build(*args, **kwargs) BaseModel

Build the network.

Parameters:
  • args – Not used.

  • kwargs – Not used.

Returns:

The network.

Return type:

BaseModel

property checkpoints_meta_path: str

The path to the checkpoints meta file.

Returns:

The path to the checkpoints meta file.

Return type:

str

property device: device

The device of the model.

Return type:

torch.device

forward(inputs: Dict[str, Any] | Tensor, **kwargs) Dict[str, Tensor]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

get_default_input_transform() Dict[str, Module]

Get the default input transform. The default input transform converts the inputs to tensors.

Returns:

The default input transform.

Return type:

Dict[str, nn.Module]

get_default_output_transform() Dict[str, Module]

Get the default output transform. The default output transform is an identity transform.

Returns:

The default output transform.

Return type:

Dict[str, nn.Module]

get_prediction_log_proba(inputs: Tensor, re_outputs_trace: bool = True, re_hidden_states: bool = True) Tuple[Any, Any, Any] | Tuple[Any, Any] | Any

Get the prediction log probabilities of the network.

Parameters:
  • inputs (torch.Tensor) – The inputs of the network.

  • re_outputs_trace (bool) – If True, the outputs trace will be returned.

  • re_hidden_states (bool) – If True, the hidden states will be returned.

Returns:

The prediction log probabilities.

Return type:

Union[Tuple[Any, Any, Any], Tuple[Any, Any], Any]

get_prediction_proba(inputs: Tensor, re_outputs_trace: bool = True, re_hidden_states: bool = True) Tuple[Any, Any, Any] | Tuple[Any, Any] | Any

Get the prediction probabilities of the network.

Parameters:
  • inputs (torch.Tensor) – The inputs of the network.

  • re_outputs_trace (bool) – If True, the outputs trace will be returned.

  • re_hidden_states (bool) – If True, the hidden states will be returned.

Returns:

The prediction probabilities.

Return type:

Union[Tuple[Any, Any, Any], Tuple[Any, Any], Any]

get_prediction_trace(inputs: Dict[str, Any] | Tensor, **kwargs) Dict[str, Tensor] | Tensor

Get the prediction trace of the network.

Parameters:
  • inputs (Union[Dict[str, Any], torch.Tensor]) – The inputs of the network.

  • kwargs – Additional arguments.

Returns:

The prediction trace.

Return type:

Union[Dict[str, torch.Tensor], torch.Tensor]

get_raw_prediction(inputs: Tensor, re_outputs_trace: bool = True, re_hidden_states: bool = True) Tuple[Any, Any, Any] | Tuple[Any, Any] | Any

Get the raw prediction of the network.

Parameters:
  • inputs (torch.Tensor) – The inputs of the network.

  • re_outputs_trace (bool) – If True, the outputs trace will be returned.

  • re_hidden_states (bool) – If True, the hidden states will be returned.

Returns:

The raw prediction.

Return type:

Union[Tuple[Any, Any, Any], Tuple[Any, Any], Any]

hard_update(other: BaseModel) None

Copies the weights from the other network to this network.

Parameters:

other ('BaseModel') – The other network.

Returns:

None

infer_sizes_from_inputs(inputs: Dict[str, Any] | Tensor)

Infer the input and output sizes from the inputs.

Parameters:

inputs (Union[Dict[str, Any], torch.Tensor]) – The inputs of the network.

Returns:

None

property input_sizes: Dict[str, int]
property is_built: bool
load_checkpoint(checkpoints_meta_path: str | None = None, load_checkpoint_mode: LoadCheckpointMode = LoadCheckpointMode.BEST_ITR, verbose: bool = True) dict

Load the checkpoint from the checkpoints_meta_path. If the checkpoints_meta_path is None, the default checkpoints_meta_path is used.

Parameters:
  • checkpoints_meta_path (Optional[str]) – The path to the checkpoints meta file.

  • load_checkpoint_mode (LoadCheckpointMode) – The mode to use when loading the checkpoint.

  • verbose (bool) – Whether to print the loaded checkpoint information.

Returns:

The loaded checkpoint information.

Return type:

dict

property output_sizes: Dict[str, int]
soft_update(other: BaseModel, tau: float = 0.01) None

Copies the weights from the other network to this network with a factor of tau.

Parameters:
  • other ('BaseModel') – The other network.

  • tau (float) – The interpolation factor of the copy.

Returns:

None
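
A soft update of this kind conventionally interpolates the parameters as theta_self <- tau * theta_other + (1 - tau) * theta_self. The snippet below is a plain-PyTorch illustration of that rule, not the NeuroTorch implementation:

import torch

@torch.no_grad()
def soft_update_sketch(target: torch.nn.Module, source: torch.nn.Module, tau: float = 0.01) -> None:
    # Parameter-wise Polyak averaging: target <- tau * source + (1 - tau) * target.
    for p_target, p_source in zip(target.parameters(), source.parameters()):
        p_target.mul_(1.0 - tau).add_(tau * p_source)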

to(device: device, non_blocking: bool = True, *args, **kwargs)

Moves and/or casts the parameters and buffers.

This can be called as

to(device=None, dtype=None, non_blocking=False)
to(dtype, non_blocking=False)
to(tensor, non_blocking=False)
to(memory_format=torch.channels_last)

Its signature is similar to torch.Tensor.to(), but only accepts floating point or complex dtypes. In addition, this method will only cast the floating point or complex parameters and buffers to dtype (if given). The integral parameters and buffers will be moved to device, if that is given, but with dtypes unchanged. When non_blocking is set, it tries to convert/move asynchronously with respect to the host if possible, e.g., moving CPU Tensors with pinned memory to CUDA devices.

See below for examples.

Note

This method modifies the module in-place.

Parameters:
  • device (torch.device) – the desired device of the parameters and buffers in this module

  • dtype (torch.dtype) – the desired floating point or complex dtype of the parameters and buffers in this module

  • tensor (torch.Tensor) – Tensor whose dtype and device are the desired dtype and device for all parameters and buffers in this module

  • memory_format (torch.memory_format) – the desired memory format for 4D parameters and buffers in this module (keyword only argument)

Returns:

self

Return type:

Module

Examples:

>>> # xdoctest: +IGNORE_WANT("non-deterministic")
>>> linear = nn.Linear(2, 2)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]])
>>> linear.to(torch.double)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]], dtype=torch.float64)
>>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_CUDA1)
>>> gpu1 = torch.device("cuda:1")
>>> linear.to(gpu1, dtype=torch.half, non_blocking=True)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16, device='cuda:1')
>>> cpu = torch.device("cpu")
>>> linear.to(cpu)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16)

>>> linear = nn.Linear(2, 2, bias=None).to(torch.cdouble)
>>> linear.weight
Parameter containing:
tensor([[ 0.3741+0.j,  0.2382+0.j],
        [ 0.5593+0.j, -0.4443+0.j]], dtype=torch.complex128)
>>> linear(torch.ones(3, 2, dtype=torch.cdouble))
tensor([[0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j]], dtype=torch.complex128)
to_onnx(in_viz=None)

Creates an ONNX model from the network.

Parameters:

in_viz (Any) – The input to visualize.

Returns:

The ONNX model.

training: bool
class neurotorch.modules.base.NamedModule(name: str | None = None)

Bases: Module

__init__(name: str | None = None)

Initializes internal Module state, shared by both nn.Module and ScriptModule.

property name: str

Returns the name of the module. If the name is not set, it will be set to the class name.

Returns:

The name of the module.

property name_is_set: bool

Returns whether the name of the module has been set.

Returns:

Whether the name of the module has been set.

training: bool
class neurotorch.modules.base.SizedModule(input_size: int | Dimension | Iterable[int | Dimension] | Size | None = None, output_size: int | Dimension | Iterable[int | Dimension] | Size | None = None, name: str | None = None)

Bases: NamedModule

__init__(input_size: int | Dimension | Iterable[int | Dimension] | Size | None = None, output_size: int | Dimension | Iterable[int | Dimension] | Size | None = None, name: str | None = None)

Initializes internal Module state, shared by both nn.Module and ScriptModule.

property input_size: Dimension | None
property output_size: Dimension | None
training: bool

neurotorch.modules.functions module

class neurotorch.modules.functions.PSigmoid(p: float = 1.0, learn_p: bool = True)

Bases: Module

Applies the Pseudo-Sigmoid function element-wise.

Pseudo-Sigmoid is defined as:

\[\text{PSigmoid}(x) = \frac{1}{1 + \exp(-p \odot x)}\]
__init__(p: float = 1.0, learn_p: bool = True)

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

property p
training: bool
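
A plain-PyTorch illustration of the PSigmoid formula above (the actual module additionally registers p as a learnable parameter when learn_p=True):

import torch

def psigmoid(x: torch.Tensor, p: float = 1.0) -> torch.Tensor:
    # Element-wise pseudo-sigmoid: 1 / (1 + exp(-p * x)).
    return torch.sigmoid(p * x)

x = torch.linspace(-3.0, 3.0, 7)
assert torch.allclose(psigmoid(x, p=1.0), torch.sigmoid(x))
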
class neurotorch.modules.functions.WeirdTanh(a: float = 1.0, b: float = 1.0, c: float = 1.0, d: float = 1.0, alpha: float = 1.0, beta: float = 1.0, gamma: float = 1.0, delta: float = 1.0)

Bases: Module

Applies a parametrized hyperbolic tangent function element-wise.

WeirdTanh is defined as:

\[\text{WeirdTanh}(x) = \frac{a\exp(\alpha x) - b\exp(-\beta x)} {c\exp(\gamma x) + d\exp(-\delta x)}\]

which reduces to the standard tanh(x) when all parameters are equal to 1.

__init__(a: float = 1.0, b: float = 1.0, c: float = 1.0, d: float = 1.0, alpha: float = 1.0, beta: float = 1.0, gamma: float = 1.0, delta: float = 1.0)

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
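
For reference, a plain-PyTorch illustration of the formula above; with all parameters equal to 1 it coincides with the standard tanh. This is a sketch, not the NeuroTorch implementation:

import torch

def weird_tanh(x, a=1.0, b=1.0, c=1.0, d=1.0, alpha=1.0, beta=1.0, gamma=1.0, delta=1.0):
    # (a*exp(alpha*x) - b*exp(-beta*x)) / (c*exp(gamma*x) + d*exp(-delta*x))
    num = a * torch.exp(alpha * x) - b * torch.exp(-beta * x)
    den = c * torch.exp(gamma * x) + d * torch.exp(-delta * x)
    return num / den

x = torch.linspace(-2.0, 2.0, 9)
assert torch.allclose(weird_tanh(x), torch.tanh(x), atol=1e-6)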

neurotorch.modules.sequential module

class neurotorch.modules.sequential.Sequential(layers: Iterable[Iterable[Module] | Module], name: str = 'Sequential', checkpoint_folder: str = 'checkpoints', device: device | None = None, input_transform: Dict[str, Callable] | List[Callable] | None = None, output_transform: Dict[str, Callable] | List[Callable] | None = None, **kwargs)

Bases: BaseModel

The Sequential is a neural network that is constructed by stacking layers.

[Figure: Sequential model schematic]
Attributes:
  • input_layers (torch.nn.ModuleDict): The input layers of the model.

  • hidden_layers (torch.nn.ModuleList): The hidden layers of the model.

  • output_layers (torch.nn.ModuleDict): The output layers of the model.

  • foresight_time_steps (int): The number of time steps that the model will forecast.

__init__(layers: Iterable[Iterable[Module] | Module], name: str = 'Sequential', checkpoint_folder: str = 'checkpoints', device: device | None = None, input_transform: Dict[str, Callable] | List[Callable] | None = None, output_transform: Dict[str, Callable] | List[Callable] | None = None, **kwargs)

The Sequential is a neural network that is constructed by stacking layers.

Parameters:

layers – The layers to be used in the model. One of the following structures is expected (a construction sketch is given after this parameter list):

layers = [
        [*inputs_layers, ],
        *hidden_layers,
        [*output_layers, ]
]
or
layers = [
        input_layer,
        *hidden_layers,
        output_layer
]
  • name (str) – The name of the model.

  • checkpoint_folder (str) – The folder where the checkpoints are saved.

  • device (torch.device) – The device to use.

  • input_transform (Union[Dict[str, Callable], List[Callable]]) – The transform to apply to the input. The input_transform must work on a single datum.

  • output_transform (Union[Dict[str, Callable], List[Callable]]) – The transform to apply to the output trace. The output_transform must work batch-wise.

  • kwargs – Additional keyword arguments.
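
A hypothetical construction sketch following the layers structures above. Whether plain torch.nn modules are accepted and wrapped automatically (e.g. through the wrapper classes in neurotorch.modules.wrappers) is an assumption; substitute the layer classes provided by your NeuroTorch version:

import torch.nn as nn
from neurotorch.modules.sequential import Sequential

model = Sequential(
    layers=[
        nn.Linear(8, 16),   # input layer
        nn.ReLU(),          # hidden layer
        nn.Linear(16, 2),   # output layer
    ],
    name="SketchModel",
)
model.build()  # builds the network and all its layers (see build() below)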

build() Sequential

Build the network and all its layers.

Returns:

The network.

Return type:

Sequential

build_layers()

Build the layers of the model.

Returns:

None

property device: device

The device of the model.

Return type:

torch.device

forward(inputs: Dict[str, Any] | Tensor, **kwargs) Dict[str, Tensor]

Forward pass of the model.

Parameters:
  • inputs (Union[Dict[str, Any], torch.Tensor]) – The inputs to the model where the dimensions are {input_name: (batch_size, input_size)}.

  • kwargs – Additional arguments for the forward pass.

Returns:

A dictionary of outputs where the keys are the names of the layers and the values are the outputs of the layers.

Return type:

Dict[str, torch.Tensor]

get_all_layers() List[Module]

Get all the layers of the model as a list. The order of the layers is the same as the order of the layers in the model.

Returns:

A list of all the layers of the model.

Return type:

List[nn.Module]

get_all_layers_names() List[str]

Get all the names of the layers of the model. The order of the layers is the same as the order of the layers in the model.

Returns:

A list of all the names of the layers of the model.

Return type:

List[str]

get_and_reset_regularization_loss() Tensor

Get the regularization loss as a sum of all the regularization losses of the layers. Then reset the regularization losses.

Returns:

the regularization loss.

Return type:

torch.Tensor

get_dict_of_layers() Dict[str, Module]

Get all the layers of the model as a dictionary. The order of the layers is the same as the order of the layers in the model. The keys of the dictionary are the names of the layers.

Returns:

A dictionary of all the layers of the model.

Return type:

Dict[str, nn.Module]

get_layer(name: str | None = None) Module

Get a layer of the model. If the name is None, the first layer is returned which is useful when the model has only one layer.

Parameters:

name (str) – The name of the layer.

Returns:

The layer with the given name. If the name is None, the first layer is returned.

Return type:

nn.Module

get_layers(layer_names: List[str] | None = None) List[Module]

Get the layers with the specified names.

Parameters:

layer_names (Optional[List[str]]) – The names of the layers to get.

Returns:

The layers with the specified names.

Return type:

List[nn.Module]

get_prediction_log_proba(inputs: Tensor, **kwargs) Tuple[Tensor, Any, Any] | Tuple[Tensor, Any] | Tensor

Get the prediction log probability of the model, which is the log softmax of the output of the forward pass. The log softmax is performed on the last dimension. This method is generally used for training in classification tasks.

Parameters:

inputs (torch.Tensor) – inputs to the network.

Returns:

the prediction log probability of the model.

Return type:

Union[tuple[Tensor, Any, Any], tuple[Tensor, Any], Tensor]

get_prediction_proba(inputs: Tensor, **kwargs) Any

Get the prediction probability of the model which is the softmax of the output of the forward pass. The softmax is performed on the last dimension. This method is generally used for classification.

Parameters:

inputs (torch.Tensor) – inputs to the network.

Returns:

the prediction probability of the model.

Return type:

Union[Tuple[Any, Any, Any], Tuple[Any, Any], Any]

get_raw_prediction(inputs: Tensor, **kwargs) Any

Get the raw prediction of the model which is the output of the forward pass.

Parameters:

inputs (torch.Tensor) – inputs to the network.

Returns:

the raw prediction of the model.

Return type:

Any
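
To make the relationship between these prediction helpers concrete, here is a plain-PyTorch illustration of what the docstrings above describe (softmax and log-softmax over the last dimension of the raw forward output); it is not NeuroTorch code:

import torch
import torch.nn.functional as F

raw = torch.randn(4, 10)                # stand-in for get_raw_prediction(...)
proba = F.softmax(raw, dim=-1)          # stand-in for get_prediction_proba(...)
log_proba = F.log_softmax(raw, dim=-1)  # stand-in for get_prediction_log_proba(...)

assert torch.allclose(log_proba, proba.log(), atol=1e-6)
assert torch.allclose(proba.sum(dim=-1), torch.ones(4), atol=1e-6)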

infer_sizes_from_inputs(inputs: Dict[str, Any] | Tensor)

Infer the sizes of the inputs layers from the inputs of the network. The sizes of the inputs layers are set to the size of the inputs without the batch dimension.

Parameters:

inputs (Union[Dict[str, Any], torch.Tensor]) – The inputs of the network.

Returns:

None

initialize_weights_()

Initialize the weights of the layers of the model.

Returns:

None

input_transform: torch.nn.ModuleDict
output_transform: torch.nn.ModuleDict
training: bool

neurotorch.modules.sequential_rnn module

class neurotorch.modules.sequential_rnn.SequentialRNN(layers: Iterable[Iterable[BaseLayer] | BaseLayer], foresight_time_steps: int = 0, name: str = 'SequentialRNN', checkpoint_folder: str = 'checkpoints', device: device | None = None, input_transform: Dict[str, Callable] | List[Callable] | None = None, output_transform: Dict[str, Callable] | List[Callable] | None = None, **kwargs)

Bases: Sequential

The SequentialRNN is a neural network that is constructed by stacking layers.

[Figure: Sequential model schematic]
Attributes:
  • input_layers (torch.nn.ModuleDict): The input layers of the model.

  • hidden_layers (torch.nn.ModuleList): The hidden layers of the model.

  • output_layers (torch.nn.ModuleDict): The output layers of the model.

  • foresight_time_steps (int): The number of time steps that the model will forecast.

__init__(layers: Iterable[Iterable[BaseLayer] | BaseLayer], foresight_time_steps: int = 0, name: str = 'SequentialRNN', checkpoint_folder: str = 'checkpoints', device: device | None = None, input_transform: Dict[str, Callable] | List[Callable] | None = None, output_transform: Dict[str, Callable] | List[Callable] | None = None, **kwargs)

The SequentialRNN is a neural network that is constructed by stacking layers.

Parameters:

layers – The layers to be used in the model. One of the following structures is expected (a construction sketch is given after the keyword arguments below):

layers = [
        [*inputs_layers, ],
        *hidden_layers,
        [*output_layers, ]
]
or
layers = [
        input_layer,
        *hidden_layers,
        output_layer
]
  • foresight_time_steps (int) – The number of time steps to predict in the future. When multiple inputs or outputs are given, the outputs of the network are fed back to the inputs in the same order as they were specified in the construction of the network. In other words, the first output is given to the first input, the second output is given to the second input, and so on. If there are fewer outputs than inputs, the last inputs are not considered recurrent inputs, so no outputs are fed back to them.

  • name (str) – The name of the model.

  • checkpoint_folder (str) – The folder where the checkpoints are saved.

  • device (torch.device) – The device to use.

  • input_transform (Union[Dict[str, Callable], List[Callable]]) – The transform to apply to the input. The input_transform must work on a single datum.

  • output_transform (Union[Dict[str, Callable], List[Callable]]) – The transform to apply to the output trace. The output_transform must work batch-wise.

  • kwargs – Additional keyword arguments.

Keyword Arguments:
  • out_memory_size (int) – The size of the memory buffer for the output trace. The output of each layer is stored in the memory buffer. If the memory size is not specified, it is set to foresight_time_steps if specified, otherwise to infinity. Reduce this number to 1 if you want to use less memory and you don't need the intermediate outputs. Default is foresight_time_steps if specified, otherwise inf.

  • hh_memory_size (int) – The size of the memory buffer for the hidden states. The hidden state of each layer is stored in the memory buffer. If the memory size is not specified, it is set to foresight_time_steps if specified, otherwise to infinity. Reduce this number to 1 if you want to use less memory and you don't need the intermediate hidden states. Default is foresight_time_steps if specified, otherwise inf.

  • memory_device (Optional[torch.device]) – The device to use for the memory buffer. If not specified, the memory_device is set to the device of the model. To use less cuda memory, you can set the memory_device to cpu. However, this will slow down the computation.
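
A hypothetical construction sketch showing how the memory-buffer keyword arguments above might be used. The layer classes nt.LIFLayer and nt.LILayer are assumptions about the available BaseLayer subclasses; substitute the ones provided by your NeuroTorch version:

import neurotorch as nt

model = nt.SequentialRNN(
    layers=[
        nt.LIFLayer(input_size=100, output_size=128),  # assumed layer class
        nt.LILayer(input_size=128, output_size=10),    # assumed layer class
    ],
    foresight_time_steps=0,
    out_memory_size=1,   # keep only the last output time step
    hh_memory_size=1,    # keep only the last hidden state
).build()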

build() SequentialRNN

Build the network and all its layers.

Returns:

The network.

Return type:

SequentialRNN

forward(inputs: Dict[str, Any] | Tensor, **kwargs) Tuple[Dict[str, Tensor], Dict[str, Tuple[Tensor, ...]]]

Forward pass of the model.

When integrating a time series:
  • The initial conditions are integrated for <time_steps> steps.

  • The remaining <foresight_time_steps - 1> time steps are predicted from the initial conditions.

  • Note that the last output of the integration of the initial conditions is both the input for the integration of the remaining time steps and the first prediction.

Example:
        time_series = [t_0, t_1, ..., t_N]
        [t_0, t_1] -> Initial conditions; t_1 then generates the first prediction (t_2).
        [t_2, t_3, ..., t_N] -> The remaining time steps are predicted from the initial conditions.
Parameters:
  • inputs (Union[Dict[str, Any], torch.Tensor]) – The inputs to the model where the dimensions are {input_name: (batch_size, time_steps, input_size)}. If the inputs have the shape (batch_size, input_size), then time_steps is 1. All the inputs should have the same number of time steps; inputs with fewer time steps will be padded with zeros.

  • kwargs – Additional arguments for the forward pass.

Keyword Arguments:

foresight_time_steps (int) – The number of time steps to forecast. Default: the value of the attribute foresight_time_steps.

Returns:

A tuple of two dictionaries. The first dictionary contains the outputs of the model and the second dictionary contains the hidden states of the model. The keys of the dictionaries are the names of the layers. The values of the dictionaries are lists of tensors. The length of the lists is the number of time steps.

Return type:

Tuple[Dict[str, torch.Tensor], Dict[str, Tuple[torch.Tensor, …]]]

get_and_reset_regularization_loss() Tensor

Get the regularization loss as a sum of all the regularization losses of the layers. Then reset the regularization losses.

Returns:

the regularization loss.

Return type:

torch.Tensor

get_fmt_prediction(inputs: Tensor, lambda_func: Callable[[Tensor], Tensor] = <function SequentialRNN.<lambda>>, re_outputs_trace: bool = True, re_hidden_states: bool = True) Tuple[Any, Any, Any] | Tuple[Any, Any] | Any

Get the prediction of the model, which is the output of the forward pass formatted by applying lambda_func to the output trace (by default, taking the last item on the time dimension).

Parameters:
  • inputs (torch.Tensor) – inputs to the network.

  • lambda_func – the function to apply to the output trace. Default is to take the last item on the time dimension.

  • re_outputs_trace (bool) – Whether to return the outputs trace. Default is True.

  • re_hidden_states (bool) – Whether to return the hidden states. Default is True.

Returns:

the formatted prediction of the model.

Return type:

Union[Tuple[Any, Any, Any], Tuple[Any, Any], Any]

get_last_prediction(inputs: Tensor, re_outputs_trace: bool = True, re_hidden_states: bool = True) Tuple[Any, Any, Any] | Tuple[Any, Any] | Any

Get the prediction of the model which is the output of the forward pass and get the last item on the time dimension.

Parameters:
  • inputs (torch.Tensor) – inputs to the network.

  • re_outputs_trace (bool) – Whether to return the outputs trace. Default is True.

  • re_hidden_states (bool) – Whether to return the hidden states. Default is True.

Returns:

the last prediction of the model.

Return type:

Union[Tuple[Any, Any, Any], Tuple[Any, Any], Any]

get_max_prediction(inputs: Tensor, re_outputs_trace: bool = True, re_hidden_states: bool = True) Tuple[Any, Any, Any] | Tuple[Any, Any] | Any

Get the prediction of the model which is the output of the forward pass and apply the max operation on the time dimension.

Parameters:
  • inputs (torch.Tensor) – inputs to the network.

  • re_outputs_trace (bool) – Whether to return the outputs trace. Default is True.

  • re_hidden_states (bool) – Whether to return the hidden states. Default is True.

Returns:

the max prediction of the model.

Return type:

Union[Tuple[Any, Any, Any], Tuple[Any, Any], Any]

get_mean_prediction(inputs: Tensor, re_outputs_trace: bool = True, re_hidden_states: bool = True) Tuple[Any, Any, Any] | Tuple[Any, Any] | Any

Get the prediction of the model which is the output of the forward pass and apply the mean operation on the time dimension.

Parameters:
  • inputs (torch.Tensor) – inputs to the network.

  • re_outputs_trace (bool) – Whether to return the outputs trace. Default is True.

  • re_hidden_states (bool) – Whether to return the hidden states. Default is True.

Returns:

the mean prediction of the model.

Return type:

Union[Tuple[Any, Any, Any], Tuple[Any, Any], Any]

get_prediction_log_proba(inputs: Tensor, re_outputs_trace: bool = True, re_hidden_states: bool = True) Tuple[Tensor, Any, Any] | Tuple[Tensor, Any] | Tensor

Get the prediction log probability of the model, which is the log softmax of the output of the forward pass. The log softmax is performed on the time dimension. This method is generally used for training in classification tasks.

Parameters:
  • inputs (torch.Tensor) – inputs to the network.

  • re_outputs_trace (bool) – Whether to return the outputs trace. Default is True.

  • re_hidden_states (bool) – Whether to return the hidden states. Default is True.

Returns:

the prediction log probability of the model.

Return type:

Union[tuple[Tensor, Any, Any], tuple[Tensor, Any], Tensor]

get_prediction_proba(inputs: Tensor, re_outputs_trace: bool = True, re_hidden_states: bool = True) Tuple[Any, Any, Any] | Tuple[Any, Any] | Any

Get the prediction probability of the model which is the softmax of the output of the forward pass. The softmax is performed on the time dimension. This method is generally used for classification.

Parameters:
  • inputs (torch.Tensor) – inputs to the network.

  • re_outputs_trace (bool) – Whether to return the outputs trace. Default is True.

  • re_hidden_states (bool) – Whether to return the hidden states. Default is True.

Returns:

the prediction probability of the model.

Return type:

Union[Tuple[Any, Any, Any], Tuple[Any, Any], Any]

get_prediction_trace(inputs: Dict[str, Any] | Tensor, **kwargs) Dict[str, Tensor] | Tensor | Tuple[Tensor, ...]

Returns the prediction trace for the given inputs. This method is used for time series prediction.

Parameters:
  • inputs (Union[Dict[str, Any], torch.Tensor]) – inputs to the network.

  • kwargs – kwargs to be passed to the forward method.

Keyword Arguments:
  • foresight_time_steps (int) – The number of time steps to predict. Default is self.foresight_time_steps. Note: if foresight_time_steps is specified, make sure that the attributes out_memory_size and hh_memory_size are set accordingly.

  • return_hidden_states (bool) – if True, returns the hidden states of the model. Default is False.

  • trunc_time_steps (int) – number of time steps to truncate the prediction trace. Default is None.

Returns:

the prediction trace.

Return type:

Union[Dict[str, torch.Tensor], torch.Tensor, Tuple[torch.Tensor, …]]
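
Continuing the hypothetical construction sketch given after the constructor's keyword arguments above, a usage sketch of this method for time-series forecasting (the input shape and the forecast horizon are illustrative assumptions):

import torch

# Assuming `model` is the built SequentialRNN from the sketch above.
x = torch.rand(8, 10, 100)  # (batch_size, time_steps, input_size)
trace = model.get_prediction_trace(x, foresight_time_steps=5)
# `trace` contains the predicted time steps that follow the integration of the
# initial conditions (see the notes under forward() above).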

get_raw_prediction(inputs: Tensor, re_outputs_trace: bool = True, re_hidden_states: bool = True) Tuple[Any, Any] | Any

Get the raw prediction of the model which is the output of the forward pass.

Parameters:
  • inputs (torch.Tensor) – inputs to the network.

  • re_outputs_trace (bool) – Whether to return the outputs trace. Default is True.

  • re_hidden_states (bool) – Whether to return the hidden states. Default is True.

Returns:

the raw prediction of the model.

Return type:

Union[Tuple[Any, Any], Any]

property hh_memory_size: int

Get the size of the hidden state memory buffer.

Returns:

The size of the hidden state memory buffer.

Return type:

int

input_transform: torch.nn.ModuleDict
property out_memory_size: int

Get the size of the output memory buffer.

Returns:

The size of the output memory buffer.

Return type:

int

output_transform: torch.nn.ModuleDict
training: bool

neurotorch.modules.spike_funcs module

class neurotorch.modules.spike_funcs.HeavisidePhiApprox(*args, **kwargs)

Bases: SpikeFunction

Implementation of the spike function. The spike function is a differentiable version of the Heaviside function. The Heaviside function is defined in the doc of SpikeFunction. The backward pass of this function uses the approximation of the Heaviside derivative from Bellec et al. [BSS+20]. This approximation is defined in equation (1).

(1)\[\begin{equation} \psi_j^t = \frac{\gamma_\text{pd}}{v_{\text{th}}} \text{max}\left(0, 1 - \left\vert\frac{v_j^t - A_j^t}{v_\text{th}}\right\vert\right) \end{equation}\]
[BSS+20]

Guillaume Bellec, Franz Scherr, Anand Subramoney, Elias Hajek, Darjan Salaj, Robert Legenstein, and Wolfgang Maass. A solution to the learning dilemma for recurrent networks of spiking neurons. Nature Communications, 11(1):3625, 2020. URL: https://www.nature.com/articles/s41467-020-17236-y (visited on 2021-12-18), doi:10.1038/s41467-020-17236-y.

[Gro88]

Stephen Grossberg. Nonlinear neural networks: principles, mechanisms, and architectures. Neural Networks, 1(1):17–61, 1988. URL: https://www.sciencedirect.com/science/article/pii/0893608088900214, doi:https://doi.org/10.1016/0893-6080(88)90021-4.

[Izh07]

Eugene M. Izhikevich. Dynamical Systems in Neuroscience. MIT Press, 2007. ISBN 978-0-262-09043-8.

[NMZ19]

Emre O. Neftci, Hesham Mostafa, and Friedemann Zenke. Surrogate gradient learning in spiking neural networks: bringing the power of gradient-based optimization to spiking neural networks. IEEE Signal Processing Magazine, 36(6):51–63, 2019. Conference Name: IEEE Signal Processing Magazine. doi:10.1109/MSP.2019.2931595.

[PDD22]

Vincent Painchaud, Nicolas Doyon, and Patrick Desrosiers. Beyond wilson-cowan dynamics: oscillations and chaos without inhibition. 2022. URL: https://arxiv.org/abs/2204.00583, doi:10.48550/ARXIV.2204.00583.

[PAS+]

Matthew G. Perich, Charlotte Arlt, Sofia Soares, Megan E. Young, Clayton P. Mosher, Juri Minxha, Eugene Carter, Ueli Rutishauser, Peter H. Rudebeck, Christopher D. Harvey, and Kanaka Rajan. Inferring brain-wide interactions using data-constrained recurrent neural network models. Pages: 2020.12.18.423348 Section: New Results. URL: https://www.biorxiv.org/content/10.1101/2020.12.18.423348v2 (visited on 2022-11-06), doi:10.1101/2020.12.18.423348.

[VRA05]

Tim P. Vogels, Kanaka Rajan, and L.F. Abbott. Neural network dynamics. Annual Review of Neuroscience, 28(1):357–376, 2005. PMID: 16022600. URL: https://doi.org/10.1146/annurev.neuro.28.061604.135637, arXiv:https://doi.org/10.1146/annurev.neuro.28.061604.135637, doi:10.1146/annurev.neuro.28.061604.135637.

[WC72]

Hugh R Wilson and Jack D Cowan. Excitatory and inhibitory interactions in localized populations of model neurons. Biophysical journal, 12(1):1–24, 1972.

[ZSZ+]

Chunyuan Zhang, Qi Song, Hui Zhou, Yigui Ou, Hongyao Deng, and Laurence Tianruo Yang. Revisiting recursive least squares for training deep neural networks. URL: http://arxiv.org/abs/2109.03220 (visited on 2022-11-06), arXiv:2109.03220 [cs].

static backward(ctx: FunctionCtx, grad_outputs)

The implementation of the equation (1).

Parameters:
  • ctx (torch.autograd.function.FunctionCtx) – The context of the function. It is used to retrieve information from the forward pass.

  • grad_outputs (torch.Tensor) – The gradient of the loss with respect to the output of the forward pass.

Returns:

The gradient of the loss with respect to the input of the forward pass.

epsilon = 1e-05
static pseudo_derivative(inputs, threshold, gamma)
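
A plain-PyTorch illustration of equation (1); mapping v_th and gamma_pd onto the threshold and gamma arguments of pseudo_derivative is an assumption about the implementation:

import torch

def pseudo_derivative_sketch(v: torch.Tensor, a: torch.Tensor, v_th: float = 1.0, gamma_pd: float = 0.3) -> torch.Tensor:
    # psi = (gamma_pd / v_th) * max(0, 1 - |(v - a) / v_th|), as in equation (1).
    return (gamma_pd / v_th) * torch.clamp(1.0 - torch.abs((v - a) / v_th), min=0.0)

v = torch.linspace(-2.0, 2.0, 5)
print(pseudo_derivative_sketch(v, torch.zeros_like(v)))
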
class neurotorch.modules.spike_funcs.HeavisideSigmoidApprox(*args, **kwargs)

Bases: SpikeFunction

Implementation of the spike function. The spike function is a differentiable version of the Heaviside function. The Heaviside function is defined in the doc of SpikeFunction. The backward pass of this function uses the first derivative of the fast sigmoid function defined in equation (2). This derivative, shown in equation (3), is the surrogate gradient used in Zenke & Ganguli (2018).

(2)\[\begin{equation} S(x) = \frac{1}{1 + e^{-x}} \end{equation}\]
(3)\[\begin{equation} S'(x) \approx \frac{x}{\left(1 + \gamma\vert{x - thr}\vert\right)^2} \end{equation}\]
static backward(ctx: FunctionCtx, grad_outputs: Tensor) Any

The implementation of the equation (3).

Parameters:
  • ctx (torch.autograd.function.FunctionCtx) – The context of the function. It is used to retrieve information from the forward pass.

  • grad_outputs (torch.Tensor) – The gradient of the loss with respect to the output of the forward pass.

Returns:

The gradient of the loss with respect to the input of the forward pass.

class neurotorch.modules.spike_funcs.SpikeFuncType(value)

Bases: Enum

An enumeration.

FastSigmoid = 0
Phi = 1
class neurotorch.modules.spike_funcs.SpikeFunction(*args, **kwargs)

Bases: Function

Implementation of the spike function. The spike function is a differentiable version of the Heaviside function. The Heaviside function is defined as the equation (4). The backward pass of this function has to be an approximation of the derivative of the Heaviside function.

(4)\[\begin{split}\begin{equation} H(x, thr) = \left\{ \begin{matrix} 1 & \text{ if } x > thr; \\ 0 & \text{ else}. \end{matrix} \right. \end{equation}\end{split}\]
static backward(ctx: FunctionCtx, grad_outputs)

In the backward pass, we receive a tensor containing the gradient of the loss with respect to the output, and we need to compute the surrogate gradient of the loss with respect to the input. Here we use the normalized negative part of a fast sigmoid, as was done in Zenke & Ganguli (2018).

static forward(ctx: FunctionCtx, inputs: Tensor, threshold: Tensor = tensor(1.), gamma: Tensor = tensor(1.)) Tensor

The forward pass of the spike function is the Heaviside function. See the heaviside equation.

Parameters:
  • ctx (torch.autograd.function.FunctionCtx) – The context of the function. It is used to store information for the backward pass. Use the method ctx.save_for_backward() to store information.

  • inputs (torch.Tensor) – The input tensor.

  • threshold (torch.Tensor) – The threshold of the spike function.

  • gamma (torch.Tensor) – The gamma parameter of the spike function. This parameter is used in the backward pass to increase the gradient of the spike function. See child classes for more information.

Returns:

The output of the spike function.

Return type:

torch.Tensor
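
Since these classes subclass torch.autograd.Function, they are invoked through .apply(), as in the hedged sketch below; how NeuroTorch layers call them internally, and the exact gradient plumbing of each subclass, may differ:

import torch
from neurotorch.modules.spike_funcs import HeavisideSigmoidApprox

membrane_potential = torch.randn(4, 10, requires_grad=True)
threshold = torch.tensor(1.0)
gamma = torch.tensor(1.0)

# Forward: Heaviside step; backward: the surrogate gradient of equation (3).
spikes = HeavisideSigmoidApprox.apply(membrane_potential, threshold, gamma)
spikes.sum().backward()
print(membrane_potential.grad.shape)  # torch.Size([4, 10])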

neurotorch.modules.utils module

class neurotorch.modules.utils.DimensionsCat(start_axis: int, end_axis: int = -1)

Bases: object

__call__(inputs: Tensor)

Call self as a function.

__init__(start_axis: int, end_axis: int = -1)
property axes

neurotorch.modules.wrappers module

class neurotorch.modules.wrappers.NamedModuleWrapper(module: Module, name: str | None = None)

Bases: NamedModule

Wrapper for a module that does not inherit from NamedModule.

__init__(module: Module, name: str | None = None)

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(*args, **kwargs)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class neurotorch.modules.wrappers.SizedModuleWrapper(module: Module, input_size: int | None = None, output_size: int | None = None, name: str | None = None)

Bases: SizedModule

Wrapper for a module that does not inherit from SizedModule.

__init__(module: Module, input_size: int | None = None, output_size: int | None = None, name: str | None = None)

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(*args, **kwargs)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

infer_input_size()
infer_output_size()
training: bool

Module contents