speechbrain.nnet.quaternion_networks.q_RNN module

Library implementing quaternion-valued recurrent neural networks.

Authors
  • Titouan Parcollet 2020

Summary

Classes:

QLSTM

This class implements a quaternion-valued LSTM, as first introduced in: "Quaternion Recurrent Neural Networks", Parcollet T. et al.

QLSTM_Layer

This class implements a quaternion-valued LSTM layer.

QLiGRU

This class implements a quaternion-valued Light GRU (liGRU).

QLiGRU_Layer

This class implements a quaternion-valued Light-Gated Recurrent Unit (liGRU) layer.

QRNN

This class implements a vanilla quaternion-valued RNN.

QRNN_Layer

This class implements a quaternion-valued recurrent layer.

Reference

class speechbrain.nnet.quaternion_networks.q_RNN.QLSTM(hidden_size, input_shape, num_layers=1, bias=True, dropout=0.0, bidirectional=False, init_criterion='glorot', weight_init='quaternion', autograd=True)[source]

Bases: Module

This class implements a quaternion-valued LSTM, as first introduced in: “Quaternion Recurrent Neural Networks”, Parcollet T. et al.

Input format is (batch, time, fea) or (batch, time, fea, channel). In the latter case, the last two dimensions are merged into one: (batch, time, fea * channel).

Parameters:
  • hidden_size (int) – Number of output neurons (i.e., the dimensionality of the output). The specified value is in terms of quaternion-valued neurons, so the real-valued output dimension is 4*hidden_size.

  • input_shape (tuple) – The expected shape of the input tensor.

  • num_layers (int, optional) – Number of layers to employ in the RNN architecture (default 1).

  • bias (bool, optional) – If True, the additive bias b is adopted (default True).

  • dropout (float, optional) – It is the dropout factor (must be between 0 and 1) (default 0.0).

  • bidirectional (bool, optional) – If True, a bidirectional model that scans the sequence both right-to-left and left-to-right is used (default False).

  • init_criterion (str, optional) – (glorot, he). This parameter controls the initialization criterion of the weights. It is combined with weight_init to build the initialization method of the quaternion-valued weights (default “glorot”).

  • weight_init (str, optional) – (quaternion, unitary). This parameter defines the initialization procedure of the quaternion-valued weights. “quaternion” will generate random quaternion weights following the init_criterion and the quaternion polar form. “unitary” will normalize the weights to have unit norm (default “quaternion”). More details in: “Quaternion Recurrent Neural Networks”, Parcollet T. et al.

  • autograd (bool, optional) – When True, the default PyTorch autograd will be used. When False, a custom backpropagation will be used, reducing memory consumption by a factor of 3 to 4 at the cost of being roughly 2x slower (default True).

Example

>>> inp_tensor = torch.rand([10, 16, 40])
>>> rnn = QLSTM(hidden_size=16, input_shape=inp_tensor.shape)
>>> out_tensor, _ = rnn(inp_tensor)
>>> out_tensor.shape
torch.Size([10, 16, 64])
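
The 4-D input case described above follows the same pattern. A brief sketch, with shapes inferred from the flattening rule (fea * channel) rather than quoted from the library documentation:

>>> inp_tensor = torch.rand([10, 16, 20, 2])  # (batch, time, fea, channel)
>>> rnn = QLSTM(hidden_size=16, input_shape=inp_tensor.shape)
>>> out_tensor, _ = rnn(inp_tensor)  # internally flattened to (10, 16, 40)
>>> out_tensor.shape  # output width is 4 * hidden_size regardless of input width
torch.Size([10, 16, 64])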
forward(x, hx: Tensor | None = None)[source]

Returns the output of the quaternion LSTM.

Parameters:
  • x (torch.Tensor) – Input tensor.

  • hx (torch.Tensor, optional) – Initial hidden states (default None).

Returns:

  • output (torch.Tensor) – Output of the quaternion LSTM.

  • hh (torch.Tensor) – Hidden states.
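
Because forward both consumes and returns hidden states, a long sequence can be processed in chunks by feeding hh back in. A minimal sketch, assuming the time dimension may differ between calls while the batch size stays fixed; hh is treated as an opaque state here:

>>> rnn = QLSTM(hidden_size=16, input_shape=[10, 16, 40])
>>> chunk_a, chunk_b = torch.rand([10, 8, 40]), torch.rand([10, 8, 40])
>>> out_a, hh = rnn(chunk_a)
>>> out_b, _ = rnn(chunk_b, hx=hh)  # continue from the state left by chunk_a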

class speechbrain.nnet.quaternion_networks.q_RNN.QLSTM_Layer(input_size, hidden_size, num_layers, batch_size, dropout=0.0, bidirectional=False, init_criterion='glorot', weight_init='quaternion', autograd='true')[source]

Bases: Module

This class implements a quaternion-valued LSTM layer.

Parameters:
  • input_size (int) – Feature dimensionality of the input tensors (in terms of real values).

  • hidden_size (int) – Number of output values (in terms of real values).

  • num_layers (int, optional) – Number of layers to employ in the RNN architecture (default 1).

  • batch_size (int) – Batch size of the input tensors.

  • dropout (float, optional) – It is the dropout factor (must be between 0 and 1) (default 0.0).

  • bidirectional (bool, optional) – If True, a bidirectional model that scans the sequence both right-to-left and left-to-right is used (default False).

  • init_criterion (str, optional) – (glorot, he). This parameter controls the initialization criterion of the weights. It is combined with weight_init to build the initialization method of the quaternion-valued weights (default “glorot”).

  • weight_init (str, optional) – (quaternion, unitary). This parameter defines the initialization procedure of the quaternion-valued weights. “quaternion” will generate random quaternion weights following the init_criterion and the quaternion polar form. “unitary” will normalize the weights to have unit norm (default “quaternion”). More details in: “Quaternion Recurrent Neural Networks”, Parcollet T. et al.

  • autograd (bool, optional) – When True, the default PyTorch autograd will be used. When False, a custom backpropagation will be used, reducing memory consumption by a factor of 3 to 4 at the cost of being roughly 2x slower (default True).

forward(x: Tensor, hx: Tensor | None = None) → Tensor[source]

Returns the output of the quaternion LSTM layer.

Parameters:
  • x (torch.Tensor) – Input tensor.

  • hx (torch.Tensor, optional) – Initial hidden states (default None).

Returns:

h – The output of the quaternion LSTM layer.

Return type:

torch.Tensor
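
QLSTM_Layer is the single-layer building block that QLSTM stacks. A hedged sketch of direct use; note that input_size and hidden_size here count real values (per the parameter descriptions above, and presumably multiples of 4 given the quaternion structure), and that batch_size is fixed at construction:

>>> layer = QLSTM_Layer(input_size=40, hidden_size=64, num_layers=1, batch_size=10)
>>> x = torch.rand([10, 16, 40])
>>> h = layer(x)  # a single torch.Tensor, per the return type above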

class speechbrain.nnet.quaternion_networks.q_RNN.QRNN(hidden_size, input_shape, nonlinearity='tanh', num_layers=1, bias=True, dropout=0.0, bidirectional=False, init_criterion='glorot', weight_init='quaternion', autograd=True)[source]

Bases: Module

This class implements a vanilla quaternion-valued RNN.

Input format is (batch, time, fea) or (batch, time, fea, channel). In the latter case, the last two dimensions are merged into one: (batch, time, fea * channel).

Parameters:
  • hidden_size (int) – Number of output neurons (i.e., the dimensionality of the output). The specified value is in terms of quaternion-valued neurons, so the real-valued output dimension is 4*hidden_size.

  • input_shape (tuple) – Expected shape of the input tensor.

  • nonlinearity (str, optional) – Type of nonlinearity (tanh, relu) (default “tanh”).

  • num_layers (int, optional) – Number of layers to employ in the RNN architecture (default 1).

  • bias (bool, optional) – If True, the additive bias b is adopted (default True).

  • dropout (float, optional) – It is the dropout factor (must be between 0 and 1) (default 0.0).

  • bidirectional (bool, optional) – If True, a bidirectional model that scans the sequence both right-to-left and left-to-right is used (default False).

  • init_criterion (str, optional) – (glorot, he). This parameter controls the initialization criterion of the weights. It is combined with weight_init to build the initialization method of the quaternion-valued weights (default “glorot”).

  • weight_init (str, optional) – (quaternion, unitary). This parameter defines the initialization procedure of the quaternion-valued weights. “quaternion” will generate random quaternion weights following the init_criterion and the quaternion polar form. “unitary” will normalize the weights to have unit norm (default “quaternion”). More details in: “Quaternion Recurrent Neural Networks”, Parcollet T. et al.

  • autograd (bool, optional) – When True, the default PyTorch autograd will be used. When False, a custom backpropagation will be used, reducing memory consumption by a factor of 3 to 4 at the cost of being roughly 2x slower (default True).

Example

>>> inp_tensor = torch.rand([10, 16, 40])
>>> rnn = QRNN(hidden_size=16, input_shape=inp_tensor.shape)
>>> out_tensor, _ = rnn(inp_tensor)
>>> out_tensor.shape
torch.Size([10, 16, 64])
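
Depth and activation are configurable; a short sketch with ReLU activations and two stacked layers (the output width stays 4 * hidden_size, following the parameter description above):

>>> inp_tensor = torch.rand([10, 16, 40])
>>> rnn = QRNN(hidden_size=16, input_shape=inp_tensor.shape,
...            nonlinearity='relu', num_layers=2, dropout=0.1)
>>> out_tensor, _ = rnn(inp_tensor)
>>> out_tensor.shape
torch.Size([10, 16, 64])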
forward(x, hx: Tensor | None = None)[source]

Returns the output of the vanilla QuaternionRNN.

Parameters:
  • x (torch.Tensor) – Input tensor.

  • hx (torch.Tensor, optional) – Initial hidden states (default None).

Returns:

  • output (torch.Tensor) – Output of the quaternion RNN.

  • hh (torch.Tensor) – Hidden states.

class speechbrain.nnet.quaternion_networks.q_RNN.QRNN_Layer(input_size, hidden_size, num_layers, batch_size, dropout=0.0, nonlinearity='tanh', bidirectional=False, init_criterion='glorot', weight_init='quaternion', autograd='true')[source]

Bases: Module

This class implements a quaternion-valued recurrent layer.

Parameters:
  • input_size (int) – Feature dimensionality of the input tensors (in terms of real values).

  • hidden_size (int) – Number of output values (in terms of real values).

  • num_layers (int, optional) – Number of layers to employ in the RNN architecture (default 1).

  • batch_size (int) – Batch size of the input tensors.

  • dropout (float, optional) – It is the dropout factor (must be between 0 and 1) (default 0.0).

  • nonlinearity (str, optional) – Type of nonlinearity (tanh, relu) (default “tanh”).

  • bidirectional (bool, optional) – If True, a bidirectional model that scans the sequence both right-to-left and left-to-right is used (default False).

  • init_criterion (str, optional) – (glorot, he). This parameter controls the initialization criterion of the weights. It is combined with weight_init to build the initialization method of the quaternion-valued weights (default “glorot”).

  • weight_init (str, optional) – (quaternion, unitary). This parameter defines the initialization procedure of the quaternion-valued weights. “quaternion” will generate random quaternion weights following the init_criterion and the quaternion polar form. “unitary” will normalize the weights to have unit norm (default “quaternion”). More details in: “Quaternion Recurrent Neural Networks”, Parcollet T. et al.

  • autograd (bool, optional) – When True, the default PyTorch autograd will be used. When False, a custom backpropagation will be used, reducing memory consumption by a factor of 3 to 4 at the cost of being roughly 2x slower (default True).

forward(x: Tensor, hx: Tensor | None = None) → Tensor[source]

Returns the output of the quaternion RNN layer.

Parameters:
  • x (torch.Tensor) – Input tensor.

  • hx (torch.Tensor, optional) – Initial hidden states (default None).

Returns:

h – Output of the quaternion RNN layer.

Return type:

torch.Tensor
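
As with QLSTM_Layer, QRNN_Layer can be used on its own. A brief sketch under the same real-valued size conventions (batch_size fixed at construction):

>>> layer = QRNN_Layer(input_size=40, hidden_size=64, num_layers=1,
...                    batch_size=10, nonlinearity='tanh')
>>> x = torch.rand([10, 16, 40])
>>> h = layer(x)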

class speechbrain.nnet.quaternion_networks.q_RNN.QLiGRU(hidden_size, input_shape, nonlinearity='leaky_relu', num_layers=1, bias=True, dropout=0.0, bidirectional=False, init_criterion='glorot', weight_init='quaternion', autograd=True)[source]

Bases: Module

This class implements a quaternion-valued Light GRU (liGRU).

The liGRU is a single-gate GRU model based on batch normalization, ReLU activations, and recurrent dropout. For more info see:

“M. Ravanelli, P. Brakel, M. Omologo, Y. Bengio, Light Gated Recurrent Units for Speech Recognition, in IEEE Transactions on Emerging Topics in Computational Intelligence, 2018” (https://arxiv.org/abs/1803.10225)

To speed it up, the layer is compiled with the PyTorch just-in-time compiler (JIT) right before use.

It accepts input tensors formatted as (batch, time, fea). In the case of 4D inputs like (batch, time, fea, channel), the tensor is flattened to (batch, time, fea*channel).

Parameters:
  • hidden_size (int) – Number of output neurons (i.e., the dimensionality of the output). The specified value is in terms of quaternion-valued neurons, so the real-valued output dimension is 4*hidden_size.

  • input_shape (tuple) – Expected shape of the input.

  • nonlinearity (str) – Type of nonlinearity (tanh, relu).

  • num_layers (int) – Number of layers to employ in the RNN architecture.

  • bias (bool) – If True, the additive bias b is adopted.

  • dropout (float) – It is the dropout factor (must be between 0 and 1).

  • bidirectional (bool) – If True, a bidirectional model that scans the sequence both right-to-left and left-to-right is used.

  • init_criterion (str, optional) – (glorot, he). This parameter controls the initialization criterion of the weights. It is combined with weight_init to build the initialization method of the quaternion-valued weights (default “glorot”).

  • weight_init (str, optional) – (quaternion, unitary). This parameter defines the initialization procedure of the quaternion-valued weights. “quaternion” will generate random quaternion-valued weights following the init_criterion and the quaternion polar form. “unitary” will normalize the weights to have unit norm (default “quaternion”). More details in: “Deep Quaternion Networks”, Gaudet C. J. and Maida A. S.

  • autograd (bool, optional) – When True, the default PyTorch autograd will be used. When False, a custom backpropagation will be used, reducing memory consumption by a factor of 3 to 4 at the cost of being roughly 2x slower (default True).

Example

>>> inp_tensor = torch.rand([10, 16, 40])
>>> rnn = QLiGRU(input_shape=inp_tensor.shape, hidden_size=16)
>>> out_tensor, _ = rnn(inp_tensor)
>>> out_tensor.shape
torch.Size([10, 16, 64])
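
The autograd flag trades compute for memory, as described in the parameter list. A hedged sketch of the low-memory configuration; the outputs should match the default setting, only resource usage differs:

>>> rnn = QLiGRU(input_shape=[10, 16, 40], hidden_size=16, autograd=False)
>>> out_tensor, _ = rnn(torch.rand([10, 16, 40]))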
forward(x, hx: Tensor | None = None)[source]

Returns the output of the quaternion liGRU.

Parameters:
  • x (torch.Tensor) – Input tensor.

  • hx (torch.Tensor, optional) – Initial hidden states (default None).

Returns:

  • output (torch.Tensor) – Output of the quaternion liGRU.

  • hh (torch.Tensor) – Hidden states.

class speechbrain.nnet.quaternion_networks.q_RNN.QLiGRU_Layer(input_size, hidden_size, num_layers, batch_size, dropout=0.0, nonlinearity='leaky_relu', normalization='batchnorm', bidirectional=False, init_criterion='glorot', weight_init='quaternion', autograd=True)[source]

Bases: Module

This class implements a quaternion-valued Light-Gated Recurrent Unit (liGRU) layer.

Parameters:
  • input_size (int) – Feature dimensionality of the input tensors.

  • hidden_size (int) – Number of output values.

  • num_layers (int) – Number of layers to employ in the RNN architecture.

  • batch_size (int) – Batch size of the input tensors.

  • dropout (float) – It is the dropout factor (must be between 0 and 1).

  • nonlinearity (str) – Type of nonlinearity (tanh, relu).

  • normalization (str) – The type of normalization to use (batchnorm or none).

  • bidirectional (bool) – If True, a bidirectional model that scans the sequence both right-to-left and left-to-right is used.

  • init_criterion (str, optional) – (glorot, he). This parameter controls the initialization criterion of the weights. It is combined with weight_init to build the initialization method of the quaternion-valued weights (default “glorot”).

  • weight_init (str, optional) – (quaternion, unitary). This parameter defines the initialization procedure of the quaternion-valued weights. “quaternion” will generate random quaternion weights following the init_criterion and the quaternion polar form. “unitary” will normalize the weights to have unit norm (default “quaternion”). More details in: “Deep Quaternion Networks”, Gaudet C. J. and Maida A. S.

  • autograd (bool, optional) – When True, the default PyTorch autograd will be used. When False, a custom backpropagation will be used, reducing memory consumption by a factor of 3 to 4 at the cost of being roughly 2x slower (default True).

forward(x: Tensor, hx: Tensor | None = None) → Tensor[source]

Returns the output of the quaternion liGRU layer.

Parameters:
  • x (torch.Tensor) – Input tensor.

  • hx (torch.Tensor, optional) – Initial hidden states (default None).

Returns:

h – Output of the quaternion liGRU layer.

Return type:

torch.Tensor
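
To close, a hedged sketch of using the layer directly with the normalization option; sizes are assumed to follow the same real-valued convention as the other *_Layer classes, and batch_size is fixed at construction:

>>> layer = QLiGRU_Layer(input_size=40, hidden_size=64, num_layers=1,
...                      batch_size=10, normalization='batchnorm')
>>> x = torch.rand([10, 16, 40])
>>> h = layer(x)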