Shortcuts

DNN building blocks

Mask estimators

Ready-to-use

asteroid.masknn.blocks.TDConvNet(*args, **kwargs)[source]

Temporal Convolutional network used in ConvTasnet.

Parameters:
  • in_chan (int) – Number of input filters.
  • n_src (int) – Number of masks to estimate.
  • out_chan (int, optional) – Number of bins in the estimated masks. If None, out_chan = in_chan.
  • n_blocks (int, optional) – Number of convolutional blocks in each repeat. Defaults to 8.
  • n_repeats (int, optional) – Number of repeats. Defaults to 3.
  • bn_chan (int, optional) – Number of channels after the bottleneck.
  • hid_chan (int, optional) – Number of channels in the convolutional blocks.
  • skip_chan (int, optional) – Number of channels in the skip connections. If 0 or None, TDConvNet won’t have any skip connections and the masks will be computed from the residual output. Corresponds to the ConvTasnet architecture in v1 or the paper.
  • conv_kernel_size (int, optional) – Kernel size in convolutional blocks.
  • norm_type (str, optional) – To choose from 'BN', 'gLN', 'cLN'.
  • mask_act (str, optional) – Which non-linear function to generate mask.

References

[1] : “Conv-TasNet: Surpassing ideal time-frequency magnitude masking for speech separation” TASLP 2019 Yi Luo, Nima Mesgarani https://arxiv.org/abs/1809.07454

asteroid.masknn.blocks.DPRNN(*args, **kwargs)[source]
Dual-path RNN Network for Single-Channel Source Separation
introduced in [1].
Parameters:
  • in_chan (int) – Number of input filters.
  • n_src (int) – Number of masks to estimate.
  • out_chan (int or None) – Number of bins in the estimated masks. Defaults to in_chan.
  • bn_chan (int) – Number of channels after the bottleneck. Defaults to 128.
  • hid_size (int) – Number of neurons in the RNNs cell state. Defaults to 128.
  • chunk_size (int) – window size of overlap and add processing. Defaults to 100.
  • hop_size (int or None) – hop size (stride) of overlap and add processing. Default to chunk_size // 2 (50% overlap).
  • n_repeats (int) – Number of repeats. Defaults to 6.
  • norm_type (str, optional) –

    Type of normalization to use. To choose from

    • 'gLN': global Layernorm
    • 'cLN': channelwise Layernorm
  • mask_act (str, optional) – Which non-linear function to generate mask.
  • bidirectional (bool, optional) – True for bidirectional Inter-Chunk RNN (Intra-Chunk is always bidirectional).
  • rnn_type (str, optional) – Type of RNN used. Choose between 'RNN', 'LSTM' and 'GRU'.
  • num_layers (int, optional) – Number of layers in each RNN.
  • dropout (float, optional) – Dropout ratio, must be in [0,1].

References

[1] “Dual-path RNN: efficient long sequence modeling for
time-domain single-channel speech separation”, Yi Luo, Zhuo Chen and Takuya Yoshioka. https://arxiv.org/abs/1910.06379

Layers

asteroid.masknn.blocks.Conv1DBlock(*args, **kwargs)[source]

One dimensional convolutional block, as proposed in [1].

Parameters:
  • in_chan (int) – Number of input channels.
  • hid_chan (int) – Number of hidden channels in the depth-wise convolution.
  • skip_out_chan (int) – Number of channels in the skip convolution. If 0 or None, Conv1DBlock won’t have any skip connections. Corresponds to the the block in v1 or the paper. The forward return res instead of [res, skip] in this case.
  • kernel_size (int) – Size of the depth-wise convolutional kernel.
  • padding (int) – Padding of the depth-wise convolution.
  • dilation (int) – Dilation of the depth-wise convolution.
  • norm_type (str, optional) –

    Type of normalization to use. To choose from

    • 'gLN': global Layernorm
    • 'cLN': channelwise Layernorm
    • 'cgLN': cumulative global Layernorm

References

[1] : “Conv-TasNet: Surpassing ideal time-frequency magnitude masking for speech separation” TASLP 2019 Yi Luo, Nima Mesgarani https://arxiv.org/abs/1809.07454

asteroid.masknn.blocks.SingleRNN(*args, **kwargs)[source]

Module for a RNN block.

Inspired from https://github.com/yluo42/TAC/blob/master/utility/models.py Licensed under CC BY-NC-SA 3.0 US.

Parameters:
  • rnn_type (str) – Select from 'RNN', 'LSTM', 'GRU'. Can also be passed in lowercase letters.
  • input_size (int) – Dimension of the input feature. The input should have shape [batch, seq_len, input_size].
  • hidden_size (int) – Dimension of the hidden state.
  • n_layers (int, optional) – Number of layers used in RNN. Default is 1.
  • dropout (float, optional) – Dropout ratio. Default is 0.
  • bidirectional (bool, optional) – Whether the RNN layers are bidirectional. Default is False.
asteroid.masknn.blocks.DPRNNBlock(*args, **kwargs)[source]

Dual-Path RNN Block as proposed in [1].

Parameters:
  • in_chan (int) – Number of input channels.
  • hid_size (int) – Number of hidden neurons in the RNNs.
  • norm_type (str, optional) – Type of normalization to use. To choose from - 'gLN': global Layernorm - 'cLN': channelwise Layernorm
  • bidirectional (bool, optional) – True for bidirectional Inter-Chunk RNN.
  • rnn_type (str, optional) – Type of RNN used. Choose from 'RNN', 'LSTM' and 'GRU'.
  • num_layers (int, optional) – Number of layers used in each RNN.
  • dropout (float, optional) – Dropout ratio. Must be in [0, 1].

References

[1] “Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation”, Yi Luo, Zhuo Chen and Takuya Yoshioka. https://arxiv.org/abs/1910.06379

Normalization layers

asteroid.masknn.norms.GlobLN(*args, **kwargs)[source]

Global Layer Normalization (globLN).

asteroid.masknn.norms.ChanLN(*args, **kwargs)[source]

Channel-wise Layer Normalization (chanLN).

asteroid.masknn.norms.CumLN(*args, **kwargs)[source]

Cumulative Global layer normalization(cumLN).

asteroid.masknn.norms.get(identifier)[source]

Returns a norm class from a string. Returns its input if it is callable (already a _LayerNorm for example).

Parameters:identifier (str or Callable or None) – the norm identifier.
Returns:_LayerNorm or None

Activation layers

asteroid.masknn.activations.get(identifier)[source]

Returns an activation function from a string. Returns its input if it is callable (already an activation for example).

Parameters:identifier (str or Callable or None) – the activation identifier.
Returns:nn.Module or None
asteroid.masknn.activations.linear()[source]
asteroid.masknn.activations.relu()[source]
asteroid.masknn.activations.prelu()[source]
asteroid.masknn.activations.leaky_relu()[source]
asteroid.masknn.activations.sigmoid()[source]
asteroid.masknn.activations.tanh()[source]
asteroid.masknn.activations.gelu()[source]
asteroid.masknn.activations.swish()[source]
Read the Docs v: v0.3.4
Versions
latest
stable
v0.3.4
v0.3.3
v0.3.2
v0.3.1
Downloads
On Read the Docs
Project Home
Builds

Free document hosting provided by Read the Docs.