asteroid.models.lstm_tasnet module
class asteroid.models.lstm_tasnet.LSTMTasNet(n_src, out_chan=None, rnn_type='lstm', n_layers=4, hid_size=512, dropout=0.3, mask_act='sigmoid', bidirectional=True, in_chan=None, fb_name='free', n_filters=64, kernel_size=16, stride=8, encoder_activation=None, **fb_kwargs)
Bases: asteroid.models.base_models.BaseEncoderMaskerDecoder
TasNet separation model, as described in [1].
Parameters:
- n_src (int) – Number of masks to estimate.
- out_chan (int or None) – Number of bins in the estimated masks. Defaults to in_chan.
- hid_size (int) – Number of neurons in the RNNs cell state. Defaults to 512.
- mask_act (str, optional) – Non-linear function used to generate the masks.
- bidirectional (bool, optional) – True for bidirectional Inter-Chunk RNN (Intra-Chunk is always bidirectional).
- rnn_type (str, optional) – Type of RNN used. Choose between 'RNN', 'LSTM' and 'GRU'.
- n_layers (int, optional) – Number of layers in each RNN.
- dropout (float, optional) – Dropout ratio, must be in [0,1].
- in_chan (int, optional) – Number of input channels, should be equal to n_filters.
- fb_name (str, className) – Filterbank family from which to make the encoder and decoder. Choose among ['free', 'analytic_free', 'param_sinc', 'stft'].
- n_filters (int) – Number of filters / Input dimension of the masker net.
- kernel_size (int) – Length of the filters.
- stride (int, optional) – Stride of the convolution. If None, set to kernel_size // 2. Defaults to 8.
- **fb_kwargs (dict) – Additional kwargs to pass to the filterbank creation.
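To illustrate how kernel_size and stride relate, here is a minimal sketch in plain Python. It assumes the encoder behaves like an unpadded strided 1-D convolution, which may not match every filterbank or padding setting. The helpers resolve_stride and n_frames are hypothetical names used for illustration only; they are not part of asteroid.

```python
def resolve_stride(kernel_size, stride=None):
    """Fall back to kernel_size // 2 when no stride is given (hypothetical helper)."""
    return kernel_size // 2 if stride is None else stride


def n_frames(n_samples, kernel_size=16, stride=8):
    """Frame count of an unpadded strided 1-D convolution (assumed encoder behaviour)."""
    if n_samples < kernel_size:
        return 0  # the signal is too short to fit a single filter
    return (n_samples - kernel_size) // stride + 1


stride = resolve_stride(kernel_size=16)  # 16 // 2 = 8
frames = n_frames(16000, kernel_size=16, stride=stride)
print(stride, frames)  # 8 1999
```

Under these assumptions, the values from the signature (kernel_size=16, stride=8) give 50% overlap between consecutive frames, and the masker sees one n_filters-dimensional vector per frame.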
References
- [1] Yi Luo et al. “Real-time Single-channel Dereverberation and Separation with Time-domain Audio Separation Network”, Interspeech 2018.