asteroid.models.dcunet module

class asteroid.models.dcunet.BaseDCUNet(architecture, stft_kernel_size=512, stft_stride=None, masknet_kwargs=None)

Base class for DCUNet and DCCRNet classes.

Parameters:
    stft_kernel_size (int) – STFT frame length to use.
    stft_stride (int, optional) – STFT hop length to use.
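To make the relationship between frame length, hop length and the (freq, time) shape of the encoded representation concrete, here is a minimal sketch of the usual arithmetic for a one-sided STFT without padding. The helper name `stft_shape` is hypothetical, and the fallback of half the frame length when no stride is given is an assumption for illustration; the library's actual filterbank may pad the signal or lay out real and imaginary parts differently.

```python
def stft_shape(n_samples, kernel_size=512, stride=None):
    """Return (n_freq, n_frames) for a one-sided STFT without padding."""
    if stride is None:
        # Assumption: the hop length falls back to half the frame length.
        stride = kernel_size // 2
    if n_samples < kernel_size:
        raise ValueError("signal is shorter than one STFT frame")
    # One-sided spectrum: DC up to and including the Nyquist bin.
    n_freq = kernel_size // 2 + 1
    # Count the frames that fit entirely inside the signal.
    n_frames = 1 + (n_samples - kernel_size) // stride
    return n_freq, n_frames


# One second of audio at 16 kHz with a 512-sample frame.
print(stft_shape(16000))                # (257, 61)
print(stft_shape(16000, stride=128))    # (257, 122)
```

A smaller stride gives more frames (finer time resolution) at a higher compute cost, while the frame length fixes the number of frequency bins.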

postprocess_encoded(tf_rep)

Hook to perform transformations on the encoded, time-frequency domain representation (output of the encoder) before encoder activation is applied.

Parameters:	tf_rep (Tensor of shape (batch, freq, time)) – Output of the encoder, before encoder activation is applied.
Returns:	Transformed tf_rep

postprocess_masked(masked_tf_rep)

Hook to perform transformations on the masked time-frequency domain representation (result of masking in the time-frequency domain) before decoding.

Parameters:	masked_tf_rep (Tensor of shape (batch, n_src, freq, time)) – Masked time-frequency representation, before decoding.
Returns:	Transformed masked_tf_rep
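The two hooks follow a template-method pattern: the base class fixes the order of the pipeline and subclasses override only the hook they need. The sketch below illustrates that order with plain Python lists instead of tensors. `HookedPipeline`, `ScaledPipeline` and all their internals are hypothetical stand-ins written for this illustration, not the library's implementation.

```python
class HookedPipeline:
    """Hypothetical stand-in for an encoder-masker-decoder model."""

    def encode(self, wav):
        # Placeholder encoder: pass the samples through unchanged.
        return list(wav)

    def postprocess_encoded(self, tf_rep):
        # Default hook: identity. Runs before the encoder activation.
        return tf_rep

    def encoder_activation(self, tf_rep):
        # Placeholder activation: identity.
        return tf_rep

    def estimate_masks(self, tf_rep):
        # Placeholder masker: a single all-ones mask (n_src = 1).
        return [[1.0] * len(tf_rep)]

    def postprocess_masked(self, masked_tf_rep):
        # Default hook: identity. Runs after masking, before decoding.
        return masked_tf_rep

    def decode(self, masked_tf_rep):
        # Placeholder decoder: return the masked representation as is.
        return masked_tf_rep

    def forward(self, wav):
        tf_rep = self.postprocess_encoded(self.encode(wav))
        tf_rep = self.encoder_activation(tf_rep)
        masks = self.estimate_masks(tf_rep)
        # Apply each source mask element-wise to the representation.
        masked = [[m * x for m, x in zip(mask, tf_rep)] for mask in masks]
        return self.decode(self.postprocess_masked(masked))


class ScaledPipeline(HookedPipeline):
    """Overrides both hooks to show where each one runs."""

    def postprocess_encoded(self, tf_rep):
        return [2.0 * x for x in tf_rep]

    def postprocess_masked(self, masked_tf_rep):
        return [[-x for x in src] for src in masked_tf_rep]


print(HookedPipeline().forward([1.0, 2.0]))   # [[1.0, 2.0]]
print(ScaledPipeline().forward([1.0, 2.0]))   # [[-2.0, -4.0]]
```

Keeping the defaults as identity functions means a subclass that needs no extra processing can ignore the hooks entirely.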

class asteroid.models.dcunet.DCUNet(architecture, stft_kernel_size=512, stft_stride=None, masknet_kwargs=None)

DCUNet as proposed in [1].

Parameters:
    architecture (str) – The architecture to use, any of “DCUNet-10”, “DCUNet-16”, “DCUNet-20”, “Large-DCUNet-20”.
    stft_kernel_size (int) – STFT frame length to use.
    stft_stride (int, optional) – STFT hop length to use.

References

[1] “Phase-aware Speech Enhancement with Deep Complex U-Net”, Hyeong-Seok Choi et al. https://arxiv.org/abs/1903.03107