asteroid.dsp package
-
class asteroid.dsp.LambdaOverlapAdd(nnet, n_src, window_size, hop_size=None, window='hanning', reorder_chunks=True, enable_grad=False)
Bases: torch.nn.Module
Segment a signal, apply a function to each segment, and recombine the segments with overlap-add (OLA).
Parameters:
- nnet (callable) – Function to apply to each segment.
- n_src (int) – Number of sources in the output of nnet.
- window_size (int) – Size of the segmenting window.
- hop_size (int) – Segmentation hop size.
- window (str) – Name of the window (see scipy.signal.get_window).
- reorder_chunks (bool) – Whether to reorder each consecutive segment.
-
forward(x)
Forward module: segment the signal, apply the function, and recombine with OLA.
Parameters: x (torch.Tensor) – Waveform signal of shape (batch, 1, time).
Returns: torch.Tensor – The output of the lambda OLA.
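The segment, apply, and recombine steps described above can be sketched in plain Python. This is a minimal illustration on a single 1D list, not the library's implementation: the helper names `hann` and `overlap_add_apply`, the zero-padding scheme, and the normalization by the summed window are all assumptions made for this example.

```python
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def overlap_add_apply(signal, func, window_size, hop_size):
    """Segment `signal`, apply `func` to each segment, recombine with OLA.

    Hypothetical helper: assumes 0 < hop_size < window_size so that
    consecutive segments overlap.
    """
    if not 0 < hop_size < window_size:
        raise ValueError("expected 0 < hop_size < window_size")
    win = hann(window_size)
    # Zero-pad both ends so every original sample gets a non-zero total weight.
    pad = window_size - hop_size
    padded = [0.0] * pad + [float(v) for v in signal] + [0.0] * pad
    # Extend the right side so the last segment is complete.
    n_frames = max(0, math.ceil((len(padded) - window_size) / hop_size)) + 1
    total = (n_frames - 1) * hop_size + window_size
    padded += [0.0] * (total - len(padded))

    out = [0.0] * total
    norm = [0.0] * total
    for f in range(n_frames):
        start = f * hop_size
        processed = func(padded[start:start + window_size])
        for i, value in enumerate(processed):
            out[start + i] += win[i] * value  # windowed overlap-add
            norm[start + i] += win[i]         # accumulated window weight
    result = [o / w if w > 1e-8 else 0.0 for o, w in zip(out, norm)]
    # Remove the padding to recover the original length.
    return result[pad:pad + len(signal)]


x = [math.sin(0.3 * n) for n in range(21)]
y = overlap_add_apply(x, lambda seg: [2.0 * v for v in seg], window_size=8, hop_size=4)
```

Because the output is divided by the accumulated window weight, an identity `func` returns the input unchanged, which is a convenient sanity check for any OLA pipeline.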
-
class asteroid.dsp.DualPathProcessing(chunk_size, hop_size)
Bases: torch.nn.Module
Perform dual-path processing via overlap-add, as in DPRNN [1].
- Args:
chunk_size (int): Size of the segmenting window.
hop_size (int): Segmentation hop size.
References
- [1] “Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation”, Yi Luo, Zhuo Chen and Takuya Yoshioka. https://arxiv.org/abs/1910.06379
-
fold(x, output_size=None)
Folds back the spliced feature tensor.
Converts an input of shape (batch, channels, chunk_size, n_chunks) back to the original shape (batch, channels, time) using overlap-add.
Parameters:
- x (torch.Tensor) – Spliced feature tensor of shape (batch, channels, chunk_size, n_chunks).
- output_size (int, optional) – Sequence length of the original feature tensor. If None, the original length cached by the previous call to unfold is used.
Returns: x (torch.Tensor) – Feature tensor of shape (batch, channels, time).
Note
fold caches the original length of the pr
-
static inter_process(x, module)
Performs inter-chunk processing.
Parameters:
- x (torch.Tensor) – Spliced feature tensor of shape (batch, channels, chunk_size, n_chunks).
- module (torch.nn.Module) – Module to apply across the chunks of the spliced feature tensor.
Returns: x (torch.Tensor) – Processed spliced feature tensor of shape (batch, channels, chunk_size, n_chunks).
Note
The module should follow the channel-first convention and accept a 3D tensor of shape (batch, channels, time).
-
static intra_process(x, module)
Performs intra-chunk processing.
Parameters:
- x (torch.Tensor) – Spliced feature tensor of shape (batch, channels, chunk_size, n_chunks).
- module (torch.nn.Module) – Module to apply within each chunk of the spliced feature tensor.
Returns: x (torch.Tensor) – Processed spliced feature tensor of shape (batch, channels, chunk_size, n_chunks).
Note
The module should follow the channel-first convention and accept a 3D tensor of shape (batch, channels, time).
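The difference between intra-chunk and inter-chunk processing is the axis that plays the role of time. The sketch below illustrates this for a single batch item and a single channel, using a nested list `grid[i][k]` (sample `i` of chunk `k`) in place of a tensor and a plain function in place of a `torch.nn.Module`. The helper names `intra_process_2d` and `inter_process_2d` are hypothetical and only mirror the idea, not the library code.

```python
from itertools import accumulate


def intra_process_2d(grid, fn):
    """Apply `fn` within each chunk, i.e. along the chunk_size axis."""
    chunk_size, n_chunks = len(grid), len(grid[0])
    out = [[None] * n_chunks for _ in range(chunk_size)]
    for k in range(n_chunks):
        sequence = [grid[i][k] for i in range(chunk_size)]  # one chunk
        for i, value in enumerate(fn(sequence)):
            out[i][k] = value
    return out


def inter_process_2d(grid, fn):
    """Apply `fn` across chunks, i.e. along the n_chunks axis."""
    # Each row holds the same in-chunk position taken from every chunk.
    return [list(fn(row)) for row in grid]


def cumsum(sequence):
    """Running sum, used here to make the processing direction visible."""
    return list(accumulate(sequence))


grid = [[1, 2, 3],
        [4, 5, 6]]  # chunk_size = 2, n_chunks = 3
intra = intra_process_2d(grid, cumsum)  # sums run down each column
inter = inter_process_2d(grid, cumsum)  # sums run along each row
```

In a dual-path model the two passes are typically alternated, so that information first flows within each chunk and then between chunks.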
-
unfold(x)
Unfolds the feature tensor from (batch, channels, time) to (batch, channels, chunk_size, n_chunks).
Parameters: x (torch.Tensor) – Feature tensor of shape (batch, channels, time).
Returns: x (torch.Tensor) – Spliced feature tensor of shape (batch, channels, chunk_size, n_chunks).
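To make the shape change concrete, here is a minimal sketch of chunking and its overlap-add inverse for one batch item and one channel, again with plain lists. The functions `unfold_1d` and `fold_1d` are hypothetical; the zero-padding by `chunk_size` on each side and the averaging of overlapping samples are assumptions of this sketch, and `output_size` is passed explicitly rather than cached.

```python
def unfold_1d(seq, chunk_size, hop_size):
    """Split `seq` into overlapping chunks.

    Returns grid[i][k]: sample i of chunk k (chunk_size rows, n_chunks columns).
    Assumes 0 < hop_size <= chunk_size.
    """
    if not 0 < hop_size <= chunk_size:
        raise ValueError("expected 0 < hop_size <= chunk_size")
    padded = [0.0] * chunk_size + list(seq) + [0.0] * chunk_size
    n_chunks = (len(padded) - chunk_size) // hop_size + 1
    return [[padded[k * hop_size + i] for k in range(n_chunks)]
            for i in range(chunk_size)]


def fold_1d(grid, hop_size, output_size):
    """Overlap-add the chunks back into a sequence of length `output_size`."""
    chunk_size, n_chunks = len(grid), len(grid[0])
    total = (n_chunks - 1) * hop_size + chunk_size
    acc = [0.0] * total
    count = [0] * total
    for k in range(n_chunks):
        for i in range(chunk_size):
            acc[k * hop_size + i] += grid[i][k]
            count[k * hop_size + i] += 1
    # Average overlapping contributions, then strip the left padding.
    averaged = [a / c for a, c in zip(acc, count)]
    return averaged[chunk_size:chunk_size + output_size]


x = [float(n) for n in range(10)]
chunks = unfold_1d(x, chunk_size=4, hop_size=2)
restored = fold_1d(chunks, hop_size=2, output_size=len(x))
```

With unmodified chunks, folding recovers the original sequence, which is why the original length has to be known at fold time.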
-
asteroid.dsp.mixture_consistency(mixture, est_sources, src_weights=None, dim=1)
Applies mixture consistency to a tensor of estimated sources.
- Args
mixture (torch.Tensor): Mixture waveform or TF representation.
est_sources (torch.Tensor): Estimated source waveforms or TF representations.
src_weights (torch.Tensor): Consistency weight for each source. Shape needs to be broadcastable to est_sources. The weights are normalized to sum to 1 along dimension dim. If src_weights is None, they are computed from the relative power of each source.
dim (int): Axis which contains the sources in est_sources.
- Returns
- torch.Tensor with the same shape as est_sources, after applying mixture consistency.
- Notes
- This method can only be used in ‘complete’ separation tasks; otherwise the residual error will contain unwanted sources. For example, it won’t work with the sep_noisy task from WHAM.
- Examples
>>> import torch
>>> from asteroid.dsp import mixture_consistency
>>> # Works on waveforms
>>> mix = torch.randn(10, 16000)
>>> est_sources = torch.randn(10, 2, 16000)
>>> new_est_sources = mixture_consistency(mix, est_sources, dim=1)
>>> # Also works on spectrograms
>>> mix = torch.randn(10, 514, 400)
>>> est_sources = torch.randn(10, 2, 514, 400)
>>> new_est_sources = mixture_consistency(mix, est_sources, dim=1)
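The arithmetic behind mixture consistency can be sketched without tensors: the residual between the mixture and the sum of the estimates is redistributed to the sources according to weights that sum to 1. The function `mixture_consistency_1d` below is a hypothetical plain-Python helper for one waveform, written from the description above; the uniform fallback for all-zero estimates is an assumption of this sketch.

```python
def mixture_consistency_1d(mixture, est_sources, src_weights=None):
    """Redistribute the mixture residual across the estimated sources.

    mixture: list of samples; est_sources: list of n_src lists of samples.
    """
    n_src = len(est_sources)
    if src_weights is None:
        # Default: weights proportional to the power of each estimate.
        src_weights = [sum(v * v for v in src) / len(src) for src in est_sources]
    total = sum(src_weights)
    if total == 0:
        weights = [1.0 / n_src] * n_src  # assumed fallback: uniform weights
    else:
        weights = [w / total for w in src_weights]  # normalize to sum to 1
    # Residual: what the estimates fail to explain at each sample.
    residual = [m - sum(src[t] for src in est_sources)
                for t, m in enumerate(mixture)]
    return [[v + w * r for v, r in zip(src, residual)]
            for src, w in zip(est_sources, weights)]


mix = [1.0, 2.0, 3.0]
est = [[0.5, 1.0, 1.0], [0.2, 0.5, 1.0]]
new_est = mixture_consistency_1d(mix, est)
# The corrected estimates now sum to the mixture at every sample.
recombined = [sum(src[t] for src in new_est) for t in range(len(mix))]
```

Since the weights sum to 1, the corrected sources add up exactly to the mixture, which is also why the approach only suits tasks where every component of the mixture is estimated.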
- References
- Scott Wisdom, John R Hershey, Kevin Wilson, Jeremy Thorpe, Michael Chinen, Brian Patton, and Rif A Saurous. “Differentiable consistency constraints for improved deep speech enhancement”, ICASSP 2019.