asteroid.masknn.tac module

class asteroid.masknn.tac.TAC(input_dim, hidden_dim=384, activation='prelu', norm_type='gLN')

Bases: torch.nn.Module

Transform-Average-Concatenate inter-microphone-channel permutation invariant communication block [1].

Parameters:
  • input_dim (int) – Number of features in the input representation.
  • hidden_dim (int, optional) – Size of the hidden layers in the TAC operations.
  • activation (str, optional) – Type of activation used. See asteroid.masknn.activations.
  • norm_type (str, optional) – Type of normalization layer used. See asteroid.masknn.norms.

Note

Supports inputs of shape \((batch, mic\_channels, features, chunk\_size, n\_chunks)\) as in FasNet-TAC. The operations are applied independently at each position along the chunk_size and n_chunks dimensions. The output has the same shape as the input.
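The shape contract in the note can be illustrated with a small stand-alone PyTorch sketch. This is not asteroid's implementation: the class name TinyTAC, the layer layout, the omitted normalization, and the residual connection are all assumptions made for illustration. It only shows the three steps named in the block's title (transform each channel, average across channels, concatenate the average back onto each channel) applied at every chunk_size and n_chunks position.

```python
import torch
from torch import nn


class TinyTAC(nn.Module):
    """Hypothetical, simplified transform-average-concatenate block.

    Illustration only: no normalization layer, fixed PReLU activations.
    """

    def __init__(self, input_dim, hidden_dim=16):
        super().__init__()
        self.transform = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.PReLU())
        self.average = nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.PReLU())
        self.concat = nn.Sequential(nn.Linear(2 * hidden_dim, input_dim), nn.PReLU())

    def forward(self, x):
        # x: (batch, mic_channels, features, chunk_size, n_chunks)
        mics = x.shape[1]
        # Put features last so each Linear acts per (chunk_size, n_chunks) position.
        h = x.permute(0, 3, 4, 1, 2)  # (batch, chunk_size, n_chunks, mics, features)
        h = self.transform(h)  # transform each channel separately
        mean = self.average(h.mean(dim=-2, keepdim=True))  # average over channels
        mean = mean.expand(-1, -1, -1, mics, -1)
        out = self.concat(torch.cat([h, mean], dim=-1))  # concatenate and project back
        out = out.permute(0, 3, 4, 1, 2)  # restore the input layout
        return x + out  # residual connection (an assumption of this sketch)


x = torch.randn(2, 4, 8, 5, 3)  # 2 examples, 4 mics, 8 features
y = TinyTAC(input_dim=8)(x)
print(y.shape == x.shape)
```

Because the only cross-channel operation is a mean, reordering the microphones in the input reorders the output channels in the same way, which is the permutation property the block is named for.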

References
[1] Luo, Yi, et al. “End-to-end microphone permutation and number invariant multi-channel speech separation.” ICASSP 2020.

forward(x, valid_mics=None)
Parameters:
  • x (torch.Tensor) – Input multi-channel DPRNN features. Shape: \((batch, mic\_channels, features, chunk\_size, n\_chunks)\).
  • valid_mics (torch.LongTensor) – Tensor containing the effective number of microphones for each example in the batch. A batch can contain examples from arrays with different numbers of microphones, in which case the mic_channels dimension is padded. E.g. torch.tensor([4, 3]) means the first example has 4 channels and the second has 3. Shape: \((batch)\).
Returns:
  • output (torch.Tensor) – Features for each mic_channel after TAC inter-channel processing. Shape: \((batch, mic\_channels, features, chunk\_size, n\_chunks)\).
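To make the role of valid_mics concrete, the following sketch shows one way to average over only the real (non-padded) microphone channels of each example. The helper name masked_mic_mean is hypothetical and this is not necessarily how asteroid computes the average internally; it only illustrates why the per-example microphone count is needed when the mic_channels dimension is padded.

```python
import torch


def masked_mic_mean(x, valid_mics):
    """Mean over the real microphones of each example (hypothetical helper).

    x: (batch, mic_channels, features, chunk_size, n_chunks), padded along mic_channels.
    valid_mics: (batch,) LongTensor with the number of real microphones per example.
    Returns a tensor of shape (batch, features, chunk_size, n_chunks).
    """
    batch, mics = x.shape[:2]
    mic_index = torch.arange(mics, device=x.device).unsqueeze(0)  # (1, mics)
    # 1.0 for real channels, 0.0 for padded ones.
    mask = (mic_index < valid_mics.unsqueeze(1)).to(x.dtype)  # (batch, mics)
    mask = mask.view(batch, mics, 1, 1, 1)
    total = (x * mask).sum(dim=1)  # padded channels contribute nothing
    return total / valid_mics.view(batch, 1, 1, 1).to(x.dtype)


x = torch.randn(2, 4, 3, 5, 2)
x[1, 3] = 1000.0  # pretend the 4th channel of the second example is padding
valid_mics = torch.tensor([4, 3])
mean = masked_mic_mean(x, valid_mics)
print(mean.shape)
```

Dividing by valid_mics rather than by the padded channel count keeps the padding from biasing the average for examples recorded with fewer microphones.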
