asteroid.masknn.tac module
class asteroid.masknn.tac.TAC(input_dim, hidden_dim=384, activation='prelu', norm_type='gLN')
Bases: torch.nn.Module
Transform-Average-Concatenate inter-microphone-channel permutation invariant communication block [1].
Parameters:
- input_dim (int) – Number of features of the input representation.
- hidden_dim (int, optional) – Size of the hidden layers in the TAC operations. Defaults to 384.
- activation (str, optional) – Type of activation used. See asteroid.masknn.activations. Defaults to 'prelu'.
- norm_type (str, optional) – Type of normalization layer used. See asteroid.masknn.norms. Defaults to 'gLN'.
Note
Supports inputs of shape \((batch, mic\_channels, features, chunk\_size, n\_chunks)\) as in FasNet-TAC. The operations are applied to each element in chunk_size and n_chunks. The output has the same shape as the input.
References
- [1] Luo, Yi, et al. “End-to-end microphone permutation and number invariant multi-channel speech separation.” ICASSP 2020.
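The transform, average and concatenate steps described above can be illustrated with a minimal PyTorch sketch. This is not asteroid's implementation: the class name TinyTAC is hypothetical, the normalization layer is omitted, the activation is fixed to PReLU, and every example is assumed to use all of its microphone channels. It only shows why the block keeps the input shape and why reordering the microphones reorders the output in the same way.

```python
import torch
from torch import nn


class TinyTAC(nn.Module):
    """Hypothetical, simplified TAC-style block (no normalization layer)."""

    def __init__(self, input_dim, hidden_dim=16):
        super().__init__()
        # Transform: shared across microphones, applied to each channel independently.
        self.transform = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.PReLU())
        # Average: non-linear transform of the mean over microphones.
        self.average = nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.PReLU())
        # Concatenate: project [channel features, averaged features] back to input_dim.
        self.concat = nn.Sequential(nn.Linear(2 * hidden_dim, input_dim), nn.PReLU())

    def forward(self, x):
        # x: (batch, mic_channels, features, chunk_size, n_chunks)
        h = x.permute(0, 3, 4, 1, 2)  # (batch, chunk_size, n_chunks, mics, features)
        h = self.transform(h)  # (batch, chunk_size, n_chunks, mics, hidden)
        mean = self.average(h.mean(dim=3, keepdim=True))  # (..., 1, hidden)
        h = torch.cat([h, mean.expand_as(h)], dim=-1)  # (..., mics, 2 * hidden)
        out = self.concat(h).permute(0, 3, 4, 1, 2)  # back to the input layout
        return x + out  # residual connection


torch.manual_seed(0)
block = TinyTAC(input_dim=8)
x = torch.randn(2, 4, 8, 5, 3)  # batch=2, mics=4, features=8, chunk_size=5, n_chunks=3
y = block(x)

# Reordering the microphones reorders the output in the same way.
perm = torch.tensor([2, 0, 3, 1])
y_perm = block(x[:, perm])
```

Because the only interaction between channels is a mean, which does not depend on channel order, the sketch works for any number of microphones without changing its parameters.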
forward(x, valid_mics=None)
Parameters:
- x (torch.Tensor) – Input multi-channel DPRNN features. Shape: \((batch, mic\_channels, features, chunk\_size, n\_chunks)\).
- valid_mics (torch.LongTensor) – Tensor containing the effective number of microphones for each example in the batch. Batches can be composed of examples coming from arrays with different numbers of microphones, in which case the mic_channels dimension is padded. E.g. torch.tensor([4, 3]) means the first example has 4 channels and the second has 3. Shape: \((batch)\).
Returns: output (torch.Tensor) – Features for each mic_channel after TAC inter-channel processing. Shape: \((batch, mic\_channels, features, chunk\_size, n\_chunks)\).
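To make the role of valid_mics concrete, the sketch below averages over the microphone dimension while ignoring padded channels. It is an illustration only, not the library's code: the helper name masked_mic_mean is hypothetical, the features are reduced to shape (batch, mic_channels, hidden) for brevity, and the valid channels are assumed to come first, with padding at the end of the mic_channels dimension.

```python
import torch


def masked_mic_mean(feats, valid_mics):
    """Mean over microphones, counting only the first valid_mics[b] channels.

    feats: (batch, mic_channels, hidden) float tensor (hypothetical layout).
    valid_mics: (batch,) LongTensor with the number of real channels per example.
    """
    n_mics = feats.shape[1]
    # mask[b, m] is True when channel m of example b is a real microphone.
    mask = torch.arange(n_mics, device=feats.device).unsqueeze(0) < valid_mics.unsqueeze(1)
    mask = mask.unsqueeze(-1).to(feats.dtype)  # (batch, mic_channels, 1)
    counts = valid_mics.view(-1, 1).to(feats.dtype)  # (batch, 1)
    return (feats * mask).sum(dim=1) / counts


feats = torch.ones(2, 4, 3)
feats[1, 3] = 100.0  # padded (invalid) channel of the second example
valid_mics = torch.tensor([4, 3])
mean = masked_mic_mean(feats, valid_mics)  # the padded channel does not affect the mean
```

Without the mask, the padded channel of the second example would be included in its average, which is why the number of effective microphones has to be passed per example.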