asteroid.data.wham_dataset module
class asteroid.data.wham_dataset.WhamDataset(json_dir, task, sample_rate=8000, segment=4.0, nondefault_nsrc=None, normalize_audio=False)
Bases: sphinx.ext.autodoc.importer._MockObject
Dataset class for WHAM source separation and speech enhancement tasks.
Parameters:
- json_dir (str) – The path to the directory containing the json files.
- task (str) – One of 'enh_single', 'enh_both', 'sep_clean' or 'sep_noisy'.
  - 'enh_single' for single speaker speech enhancement.
  - 'enh_both' for multi speaker speech enhancement.
  - 'sep_clean' for two-speaker clean source separation.
  - 'sep_noisy' for two-speaker noisy source separation.
- sample_rate (int, optional) – The sampling rate of the wav files.
- segment (float, optional) – Length of the segments used for training, in seconds. If None, full utterances are used (e.g. for testing).
- nondefault_nsrc (int, optional) – Number of sources in the training targets. If None, defaults to one for enhancement tasks and two for separation tasks.
- normalize_audio (bool, optional) – If True, both the sources and the mixture are normalized by the standard deviation of the mixture.
References
- “WHAM!: Extending Speech Separation to Noisy Environments”, Wichern et al. 2019
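The parameter semantics described above can be illustrated with a small standalone sketch. It does not import Asteroid and is not the library's implementation. The helper names (`default_n_src`, `segment_length_in_samples`, `normalize_by_mixture_std`), the `eps` guard, and the use of the population standard deviation are assumptions made for illustration only. The actual class may, for example, also center the signals or use a different estimator.

```python
import statistics

# Task names as listed in the Parameters section.
ENHANCEMENT_TASKS = {"enh_single", "enh_both"}
SEPARATION_TASKS = {"sep_clean", "sep_noisy"}


def default_n_src(task, nondefault_nsrc=None):
    """Hypothetical helper: number of sources in the training targets.

    Mirrors the documented rule: one source for enhancement tasks and
    two for separation tasks, unless nondefault_nsrc overrides it.
    """
    if task not in ENHANCEMENT_TASKS | SEPARATION_TASKS:
        raise ValueError(f"Unknown task: {task!r}")
    if nondefault_nsrc is not None:
        return nondefault_nsrc
    return 1 if task in ENHANCEMENT_TASKS else 2


def segment_length_in_samples(segment, sample_rate=8000):
    """Hypothetical helper: convert a segment length in seconds to samples.

    Returns None when segment is None, meaning "use full utterances".
    """
    if segment is None:
        return None
    return int(segment * sample_rate)


def normalize_by_mixture_std(mixture, sources, eps=1e-8):
    """Hypothetical helper: scale mixture and sources by the mixture's std.

    Assumption: population standard deviation plus a small eps to avoid
    division by zero; the real implementation may differ.
    """
    scale = statistics.pstdev(mixture) + eps
    scaled_mixture = [x / scale for x in mixture]
    scaled_sources = [[x / scale for x in src] for src in sources]
    return scaled_mixture, scaled_sources


if __name__ == "__main__":
    print(default_n_src("sep_clean"))
    print(segment_length_in_samples(4.0, sample_rate=8000))
    mix, srcs = normalize_by_mixture_std([2.0, -2.0, 2.0, -2.0],
                                         [[1.0, -1.0, 1.0, -1.0]])
    print(mix, srcs)
```

Because the same scale factor is applied to the mixture and to every source, the relative levels between them are preserved. This is the practical effect of normalizing everything by the mixture's standard deviation rather than normalizing each signal independently.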