asteroid.data.avspeech_dataset module
class asteroid.data.avspeech_dataset.Signal(video_path: Union[str, pathlib.Path], audio_path: Union[str, pathlib.Path], embed_dir: Union[str, pathlib.Path], sr=16000, video_start_length=0, fps=25, signal_len=3)

    Bases: object

    This class holds the video frames and the audio signal of one sample.
    Parameters:
        - video_path (Union[str, pathlib.Path]) – path to the video file.
        - audio_path (Union[str, pathlib.Path]) – path to the audio file.
        - embed_dir (Union[str, pathlib.Path]) – directory holding the precomputed embeddings.
        - sr (int) – audio sample rate in Hz. Defaults to 16000.
        - video_start_length (int) – index of the first video part to load. Defaults to 0.
        - fps (int) – video frame rate. Defaults to 25.
        - signal_len (int) – length of one part, in seconds. Defaults to 3.

    Note:
        Each video consists of multiple parts, each containing fps * signal_len frames.
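The note above implies a simple relation between the video and audio bookkeeping. A minimal sketch, using the defaults from the `Signal` signature (`frames_per_part` and `samples_per_part` are hypothetical names for illustration):

```python
# Part bookkeeping implied by the note above, with Signal's default values.
sr = 16000       # audio sample rate (Hz), from the signature
fps = 25         # video frame rate, from the signature
signal_len = 3   # length of one part in seconds, from the signature

frames_per_part = fps * signal_len    # video frames in one part
samples_per_part = sr * signal_len    # audio samples covering the same part
print(frames_per_part, samples_per_part)  # → 75 48000
```

So each 3-second part pairs 75 video frames with 48000 audio samples.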
class asteroid.data.avspeech_dataset.AVSpeechDataset(input_df_path: Union[str, pathlib.Path], embed_dir: Union[str, pathlib.Path], n_src=2)

    Audio-visual speech separation dataset as described in [1].
    Parameters:
        - input_df_path (Union[str, pathlib.Path]) – path to the input dataframe listing the samples.
        - embed_dir (Union[str, pathlib.Path]) – directory holding the precomputed video embeddings.
        - n_src (int) – number of speakers (sources) in the mixture. Defaults to 2.

    References:
        [1] Ephrat et al., "Looking to Listen at the Cocktail Party: A Speaker-Independent Audio-Visual Model for Speech Separation", https://arxiv.org/abs/1804.03619
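To illustrate the interface such a dataset exposes, here is a self-contained toy sketch, assuming each item yields an audio mixture, its `n_src` clean sources, and one face embedding per source (the class name `ToyAVSpeechDataset` and the placeholder shapes are hypothetical, not Asteroid's actual implementation):

```python
# Toy stand-in for an audio-visual separation dataset: each item pairs a
# mixture with n_src clean sources and n_src video (face) embeddings.
class ToyAVSpeechDataset:
    def __init__(self, n_items, n_src=2):
        self.n_items = n_items
        self.n_src = n_src  # number of speakers in each mixture

    def __len__(self):
        return self.n_items

    def __getitem__(self, idx):
        # Placeholder tensors; the real dataset would load audio from disk
        # and read precomputed embeddings from embed_dir.
        mixture = [0.0] * 8
        sources = [[0.0] * 8 for _ in range(self.n_src)]
        embeddings = [[0.0] * 4 for _ in range(self.n_src)]
        return mixture, sources, embeddings

ds = ToyAVSpeechDataset(n_items=10)
mix, srcs, embs = ds[0]
print(len(ds), len(srcs), len(embs))  # → 10 2 2
```

With `n_src=2` (the default), every item carries two target sources and two matching embeddings, which is the pairing the separation model in [1] consumes.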