VisualSpeechCode Facebook Denoiser: real-time speech enhancement

Wav2Lip: generate lip motion from voice

LipGAN is a technology that generates the shape of the lips of a face image using a voice signal, and when it is actually applied to a video, it was somewhat disappointing in terms of visual artifacts and the naturalness of movement. To improve this, the discriminator is not a single frame, but a plurality of consecutive…

SpeechCode Facebook Denoiser: real-time speech enhancement

FastSpeech2 Open Source

TensorflowTTS, an open source based on Tensorflow 2 that supports several latest TTS models such as Tacotron2, MelGan, FastSpeech, etc., has finally begun supporting Microsoft FastSpeech2. FastSpeech2 shows similar performance to Transformer series TTS, but takes more than twice the time to learn…