Wav2Vec 2.0 Released: Build ASR with 10 Minutes of Speech
Facebook has released pre-trained models for wav2vec 2.0, which became a hot topic because, after representation learning on 53,000 hours of unlabeled audio, it produced a speech recognizer from only 10 minutes of labeled data. No fine-tuning in the representation model,...