MixIT AI, unveiled by Google, is a technique that recovers the individual sound sources from single-channel audio in which multiple sources are mixed. It can be viewed as a blind source separation task, and unlike existing technologies, it achieves excellent performance while being unsupervised (!).
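To make the "unsupervised" part concrete, here is a minimal sketch of the idea behind mixture invariant training, the objective MixIT is named after: two existing mixtures are added together, the model separates the sum into several estimated sources, and the loss asks only that the estimates can be regrouped to rebuild the two original mixtures. No isolated reference sources are needed. The names `mixit_loss` and `est_sources` are hypothetical, the brute-force search over assignments is chosen for readability, and plain mean squared error is swapped in for the SNR-based loss the paper reportedly uses, so treat this as an illustration under those assumptions rather than Google's implementation.

```python
import itertools

import numpy as np


def mixit_loss(mix1, mix2, est_sources):
    """Sketch of a mixture-invariant loss (MSE variant, an assumption).

    mix1, mix2:   reference mixtures, each of shape (T,)
    est_sources:  model outputs for the input mix1 + mix2, shape (M, T)

    Returns the lowest loss over all ways of assigning each estimated
    source to exactly one of the two reference mixtures, together with
    that assignment as a tuple of 0/1 labels.
    """
    refs = np.stack([mix1, mix2])  # shape (2, T)
    num_sources = est_sources.shape[0]
    best_loss, best_assign = np.inf, None

    # Enumerate all 2**M assignments of sources to the two mixtures.
    for assign in itertools.product((0, 1), repeat=num_sources):
        mask = np.array(assign)
        # Sum the sources assigned to each mixture (an empty group sums to zeros).
        remix = np.stack(
            [est_sources[mask == k].sum(axis=0) for k in (0, 1)]
        )
        loss = np.mean((refs - remix) ** 2)
        if loss < best_loss:
            best_loss, best_assign = loss, assign
    return best_loss, best_assign


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sources = rng.standard_normal((4, 100))  # stand-in for model outputs
    m1 = sources[0] + sources[2]
    m2 = sources[1] + sources[3]
    print(mixit_loss(m1, m2, sources))
```

In real training the estimated sources would come from a separation network fed with `mix1 + mix2`, and the minimum loss would be backpropagated through that network; the exhaustive loop is practical only because the number of output sources is small.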
The recent trend is clear. Unsupervised, self-supervised, semi-supervised: the names differ slightly, but in the end the goal is the same, namely to match supervised performance by using unlabeled data at large scale rather than relying on a medium-sized labeled dataset, and then to reach SOTA either by adding more data or by using very little labeled data.
Google's MixIT AI isolates speakers in audio recordings
In a new paper, researchers at Google describe an AI system that isolates speakers' voices in audio recordings using an unsupervised approach.