arxiv:2007.06833

Sudo rm -rf: Efficient Networks for Universal Audio Source Separation

Published on Jul 14, 2020

Authors:

Abstract

Efficient convolutional neural network architecture for audio source separation that achieves high quality results with reduced computational requirements through successive downsampling and resampling of multi-resolution features.

AI-generated summary

In this paper, we present an efficient neural network for end-to-end general purpose audio source separation. Specifically, the backbone structure of this convolutional network is the SUccessive DOwnsampling and Resampling of Multi-Resolution Features (SuDoRMRF) as well as their aggregation which is performed through simple one-dimensional convolutions. In this way, we are able to obtain high quality audio source separation with limited number of floating point operations, memory requirements, number of parameters and latency. Our experiments on both speech and environmental sound separation datasets show that SuDoRMRF performs comparably and even surpasses various state-of-the-art approaches with significantly higher computational resource requirements.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2007.06833 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2007.06833 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2007.06833 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.