This offline webpage organizes our audio denoising results. We first show comparison results on the AudioSet dataset, the DEMAND dataset, and real-world audio recordings. For the real-world recordings, we also include scenarios in which audiovisual denoising would fail because the video footage lacks frontal faces, as well as scenarios in which multiple people speak. Both are common in daily life. We then present denoising results on other languages, all produced by our model trained solely on English. Lastly, we show two clips from the song "The Sound of Silence", as an echo of our title, to further demonstrate our model's ability to reduce non-stationary noise such as music.
Here we show denoising results on synthetic input signals. The input signals are generated using audio clips from AVSPEECH as foreground speech and clips from AudioSet as background noise. At seven different input SNRs (from -10 dB to 10 dB), we compare the denoising results of our model with those of other methods.
Please note that the Ours-GTSI method uses ground-truth silent interval labels. It is by no means a practical approach. Rather, it is meant to show the best possible (upper-bound) denoising quality when the silent intervals are perfectly predicted.
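For readers who want to build comparable synthetic inputs, mixing a speech clip with a noise clip at a prescribed input SNR can be sketched as follows. This is a minimal illustration and not necessarily the exact procedure used to produce the samples on this page: the function names `mix_at_snr` and `measured_snr_db` are hypothetical, the signal power is assumed to be averaged over the whole clip, and the noise is assumed to be looped when it is shorter than the speech.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the input SNR equals `snr_db`.

    Returns the noisy mixture and the scaled noise that was added.
    Assumption: power is the mean square over the entire clip.
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Loop the noise if it is too short, then trim it to the speech length.
    if len(noise) < len(speech):
        repeats = int(np.ceil(len(speech) / len(noise)))
        noise = np.tile(noise, repeats)
    noise = noise[: len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("speech and noise must both have non-zero power")

    # SNR_dB = 10 * log10(P_speech / (gain^2 * P_noise)); solve for gain.
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    scaled_noise = gain * noise
    return speech + scaled_noise, scaled_noise


def measured_snr_db(speech, noise):
    """SNR in dB between a clean signal and the noise added to it."""
    return 10 * np.log10(np.mean(speech ** 2) / np.mean(noise ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(16000) / 16000.0
    speech = np.sin(2 * np.pi * 220 * t)   # stand-in for a speech clip
    noise = rng.standard_normal(4000)      # stand-in for a noise clip
    for snr_db in (-10, 0, 10):
        noisy, scaled = mix_at_snr(speech, noise, snr_db)
        print(snr_db, measured_snr_db(speech, scaled))
```

Lower SNR values yield a larger gain on the noise, so the -10 dB mixtures are the hardest inputs in the comparison and the 10 dB mixtures the easiest.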
Results | Audio waveform |
---|---|
Noisy Sample | |
Clean Sample | |
Ours-GTSI | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres | |
VSE |
Results | Audio waveform |
---|---|
Noisy Sample | |
Clean Sample | |
Ours-GTSI | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres | |
VSE |
Results | Audio waveform |
---|---|
Noisy Sample | |
Clean Sample | |
Ours-GTSI | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres | |
VSE |
Results | Audio waveform |
---|---|
Noisy Sample | |
Clean Sample | |
Ours-GTSI | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres | |
VSE |
Results | Audio waveform |
---|---|
Noisy Sample | |
Clean Sample | |
Ours-GTSI | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres | |
VSE |
Results | Audio waveform |
---|---|
Noisy Sample | |
Clean Sample | |
Ours-GTSI | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres | |
VSE |
Results | Audio waveform |
---|---|
Noisy Sample | |
Clean Sample | |
Ours-GTSI | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres | |
VSE |
Here we provide comparison results similar to those in the previous section. Instead of using AudioSet as the source of background noise, we use DEMAND, another dataset used in previous denoising work. All other settings are the same as in the previous section.
Results | Audio waveform |
---|---|
Noisy Sample | |
Clean Sample | |
Ours-GTSI | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres | |
VSE |
Results | Audio waveform |
---|---|
Noisy Sample | |
Clean Sample | |
Ours-GTSI | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres | |
VSE |
Results | Audio waveform |
---|---|
Noisy Sample | |
Clean Sample | |
Ours-GTSI | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres | |
VSE |
Results | Audio waveform |
---|---|
Noisy Sample | |
Clean Sample | |
Ours-GTSI | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres | |
VSE |
Results | Audio waveform |
---|---|
Noisy Sample | |
Clean Sample | |
Ours-GTSI | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres | |
VSE |
Results | Audio waveform |
---|---|
Noisy Sample | |
Clean Sample | |
Ours-GTSI | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres | |
VSE |
Results | Audio waveform |
---|---|
Noisy Sample | |
Clean Sample | |
Ours-GTSI | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres | |
VSE |
Here we provide denoising results on real-world recordings, in comparison with other methods. Each example is a video recording of a single person in frontal view. Only the (mono-channel) audio signal of each recording is provided to the denoising methods, except for VSE, which requires audiovisual input (i.e., the video footage is also provided as input).
Results | Audio waveform |
---|---|
Noisy Sample | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres | |
VSE |
Results | Audio waveform |
---|---|
Noisy Sample | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres | |
VSE |
Results | Audio waveform |
---|---|
Noisy Sample | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres | |
VSE |
Results | Audio waveform |
---|---|
Noisy Sample | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres | |
VSE |
Results | Audio waveform |
---|---|
Noisy Sample | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres | |
VSE |
Here we provide additional comparison results on real-world examples. Results of VSE (the audiovisual method) are excluded, since it cannot successfully extract mouth information from these examples.
Results | Audio waveform |
---|---|
Noisy Sample | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres |
Results | Audio waveform |
---|---|
Noisy Sample | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres |
Results | Audio waveform |
---|---|
Noisy Sample | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres |
Results | Audio waveform |
---|---|
Noisy Sample | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres |
Results | Audio waveform |
---|---|
Noisy Sample | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres |
Results | Audio waveform |
---|---|
Noisy Sample | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres |
Results | Audio waveform |
---|---|
Noisy Sample | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres |
Results | Audio waveform |
---|---|
Noisy Sample | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres |
Here we provide additional comparison results on real-world examples in which more than one person is talking. The denoised result should preserve the voice of every speaker.
Results | Audio waveform |
---|---|
Noisy Sample | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres |
Results | Audio waveform |
---|---|
Noisy Sample | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres |
Results | Audio waveform |
---|---|
Noisy Sample | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres |
Results | Audio waveform |
---|---|
Noisy Sample | |
Ours | |
Spectral Gating | |
Adobe Audition | |
DFL | |
SEGAN | |
Baseline-thres |
Here we provide our denoising results on audio examples in other languages. Our silent interval detection results are also provided.
Results | Audio waveform |
---|---|
Noisy Sample | |
Denoised Result | |
Silent Intervals |
Results | Audio waveform |
---|---|
Noisy Sample | |
Denoised Result | |
Silent Intervals |
Results | Audio waveform |
---|---|
Noisy Sample | |
Denoised Result | |
Silent Intervals |
Results | Audio waveform |
---|---|
Noisy Sample | |
Denoised Result | |
Silent Intervals |
Results | Audio waveform |
---|---|
Noisy Sample | |
Denoised Result | |
Silent Intervals |
Results | Audio waveform |
---|---|
Noisy Sample | |
Denoised Result | |
Silent Intervals |
Results | Audio waveform |
---|---|
Noisy Sample | |
Denoised Result | |
Silent Intervals |
Lastly, listen to "The Sound of Silence" as denoised by our method, which listens to the sounds of silence!
Results | Audio waveform |
---|---|
Song Excerpt | |
Denoised Result | |
Silent Intervals |
Results | Audio waveform |
---|---|
Song Excerpt | |
Denoised Result | |
Silent Intervals |