Demo Cases

Below are examples of synthesized audio for the different scenarios and voice-cloning attack schemes described in the paper.

Watermark Fidelity

File Name | LJ001-0001.wav | LJ001-0002.wav | LJ001-0003.wav | LJ001-0004.wav | LJ001-0005.wav
Original Audio
wm-1
wm-2

Voice Clone

Fastspeech2_tuned_Hifi-GAN_wm-1

File Name | 1.wav | 2.wav | 3.wav | 4.wav | 5.wav
Pretrained
Watermarked

Fastspeech2_tuned_Hifi-GAN_wm-2

File Name | 1.wav | 2.wav | 3.wav | 4.wav | 5.wav
Pretrained
Watermarked

Fastspeech2_pre-trained_Hifi-GAN

File Name | 1.wav | 2.wav | 3.wav | 4.wav | 5.wav
Pretrained
wm-1
wm-2

Fastspeech2_Griffin-Lim

File Name | 1.wav | 2.wav | 3.wav | 4.wav | 5.wav
Pretrained
wm-1
wm-2

Tacotron2_tuned_Hifi-GAN_wm-1

File Name | 1.wav | 2.wav | 3.wav | 4.wav | 5.wav
Pretrained
Watermarked

Tacotron2_tuned_Hifi-GAN_wm-2

File Name | 1.wav | 2.wav | 3.wav | 4.wav | 5.wav
Pretrained
Watermarked

Tacotron2_pre-trained_Hifi-GAN

File Name | 1.wav | 2.wav | 3.wav | 4.wav | 5.wav
Pretrained
wm-1
wm-2

Tacotron2_Griffin-Lim

File Name | 1.wav | 2.wav | 3.wav | 4.wav | 5.wav
Pretrained
wm-1
wm-2

VITS

File Name | 1.wav | 2.wav | 3.wav | 4.wav | 5.wav
Pretrained
wm-1
wm-2
MP3 Compression 8 kbps
Low Pass Filtering 2 kHz
Harmful Combined
Resampling 16 kHz
MP3 Compression 64 kbps
Regular Combined
FSVC
RFDLM
FSVC Overwriting
RFDLM Overwriting
The Proposed Overwriting
The Proposed* Overwriting
Trained with Domain Loss
Mask Position 2
Mask Position 3

PaddleSpeech-English

File Name | 1.wav | 2.wav | 3.wav | 4.wav | 5.wav | 6.wav | 7.wav | 8.wav | 9.wav | 10.wav
p225
p226
p227
p228
p229
p230

PaddleSpeech-Chinese

File Name | 1.wav | 2.wav | 3.wav | 4.wav | 5.wav | 6.wav | 7.wav | 8.wav | 9.wav | 10.wav
D11
D12
D4
D6
D7
D8

Voice-Clone-App

File Name | 1.wav | 2.wav | 3.wav | 4.wav | 5.wav | 6.wav | 7.wav | 8.wav | 9.wav | 10.wav
p225
p226
p227
p228
p229
p230

so-vits-svc

Right Here Waiting
Converted

VAE Reconstruction

File Name | LJ001-0001.wav | LJ001-0002.wav | LJ001-0003.wav | LJ001-0004.wav | LJ001-0005.wav
Watermarked
MelVAE
VAE of AudioLDM