Noisy speech database for training speech enhancement algorithms and TTS models

Date Available

2017-08-21

Type

sound

Data Creator

Valentini-Botinhao, Cassia

Publisher

University of Edinburgh. School of Informatics. Centre for Speech Technology Research (CSTR)

Relation (Is Referenced By)

http://www.research.ed.ac.uk/portal/en/publications/speech-enhancement-for-a-noiserobust-texttospeech-synthesis-system-using-deep-recurrent-neural-networks(08deb6fd-79c0-490f-ae46-f37034b6bfb4).html

Citation

Valentini-Botinhao, Cassia. (2017). Noisy speech database for training speech enhancement algorithms and TTS models, 2016 [sound]. University of Edinburgh. School of Informatics. Centre for Speech Technology Research (CSTR). https://doi.org/10.7488/ds/2117.

Description

Clean and noisy parallel speech database. The database was designed to train and test speech enhancement methods that operate at 48 kHz. A more detailed description can be found in the papers associated with the database.

For the 28-speaker dataset, details can be found in: C. Valentini-Botinhao, X. Wang, S. Takaki & J. Yamagishi, "Speech Enhancement for a Noise-Robust Text-to-Speech Synthesis System using Deep Recurrent Neural Networks", In Proc. Interspeech 2016.

For the 56-speaker dataset, details can be found in: C. Valentini-Botinhao, X. Wang, S. Takaki & J. Yamagishi, "Investigating RNN-based speech enhancement methods for noise-robust Text-to-Speech", In Proc. SSW 2016.

Some of the noises used to create the noisy speech were obtained from the DEMAND database, available here: http://parole.loria.fr/DEMAND/. The clean speech was obtained from the CSTR VCTK Corpus, available here: https://doi.org/10.7488/ds/1994. The speech-shaped and babble noise files that were used to create this dataset are available here: http://homepages.inf.ed.ac.uk/cvbotinh/se/noises/.
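Because the database is parallel, a typical first step is to pair each clean recording with its noisy counterpart and confirm the sampling rate. The following is a minimal sketch, not part of the dataset itself. It assumes that the clean and noisy files are uncompressed WAV files stored in two separate directories and that parallel files share the same filename; the directory arguments and the function names `pair_parallel_files`, `sample_rate` and `check_pairs` are hypothetical and chosen only for illustration.

```python
import wave
from pathlib import Path

# The description states that the database targets 48 kHz.
EXPECTED_RATE = 48000


def pair_parallel_files(clean_dir, noisy_dir):
    """Return sorted (clean_path, noisy_path) pairs that share a filename.

    Assumption: parallel files carry identical names in both directories.
    Files present in only one directory are silently skipped.
    """
    clean = {p.name: p for p in Path(clean_dir).glob("*.wav")}
    noisy = {p.name: p for p in Path(noisy_dir).glob("*.wav")}
    common = sorted(clean.keys() & noisy.keys())
    return [(clean[name], noisy[name]) for name in common]


def sample_rate(path):
    """Read the sampling rate from a WAV header using the standard library."""
    with wave.open(str(path), "rb") as wav:
        return wav.getframerate()


def check_pairs(pairs, expected_rate=EXPECTED_RATE):
    """Return the pairs in which at least one file is not at expected_rate."""
    return [
        (clean, noisy)
        for clean, noisy in pairs
        if sample_rate(clean) != expected_rate
        or sample_rate(noisy) != expected_rate
    ]
```

Calling `check_pairs(pair_parallel_files(clean_dir, noisy_dir))` on two local directories returns an empty list when every paired file is at 48 kHz. Matching on filename keeps the sketch independent of any particular directory layout, which should be checked against the downloaded archives.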