Noisy speech database for training speech enhancement algorithms and TTS models
Date Available
2017-08-21
Type
sound
Data Creator
Valentini-Botinhao, Cassia
Publisher
University of Edinburgh. School of Informatics. Centre for Speech Technology Research (CSTR)
Relation (Is Referenced By)
http://www.research.ed.ac.uk/portal/en/publications/speech-enhancement-for-a-noiserobust-texttospeech-synthesis-system-using-deep-recurrent-neural-networks(08deb6fd-79c0-490f-ae46-f37034b6bfb4).html
Citation
Valentini-Botinhao, Cassia. (2017). Noisy speech database for training speech enhancement algorithms and TTS models, 2016 [sound]. University of Edinburgh. School of Informatics. Centre for Speech Technology Research (CSTR). https://doi.org/10.7488/ds/2117.
Description
Clean and noisy parallel speech database. The database was designed to train and test speech enhancement methods that operate at 48 kHz. A more detailed description can be found in the papers associated with the database. For the 28-speaker dataset, details can be found in: C. Valentini-Botinhao, X. Wang, S. Takaki & J. Yamagishi, "Speech Enhancement for a Noise-Robust Text-to-Speech Synthesis System using Deep Recurrent Neural Networks", in Proc. Interspeech 2016. For the 56-speaker dataset: C. Valentini-Botinhao, X. Wang, S. Takaki & J. Yamagishi, "Investigating RNN-based speech enhancement methods for noise-robust Text-to-Speech", in Proc. SSW 2016. Some of the noises used to create the noisy speech were obtained from the DEMAND database, available here: http://parole.loria.fr/DEMAND/. The clean speech was obtained from the CSTR VCTK Corpus, available here: https://doi.org/10.7488/ds/1994. The speech-shaped and babble noise files that were used to create this dataset are available here: http://homepages.inf.ed.ac.uk/cvbotinh/se/noises/.
The following licence files are associated with this item:
Related items
Showing items related by title, author, creator and subject.
- Acted clear speech corpus
  Mayo, Catherine (LISTA Consortium: (i) Language and Speech Laboratory, Universidad del Pais Vasco, Spain and Ikerbasque, Spain; (ii) Centre for Speech Technology Research, University of Edinburgh, UK; (iii) KTH Royal Institute of Technology, Sweden; (iv) Institute of Computer Science, FORTH, Greece, 2013-09-24)
  Single male native British English talker recorded producing 25 TIMIT sentences in 5 conditions, two natural: (i) quiet, (ii) while the talker listened to high-intensity speech-shaped noise, and three acted: (i) as if to ...
- Manual and automatic labels for version 1.0 of UXTD, UXSSD, and UPX core data -- version 1.0
  Eshky, Aciel; Ribeiro, Manuel Sam; Cleland, Joanne; Renals, Steve; Richmond, Korin; Roxburgh, Zoe; Scobbie, James; Wrench, Alan
  UltraSuite is a repository of ultrasound and acoustic data from child speech therapy sessions. The current release includes three data collections, one from typically developing children (UXTD) and two from children with ...
- UltraSuite Repository - sample data
  Eshky, Aciel; Ribeiro, Manuel Sam; Cleland, Joanne; Renals, Steve; Richmond, Korin; Roxburgh, Zoe; Scobbie, James; Wrench, Alan
  UltraSuite is a repository of ultrasound and acoustic data from child speech therapy sessions. The current release includes three data collections, one from typically developing children -- Ultrax Typically Developing ...