HKUST/MTS: A Very Large Scale Mandarin Telephone Speech Corpus

Yi Liu, Pascale Fung, Yongsheng Yang, Christopher Cieri, Shudong Huang & David Graff

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4274)

Included in the following conference series:

International Symposium on Chinese Spoken Language Processing

Abstract

The paper describes the design, collection, transcription and analysis of the HKUST Mandarin Telephone Speech Corpus (HKUST/MTS): 200 hours of speech from over 2100 Mandarin speakers in mainland China, collected under the DARPA EARS framework. The corpus includes speech data, transcriptions and speaker demographic information. The speech data comprise 1206 ten-minute natural Mandarin conversations between either strangers or friends. Each conversation focuses on a single topic. All calls are recorded over public telephone networks and manually annotated with standard Chinese characters (GBK) as well as specific mark-ups for spontaneous speech. A file with speaker demographic information is also provided. The corpus is the largest and first of its kind for Mandarin conversational telephone speech, providing abundant and diversified samples for Mandarin speech recognition and other application-dependent tasks such as topic detection, information retrieval, keyword spotting and speaker recognition. In a 2004 evaluation by NIST, the corpus was found to improve system performance significantly.
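
The figures quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is illustrative only: it takes the conversation count, nominal call length and speaker lower bound from the abstract, and it assumes (the abstract does not say so explicitly) that every call has exactly two sides. Under those assumptions, 1206 ten-minute calls amount to 201 nominal hours, which agrees with the stated 200 hours. The function and constant names are hypothetical.

```python
# Figures quoted in the abstract.
NUM_CONVERSATIONS = 1206
MINUTES_PER_CONVERSATION = 10   # nominal length; actual calls may vary
MIN_SPEAKERS = 2100             # "over 2100" speakers, so a lower bound

# Assumption (not stated in the abstract): each call has exactly two sides.
SIDES_PER_CONVERSATION = 2


def nominal_hours(conversations: int, minutes_each: int) -> float:
    """Total nominal audio duration in hours."""
    return conversations * minutes_each / 60


def max_sides_per_speaker(conversations: int, sides_each: int, min_speakers: int) -> float:
    """Upper bound on the average number of call sides per speaker.

    Dividing by a lower bound on the speaker count gives an upper bound
    on the average.
    """
    return conversations * sides_each / min_speakers


total_hours = nominal_hours(NUM_CONVERSATIONS, MINUTES_PER_CONVERSATION)
avg_sides = max_sides_per_speaker(
    NUM_CONVERSATIONS, SIDES_PER_CONVERSATION, MIN_SPEAKERS
)

print(f"nominal duration: {total_hours:.1f} hours")
print(f"average call sides per speaker: at most {avg_sides:.2f}")
```

If the two-sides assumption holds, the second figure suggests that most speakers took part in a single call, with only a minority recorded more than once.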

HKUST Mandarin Telephone Transcripts, Part 1. Linguistic Data Consortium (LDC), catalog number LDC2005T32, ISBN 1-58563-352-6. http://www.ldc.upenn.edu/

Author information

Authors and Affiliations

Human Language Technology Center, Department of Electronic and Computer Engineering, University of Science and Technology, Hong Kong
Yi Liu, Pascale Fung & Yongsheng Yang
Linguistic Data Consortium, University of Pennsylvania, U.S.A.
Christopher Cieri, Shudong Huang & David Graff

Editor information

Editors and Affiliations

Department of Computer Science, The University of Hong Kong, Hong Kong
Qiang Huo
Human Language Technology Department, Institute for Infocomm Research (I2R), 119613, Singapore
Bin Ma
School of Computer Engineering, Nanyang Technological University (NTU), 639798, Singapore
Eng-Siong Chng
Institute for Infocomm Research, 21 Heng Mui Keng Terrace, 119613, Singapore
Haizhou Li

About this paper

Cite this paper

Liu, Y., Fung, P., Yang, Y., Cieri, C., Huang, S., Graff, D. (2006). HKUST/MTS: A Very Large Scale Mandarin Telephone Speech Corpus. In: Huo, Q., Ma, B., Chng, ES., Li, H. (eds) Chinese Spoken Language Processing. ISCSLP 2006. Lecture Notes in Computer Science, vol 4274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11939993_73

DOI: https://doi.org/10.1007/11939993_73
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49665-6
Online ISBN: 978-3-540-49666-3
eBook Packages: Computer Science, Computer Science (R0)
