Development of Assamese Speech Corpus and Automatic Transcription Using HTK

Himangshu Sarma, Navanath Saharia & Utpal Sharma

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 264)

Abstract

The exact pronunciation of words in a language cannot be recovered from its written form. Phonetic transcription is a step towards speech processing for a language. For a language like Assamese it is especially important, because the language is spoken differently in different regions of the state. In this paper we report automatic transcription of Assamese speech using the Hidden Markov Model Toolkit (HTK). We obtain an accuracy of 65.26% in an experiment. We transcribed the recorded speech files using both IPA symbols and ASCII for automatic transcription, with 34 phones for IPA transcription and 38 for ASCII transcription.
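The abstract does not say how the accuracy figure was computed. As a rough illustration only, the sketch below scores a hypothesised phone sequence against a reference in the style commonly used for recognizer evaluation: align the two sequences by minimum edit distance, count substitutions (S), deletions (D) and insertions (I), and report 100 × (N − S − D − I) / N, where N is the number of reference phones. The function names and the ASCII phone labels are made up for this example, and the unit edit costs are a simplifying assumption that may differ from the alignment costs used by HTK's own scoring tool.

```python
def align_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) for one minimum-edit
    alignment of ref against hyp, using unit costs (an assumption)."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    # cost[i][j] = (total_errors, subs, dels, ins) for ref[:i] vs hyp[:j]
    cost = [[None] * cols for _ in range(rows)]
    cost[0][0] = (0, 0, 0, 0)
    for i in range(1, rows):
        cost[i][0] = (i, 0, i, 0)  # only deletions possible
    for j in range(1, cols):
        cost[0][j] = (j, 0, 0, j)  # only insertions possible
    for i in range(1, rows):
        for j in range(1, cols):
            t, s, d, n = cost[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (t, s, d, n)          # match, no error added
            else:
                diag = (t + 1, s + 1, d, n)  # substitution
            t, s, d, n = cost[i - 1][j]
            dele = (t + 1, s, d + 1, n)      # reference phone missing in hyp
            t, s, d, n = cost[i][j - 1]
            ins = (t + 1, s, d, n + 1)       # extra phone in hyp
            cost[i][j] = min(diag, dele, ins, key=lambda c: c[0])
    _, subs, dels, inss = cost[-1][-1]
    return subs, dels, inss


def phone_accuracy(ref, hyp):
    """Percent accuracy = 100 * (N - S - D - I) / N, N = len(ref)."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    subs, dels, inss = align_counts(ref, hyp)
    return 100.0 * (len(ref) - subs - dels - inss) / len(ref)


# Hypothetical phone labels, not taken from the paper's phone set.
reference = ["k", "o", "l", "o", "m"]
hypothesis = ["k", "o", "l", "m"]
print(phone_accuracy(reference, hypothesis))
```

Because insertions are subtracted as well, this measure can fall below zero when the hypothesis contains many spurious phones, which is why it is usually reported alongside the raw S, D and I counts.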

Author information

Authors and Affiliations

Tezpur University, Napaam, Assam, India, 784028
Himangshu Sarma, Navanath Saharia & Utpal Sharma

Corresponding author

Correspondence to Himangshu Sarma.

Editor information

Editors and Affiliations

Technopark Campus, Indian Institute of Information Technology and Management – Kerala (IIITM-K), Trivandrum, Kerala, India
Sabu M. Thampi
Center for Computing Research, National Polytechnic Institute, Mexico City, Mexico
Alexander Gelbukh
Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur, India
Jayanta Mukhopadhyay

About this paper

Cite this paper

Sarma, H., Saharia, N., Sharma, U. (2014). Development of Assamese Speech Corpus and Automatic Transcription Using HTK. In: Thampi, S., Gelbukh, A., Mukhopadhyay, J. (eds) Advances in Signal Processing and Intelligent Recognition Systems. Advances in Intelligent Systems and Computing, vol 264. Springer, Cham. https://doi.org/10.1007/978-3-319-04960-1_11

DOI: https://doi.org/10.1007/978-3-319-04960-1_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-04959-5
Online ISBN: 978-3-319-04960-1
eBook Packages: Engineering, Engineering (R0)
