Construction of Large Scale Isolated Word Speech Corpus in Bangla

Authors

  • Md. Farukuzzaman Khan

Keywords:

Bangla, speech corpora, BdNC01, vocabulary, isolated word, speech recognition

Abstract

A new speech corpus of isolated words in the Bangla language has been recorded, comprising high-frequency words drawn from the text corpus BdNC01. It has been specifically designed for research activities related to speaker-independent Bangla speech recognition. The database consists of the speech of 100 speakers, each speaking 1081 words. Another 50 new speakers were recruited to speak the complete word list to construct a test database. Every utterance was repeated 5 times on different days to account for variation in speaker characteristics over time. The total of 400 hours of recordings makes the corpus the largest of its type and size in its language domain. This paper describes the motivation for the corpus and the processes undertaken in its construction. The paper concludes with a discussion of the usability of the corpus.

How to Cite

Md. Farukuzzaman Khan. (2018). Construction of Large Scale Isolated Word Speech Corpus in Bangla. Global Journal of Computer Science and Technology, 18(G2), 21–26. Retrieved from https://computerresearch.org/index.php/computer/article/view/1690

Published

2018-05-15