SPRAAK
Introduction

The Wall Street Journal (WSJ) test suite has been one of the most widely used speech recognition benchmarks since the early 1990s. This demo suite contains a configuration for training a state-of-the-art HMM-based system for the WSJ benchmark. On the popular November 92 non-verbalized punctuation test set, an error rate of less than 8% is obtained with a trigram language model.

Who should read this tutorial?

This tutorial is not written for first-time users of SPRAAK and/or speech recognition. We advise first-time users to go through the TIMIT tutorials first, as these introduce the individual components and concepts of the SPRAAK package step by step. This tutorial assumes an understanding of the concepts underlying a large vocabulary speech recognition system. Experienced speech recognition researchers may choose to start here and consult the general manual pages or selected pages from the TIMIT tutorials when needed.

What will you learn?

This tutorial shows all the steps in creating and evaluating a large vocabulary system, including:
- getting all the required resources in place and in the right format
- training an acoustic model
- preparing a language model
- evaluating and fine-tuning the final system

External Prerequisites
Included in this tutorial are
About The Wall Street Journal Benchmark

The WSJ benchmark was established in the early 1990s as a means of driving development and establishing a uniform evaluation methodology for large vocabulary speech recognition. The benchmark has mainly been used to evaluate progress in acoustic modeling. While it is a popular and useful benchmark, one should also understand its limitations. Because the data consists of read speech in high quality recordings, neither the results obtained on it nor a system trained only on this data are representative of real-life applications.
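
Results on this kind of benchmark are usually reported as a word error rate: the number of word substitutions, deletions and insertions needed to turn the recognizer output into the reference transcript, divided by the number of reference words. The text above only says "error rate", so reading it as a word error rate is an assumption here. The sketch below is a minimal illustration of that metric in Python. It is not SPRAAK's scoring code, and the function name and the example sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper only (hypothetical name, not part of SPRAAK).
    Words are obtained by splitting on whitespace; no text normalization
    is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions align i reference words with an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of five reference words.
    print(word_error_rate("stocks rose sharply on monday",
                          "stocks rose sharp on monday"))  # prints 0.2
```

Because insertions count as errors, this ratio can exceed 1.0 when the hypothesis is much longer than the reference. Official scoring tools typically also normalize the text before alignment, which this sketch does not attempt.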