SPRAAK
Example on the Aurora 4 database

This page explains the example code in SPRAAK/examples/exp_aur4/. The examples in the previous section use the Speecon data, which cannot be included with this distribution. Aurora-4 can be purchased at a lower cost, so most users should be able to reproduce this example. Aurora-4 is derived from the WSJ0 corpus by adding noise. The new data is provided with Aurora-4, but the language model and lexicon are not, so the WSJ0 database is required as well.

Running the Aurora-4 experiments requires the following steps:

cd examples/exp_wsj
MAKE_RESOURCES # adjust the paths in the MAKE_RESOURCES script to point to your copy of the WSJ0 DBase
cd ../exp_aur4
MAKE_RESOURCES # adjust the paths in the MAKE_RESOURCES script to point to your copy of the Aurora-4 DBase
RUN_EXPERIMENTS_MDT

The MAKE_RESOURCES scripts convert the index files to corpora, download the CMU dictionary and extract the words required for training and testing, convert the language model to the SPRAAK binary format, and so on.
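
The command sequence above can be wrapped in a small script that stops at the first failing step, so that the experiments are not started on incomplete resources. This is a minimal sketch, not part of the SPRAAK distribution: the helper names run_step and run_all are hypothetical, and it assumes that MAKE_RESOURCES and RUN_EXPERIMENTS_MDT are executable files located in the example directories.

```shell
#!/bin/sh
# Sketch only: run the three steps in order and stop at the first failure.
# Assumption: MAKE_RESOURCES and RUN_EXPERIMENTS_MDT are executable files
# located in the example directories, so they are invoked with a "./" prefix.

# run_step DIR SCRIPT: run SCRIPT from inside DIR (hypothetical helper).
run_step() {
    step_dir=$1
    step_script=$2
    if [ ! -d "$step_dir" ]; then
        echo "missing directory: $step_dir" >&2
        return 1
    fi
    # The subshell keeps the caller's working directory unchanged.
    if ! ( cd "$step_dir" && "./$step_script" ); then
        echo "step failed: $step_dir/$step_script" >&2
        return 1
    fi
}

# run_all ROOT: ROOT is the top of the SPRAAK tree (hypothetical helper).
run_all() {
    spraak_root=$1
    run_step "$spraak_root/examples/exp_wsj"  MAKE_RESOURCES &&
    run_step "$spraak_root/examples/exp_aur4" MAKE_RESOURCES &&
    run_step "$spraak_root/examples/exp_aur4" RUN_EXPERIMENTS_MDT
}

# Example call (edit the paths inside both MAKE_RESOURCES scripts first):
#   run_all /path/to/SPRAAK
```

The paths inside the two MAKE_RESOURCES scripts still have to be adjusted by hand before calling run_all; the wrapper only enforces the order of the steps and reports which one failed.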

The exp_aur4/RUN_EXPERIMENTS_MDT script performs the following steps:

  1. Create an initial segmentation file using a small acoustic model included in the SPRAAK distribution.
  2. Train a standard acoustic model on the 'clean' (noise-free) Aurora-4 train data. This model uses VAD to estimate the channel (meannorm) on speech frames only. This matches the MDT setup, whose channel estimate also ignores silence frames.
  3. Evaluate the standard (non-MDT) model. Begin/end point detection is added as a simple way to gain some noise robustness. This result serves as the baseline.
  4. Create all resources needed for the MDT-setup:
    • the stream exponents and the prospect transformation matrix
    • the VQ codebooks for the mask estimation
    • the cluster Gaussians
    • the association table (Gaussian short-lists)
  5. Evaluate the MDT-setup in two configurations: with static masks only, and with both static and delta masks.
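
The speech-only channel estimate mentioned in step 2 can be written down as follows. This is one common reading of mean normalisation restricted to speech frames, given here as an assumption and not as a description of SPRAAK's actual implementation (which may, for example, use a running estimate). The symbols are introduced for illustration only: x_t is the feature vector of frame t, S is the set of frames the VAD labels as speech, and the estimated mean is subtracted from every frame of the utterance.

```latex
\hat{\mu} = \frac{1}{|S|} \sum_{s \in S} x_s ,
\qquad
\tilde{x}_t = x_t - \hat{\mu} \quad \text{for all frames } t
```

Because silence frames are excluded from S, the estimate does not depend on how much leading or trailing silence an utterance contains.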

Some practical notes:

Results:

Robust features:

[1]
Kris Demuynck, Xueru Zhang, Dirk Van Compernolle and Hugo Van hamme. Feature versus Model Based Noise Robustness. In Proc. INTERSPEECH, pages 721–724, Makuhari, Japan, September 2010.