SPRAAK
mdt_am.c File Reference

Missing data techniques based acoustic modelling.

Functions

SprAcmod * spr_am_mdt_read (const char *hmm_fname, const char *mvg_fname, const char *sel_fname)
 

Variables

const char *const str_mdf_method []
 
const char *const str_mdt_mask []
 
const SprCmdOptDesc spr_cwr_od_am_mdt []
 
const SprCwrSRMDesc spr_cwr_srm_desc_am_mdt
 

Detailed Description

Missing data techniques based acoustic modelling.

Evaluation of Gaussians with MDT

Maximum Likelihood based channel estimation method in the MDT-based approach.

Expected input format: [mean(noisy_spec),noisy_spec,mask,delta1,mask_delta1,...,VAD,SPKR_ID(optional)]

Options when loading an MDT setup with cluster gaussians:

[AM_options]
cg=<String:val>
[required]
The set of cluster gaussians to load.
exp=<String:val>
The exponent file for the cluster gaussians.
map=<String:val>
[required]
The file connecting the back-end gaussians to the cluster gaussians (via shortlists).
mat=<String:val>
[required]
Matrix to transform the imputed features and the non-mdt features to back-end features.
feedback=<String:val>
Give feedback to the preprocessing, i.e. to the [feedback] module with the given name.
Nspec=<int:val>
[required]
Number of spectral coefficients used for the prospect features.
Ncep=<int:val>
[required]
Number of cepstral coefficients used for the prospect features (the count includes cep0).
Ndelta=<int:val>(2)
Total number of delta prospect streams (the first stream contains the static features). Additional features outside the prospect static and delta features are allowed (e.g. delta cepstra features). Such features simply bypass the MDT imputation.
MDTiter=<int:val>(2)
Maximum number of iterations for the MDT imputation.
MDTprune=<String:val>
Pruning thresholds on the (partial) Mahalanobis distances. The order for MDTiter=2, Ndelta=2 is as follows: [non-prospect feature elements, prospect-statics-iter0, prospect-statics-iter1, sum(non-prospect,static), prospect-delta1-iter0, prospect-delta1-iter1, sum(non-prospect,static,data), prospect-delta2-iter0, prospect-delta2-iter1]. By default, no pruning is applied.
MDTmask=<BINARY/FUZZY>(BINARY)
Mask method for the static features (the delta features always require a ternary mask).
MDTdmin=<F32:val>(0.0)
Lower bound on distances (the imputation may otherwise produce unreasonably good matches, i.e. distances that are too small).
lambda=<F32:val>(0.0)
Compensate for the scaling of the precision matrix during the evaluation of prospects in the fuzzy mask framework.
MDFmethod=<NONE/PROS_GRAD>(NONE)
Specify a channel estimation method.
MDFnfr=<int:val>(1000)
Minimum number of non-silence frames that must have been recognized before the channel estimate is made.
MDFnoreset
Do not reset the channel estimates when starting with a new file.
MDFbackoff
Update the globally best cluster gaussian if the state in the Viterbi path has no firing cluster gaussian.
MDFprior
Update with the best cluster gaussian as found in the best back-end gaussian's shortlist as target (by default, the cluster gaussian that provided the best matching observation for the best back-end gaussian is used).
MDFmulti_spkr
Cope with multiple speakers, i.e. have a channel estimate per speaker.
CGlim_lst=<int:val>
Further limit the length of the shortlists connecting the back-end gaussians to the cluster gaussians (see the 'map' option).
CGsel_fac=<F32:val>(.2)
Retain only the top <sel_fac>*100 percent best cluster gaussians, thus limiting the number of candidates for feeding the back-end gaussians.
CGpp_frac=<F32:val>(.95)
Further limit the number of retained cluster gaussians based on the cumulative posterior probabilities, i.e. make the smallest selection that covers <pp_frac>*100 percent of the total probability mass.
CGalpha=<F32:val>(.4)
Raise the likelihood of the cluster gaussians to the power <alpha>, limiting its dynamic range. Works in combination with the 'CGpp_frac' option.
CGbeta=<F32:val>(1.0)
Raise the prior probability of the cluster gaussians to the power <beta>.
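
The ordering of the MDTprune thresholds documented above for two imputation iterations and Ndelta=2 can be indexed programmatically. The following is a minimal sketch in C, assuming that the documented pattern generalises to other settings: one slot for the non-prospect elements, then one slot per imputation iteration for each prospect stream, with a running-sum slot between consecutive streams. The helper names mdt_prune_count, mdt_prune_index and mdt_prune_sum_index are hypothetical and not part of SPRAAK.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a SPRAAK function): number of MDTprune thresholds
 * for `mdt_iter` imputation iterations and `n_delta` delta streams, assuming
 * the documented two-iteration, Ndelta=2 layout generalises. */
size_t mdt_prune_count(int mdt_iter, int n_delta)
{
    assert(mdt_iter >= 1 && n_delta >= 0);
    /* 1 non-prospect slot + one slot per (stream, iteration) pair
     * + one running-sum slot between consecutive streams. */
    return 1u + (size_t)(n_delta + 1) * (size_t)mdt_iter + (size_t)n_delta;
}

/* Index of the threshold for prospect stream `stream` (0 = statics,
 * 1 = delta1, ...) at imputation iteration `iter` (0-based). */
size_t mdt_prune_index(int mdt_iter, int stream, int iter)
{
    assert(mdt_iter >= 1 && stream >= 0 && iter >= 0 && iter < mdt_iter);
    /* Each earlier stream occupies mdt_iter slots plus one running-sum slot. */
    return 1u + (size_t)stream * (size_t)(mdt_iter + 1) + (size_t)iter;
}

/* Index of the running-sum threshold that follows stream `stream`.
 * The documented layout has no sum slot after the last stream. */
size_t mdt_prune_sum_index(int mdt_iter, int n_delta, int stream)
{
    assert(mdt_iter >= 1 && stream >= 0 && stream < n_delta);
    return 1u + (size_t)stream * (size_t)(mdt_iter + 1) + (size_t)mdt_iter;
}
```

Under these assumptions, the documented case needs nine thresholds, and the delta2 threshold for the second iteration sits at index 8.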
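
The interplay of MDFnfr and MDFnoreset described above can be pictured as a simple frame-count gate. The sketch below is illustrative only: the type MdfGate and its functions are hypothetical, and it assumes that resetting the channel estimate at the start of a new file also restarts the count of non-silence frames, which the option descriptions do not state explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical gate (not a SPRAAK type) deciding when enough non-silence
 * frames have been seen to make a channel estimate. */
typedef struct {
    int  nfr_min;   /* value of MDFnfr */
    bool no_reset;  /* true when MDFnoreset is given */
    int  n_seen;    /* non-silence frames counted so far, capped at nfr_min */
} MdfGate;

void mdf_gate_init(MdfGate *g, int nfr_min, bool no_reset)
{
    assert(g != NULL && nfr_min >= 0);
    g->nfr_min  = nfr_min;
    g->no_reset = no_reset;
    g->n_seen   = 0;
}

/* Called when a new file starts: the count is kept only with MDFnoreset
 * (assumption: the count is reset together with the estimate). */
void mdf_gate_new_file(MdfGate *g)
{
    assert(g != NULL);
    if (!g->no_reset)
        g->n_seen = 0;
}

/* Called once per recognized frame; returns true once at least nfr_min
 * non-silence frames have been counted. */
bool mdf_gate_frame(MdfGate *g, bool non_silence)
{
    assert(g != NULL);
    if (non_silence && g->n_seen < g->nfr_min)
        g->n_seen++;   /* capped, so the counter cannot overflow */
    return g->n_seen >= g->nfr_min;
}
```

With MDFmulti_spkr, one such gate per speaker would be a natural extension, again as an assumption rather than a description of the actual code.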
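
The two-stage limitation described for CGsel_fac and CGpp_frac can be sketched as follows. This is a minimal illustration, not the SPRAAK implementation: the function cg_select and the type CgCand are hypothetical, the fraction of retained candidates is assumed to be rounded up, and the input weights are assumed to be non-negative and already scaled by the caller (for example likelihood^alpha * prior^beta, following CGalpha and CGbeta).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical candidate record: a cluster gaussian and its weight. */
typedef struct { double weight; int index; } CgCand;

/* qsort comparator: larger weights first. */
static int cg_cmp_desc(const void *a, const void *b)
{
    double wa = ((const CgCand *)a)->weight;
    double wb = ((const CgCand *)b)->weight;
    return (wa < wb) - (wa > wb);
}

/* Select cluster gaussians from n non-negative weights.
 * First cap the selection at the top sel_fac*n candidates (rounded up),
 * then stop early once pp_frac of the total weight mass is covered.
 * The selected indices are written to sel[], best first.
 * Returns the number of selected candidates, or -1 if allocation fails. */
int cg_select(const double *weight, int n, double sel_fac, double pp_frac,
              int *sel)
{
    assert(weight != NULL && sel != NULL && n > 0);
    assert(sel_fac > 0.0 && sel_fac <= 1.0 && pp_frac > 0.0 && pp_frac <= 1.0);

    CgCand *cand = malloc((size_t)n * sizeof *cand);
    if (cand == NULL)
        return -1;
    for (int i = 0; i < n; i++) {
        cand[i].weight = weight[i];
        cand[i].index  = i;
    }
    qsort(cand, (size_t)n, sizeof *cand, cg_cmp_desc);

    /* Total mass over all candidates, summed in sorted order. */
    double total = 0.0;
    for (int i = 0; i < n; i++)
        total += cand[i].weight;

    /* Upper limit from sel_fac, rounded up to keep at least one candidate. */
    int n_top = (int)(sel_fac * n);
    if ((double)n_top < sel_fac * n)
        n_top++;

    /* Smallest prefix covering pp_frac of the mass, capped at n_top. */
    double cum = 0.0;
    int kept = 0;
    while (kept < n_top) {
        sel[kept] = cand[kept].index;
        cum += cand[kept].weight;
        kept++;
        if (cum >= pp_frac * total)
            break;
    }
    free(cand);
    return kept;
}
```

For weights {0.1, 0.6, 0.05, 0.25}, sel_fac=1.0 and pp_frac=0.8, the sketch keeps candidates 1 and 3, since together they cover 85 percent of the mass.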
Author
Maarten Van Segbroeck, Kris Demuynck
Date
10 March 2011
Revision History:
10/03/11 - KD
imported from a branch of HMM75
15/11/07 - MVS
fuzzy masks (HMM75)
20/06/06 - MVS
maximum Likelihood based channel estimation method in the MDT-based approach (HMM75)
03/12/05 - KD, HVh, MVS
modifications for MDT (HMM75)