SPRAAK
Missing data techniques based acoustic modelling.
Functions | |
SprAcmod * | spr_am_mdt_read (const char *hmm_fname, const char *mvg_fname, const char *sel_fname) |
Variables | |
const char *const | str_mdf_method [] |
const char *const | str_mdt_mask [] |
const SprCmdOptDesc | spr_cwr_od_am_mdt [] |
const SprCwrSRMDesc | spr_cwr_srm_desc_am_mdt |
Evaluation of Gaussians with MDT
Maximum Likelihood based channel estimation method in the MDT-based approach.
Expected input format: [mean(noisy_spec),noisy_spec,mask,delta1,mask_delta1,...,VAD,SPKR_ID(optional)]
Options when loading an MDT setup with cluster gaussians:
[AM_options] | ||
---|---|---|
cg=<String:val> | [required] | |
The set of cluster gaussians to load. | ||
exp=<String:val> | ||
The exponent file for the cluster gaussians. | ||
map=<String:val> | [required] | |
The file connecting the back-end gaussians to the cluster gaussians (via shortlists). | ||
mat=<String:val> | [required] | |
Matrix to transform the imputed features and the non-mdt features to back-end features. | ||
feedback=<String:val> | ||
Give feedback to the preprocessing, i.e. to the [feedback] module with the given name. | ||
Nspec=<int:val> | [required] | |
Number of spectral coefficients used for the prospect features. | ||
Ncep=<int:val> | [required] | |
Number of cepstral coefficients used for the prospect features (the count includes cep0). | ||
Ndelta=<int:val>(2) | ||
Total number of delta prospect streams (the first stream contains the static features). Additional features outside the prospect static and delta features are allowed (e.g. delta cepstra features). Such features simply bypass the MDT imputation. | ||
MDTiter=<int:val>(2) | ||
Maximum number of iterations for the MDT imputation. | ||
MDTprune=<String:val> | ||
Pruning thresholds on the (partial) Mahalanobis distances. The order for MDTiter=2, Ndelta=2 is as follows: [non-prospect feature elements, prospect-statics-iter0, prospect-statics-iter1, sum(non-prospect,static), prospect-delta1-iter0, prospect-delta1-iter1, sum(non-prospect,static,data), prospect-delta2-iter0, prospect-delta2-iter1]. By default, no pruning is applied. | ||
MDTmask=<BINARY/FUZZY>(BINARY) | ||
Mask method for the static features (the delta features always require a ternary mask). | ||
MDTdmin=<F32:val>(0.0) | ||
Lower bound on the distances (the imputation may otherwise produce unreasonably good matches). | ||
lambda=<F32:val>(0.0) | ||
Compensate for the scaling of the precision matrix during the evaluation of prospects in the fuzzy mask framework. | ||
MDFmethod=<NONE/PROS_GRAD>(NONE) | ||
Specify a channel estimation method. | ||
MDFnfr=<int:val>(1000) | ||
Minimum number of non-silence frames that must have been recognized before the channel estimate is made. | ||
MDFnoreset | ||
Do not reset the channel estimates when starting with a new file. | ||
MDFbackoff | ||
Update the globally best cluster gaussian if the state in the Viterbi path has no firing cluster gaussian. | ||
MDFprior | ||
Update with the best cluster gaussian as found in the best back-end gaussian's shortlist as target (by default, the cluster gaussian that provided the best matching observation for the best back-end gaussian is used). | ||
MDFmulti_spkr | ||
Cope with multiple speakers, i.e. have a channel estimate per speaker. | ||
CGlim_lst=<int:val> | ||
Further limit the length of the shortlists connecting the back-end gaussians to the cluster gaussians (see the 'map' option). | ||
CGsel_fac=<F32:val>(.2) | ||
Retain only the best <sel_fac>*100 percent of the cluster gaussians, thus limiting the number of candidates for feeding the back-end gaussians. | ||
CGpp_frac=<F32:val>(.95) | ||
Further limit the number of retained cluster gaussians based on the cumulative posterior probabilities, i.e. make the smallest selection that covers <pp_frac>*100 percent of the total probability mass. | ||
CGalpha=<F32:val>(.4) | ||
Raise the likelihood of the cluster gaussians to the power <alpha>, limiting its dynamic range. Works in combination with the 'CGpp_frac' option. | ||
CGbeta=<F32:val>(1.0) | ||
Raise the prior probability of the cluster gaussians to the power <beta>. |