SPRAAK
Impute a spectrum.
Functions | |
---|---|
void | spr_spec_impute_free (SprSspInfo *Info) |
int | spr_spec_impute_setup (SprSspInfo *Info, const char **descript, void *aux_info) |
int | spr_spec_impute_process (SprSspInfo *Info, const void *frame_in, void *frame_out) |
Impute a MEL-spectrum from an input spectrum, a noise mean and a (fixed or variable) noise variance, using a simplified parallel model combination technique in the spectral domain under the missing data assumption (log(X^2+N^2)=max(log(X^2),log(N^2))). The input vector consists of the noisy speech spectrum, the estimated noise mean and, optionally, the estimated noise spread (sigma).
[spec_impute] | |
---|---|
codebook <filename> | Codebook filename (a .mvg file). |
Nsil <N>(-1) | Indicate that there is a VAD input (last input element) and that only the first <N> codebook entries must be used to handle frames marked as silence; must be specified before the noise sigmas. |
topN <N>(1) | Track the <N> best codebook entries. |
pct <N> | Remove codebook entries with too low a probability from the <topN> list. |
ll_fac <alpha>(1.0) [beta](1.0) | Scale the log likelihoods with the given scale factors before deriving posterior probabilities. <alpha> scales the 'distance' while <beta> scales the prior log probabilities. |
noise sigma <sig>(1.0) ... | Use a fixed noise spread (sigma); by default the variance is read from the input as well. |
output <multi_candidate/spch_noise_hi_lo/spch_sigma_noise/spch_noise/hi_lo/hi/lo>(lo) [likelihood/void](void) | Specify how the <topN> imputed spectra are output. General remark: whenever an imputed spectral coefficient is larger than the input spectral coefficient, it is clipped to the value of the input. In <multi_candidate> mode, all imputed speech spectra are output, preceded by their posterior probability. In <spch_noise> mode, the imputed speech and noise spectra are output. In <spch_sigma_noise> mode, the spread (sigma) on the imputed speech is output as well. In <hi_lo> mode, the minimum and maximum over the <topN> imputed speech spectra are output. If the second field is set to 'likelihood', the first element in the output vector will contain the total likelihood of the observation given the speech and noise model. |