SPRAAK
spr_calc_oprobs.c File Reference

Dump state output probs.

Detailed Description

Dump state output probs.

Dump the transition and output probabilities. The first block contains the transition probabilities (preceded by a comment line and ended by an empty line), one line per state, giving the following two values:

log10(P(jump_out)) log10(P(self_loop)/P(jump_out))
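
The two values on each line can be converted back to linear probabilities, since log10(P(self_loop)) = log10(P(jump_out)) + log10(P(self_loop)/P(jump_out)). A minimal sketch of that arithmetic follows; the helper name `transition_probs` and the numeric values are illustrative assumptions and are not part of SPRAAK.

```python
import math

def transition_probs(log10_jump_out, log10_ratio):
    """Convert one line of the transition block to linear probabilities.

    log10_jump_out : log10(P(jump_out))
    log10_ratio    : log10(P(self_loop) / P(jump_out))
    (hypothetical helper, for illustration only)
    """
    p_jump_out = 10.0 ** log10_jump_out
    # Adding the two log10 values gives log10(P(self_loop)).
    p_self_loop = 10.0 ** (log10_jump_out + log10_ratio)
    return p_jump_out, p_self_loop

# Made-up values, not taken from a real model.
p_out, p_self = transition_probs(math.log10(0.25), math.log10(3.0))
print(p_out, p_self)
```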

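Next, for every entry in the corpus file, you'll find a block of data consisting of (1) a comment line listing the file name and frame selection from the corpus file, and (2) an nfr x (nstate+2) matrix containing the frame index, the state likelihoods, and the estimated frame likelihood (the -o parameter below describes the columns in detail).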

spr_calc_oprobs [-cvt format](LOG10;NORM;) [-o output file](stdout) <-c Corpus>
    [-range b:e](0:-1) [-ssp script](SPR_BSS_DEV_NULL) [-obs ObsDir] [-suffix FileSuffix](sam)
    [-arcd FileName] <-h FileName> [-g FileName] [-sel FileName] [-am_opt Options]
    [-top_n Number(s)](0) [-rmg rmg_params](no) [-LMout Value](-100) [-NOGS](flag: no gauss sel)
Parameters
-cvt format
Configure the format of the acoustic likelihoods (LOG/LIN/LOG10, ...) – see spr_am_flags_od for more details.
-o output file
Write output to this file. The file will contain ASCII data. The first block contains the transition probabilities (preceded by a comment line and ended by an empty line), one line per state, giving the following two values: log10(P(jump_out)) and log10(P(self_loop)/P(jump_out)). Next, for every entry in the corpus file, you'll find a block of data consisting of (1) a comment line listing the file name and frame selection from the corpus file, and (2) an <nfr> x (<nstate>+2) matrix containing
  • the frame index (first column),
  • the normalized (if the -NONORM option is not given) state likelihoods (log10 values) for that frame (normalized means divided by the estimated frame likelihood), and
  • the estimated frame likelihood (log10 value; assumes unigram a priori state probabilities; used to normalize the state likelihoods if the -NONORM option is not given; listed in the last column).
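
The layout described above can be read back with a few lines of code. The following is a rough sketch, not SPRAAK's own reader. It assumes that comment lines start with `#` (the actual comment marker is not stated here), and the sample text is made up for illustration. Because normalization divides by the frame likelihood, adding the last column to a normalized log10 score recovers the unnormalized log10 state likelihood.

```python
def parse_oprobs(lines, comment_prefix="#"):
    """Parse the ASCII dump into (transitions, blocks).

    ASSUMPTION: comment lines start with `comment_prefix`; the real
    marker is not given in the description above.
    """
    transitions = []   # one (log10 P(jump_out), log10 ratio) pair per state
    blocks = []        # one (header text, rows) pair per corpus entry
    in_transitions = True
    for raw in lines:
        line = raw.strip()
        if in_transitions:
            if not line:
                in_transitions = False      # empty line ends the first block
            elif not line.startswith(comment_prefix):
                a, b = line.split()[:2]
                transitions.append((float(a), float(b)))
            continue
        if not line:
            continue
        if line.startswith(comment_prefix):
            blocks.append((line[len(comment_prefix):].strip(), []))
        else:
            values = [float(v) for v in line.split()]
            # first column: frame index; last column: log10 frame likelihood
            blocks[-1][1].append((int(values[0]), values[1:-1], values[-1]))
    return transitions, blocks

def denormalize(row):
    """Undo the normalization: log10(raw) = log10(normalized) + log10(frame)."""
    _frame, state_scores, frame_ll = row
    return [s + frame_ll for s in state_scores]

# Made-up sample with two states, hence 2+2 columns per frame.
sample = """# transitions
-0.60 0.48
-0.30 0.00

# file_a 0:-1
0 -0.10 -1.20 -35.0
1 -0.70 -0.30 -33.5
""".splitlines()

transitions, blocks = parse_oprobs(sample)
header, rows = blocks[0]
print(header, len(rows), denormalize(rows[0]))
```

The `denormalize` step only makes sense when the dump was written with normalization enabled; with unnormalized output the state columns already hold the raw log10 likelihoods.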
-c Corpus
File with corpus entries or segmentations.
-range b:e
Optional begin and end entry in the corpus/segmentation file. Counting starts at 0.
-ssp script
The signal processing script used to preprocess the input data.
-obs ObsDir
Observation directory name.
-suffix FileSuffix
File suffix of the observation files (without leading '.').
-arcd FileName
Unit file name (.arcd or .cd format).
-h FileName
The input HMM file.
-g FileName
The input MVG file (gaussians).
-sel FileName
The input select file name (tied gaussian).
-am_opt Options
Extra options for loading the acoustic model. A non-default acoustic model can be selected by having '=<am_type>;' as first option. See cwr_am_tbl.c for a list of acoustic models available.
-top_n Number(s)
Only take the top-N gaussians into account when calculating output probabilities. If one value is given, it is used for all mixtures. Else a value per mixture must be given, separated by commas. Use '0' to set top_n to the number of gaussians in the mixture.
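
The rule for one value versus one value per mixture can be illustrated as follows. This is a sketch of the rule as described, not SPRAAK's implementation; the function name `expand_top_n` and the mixture sizes are hypothetical.

```python
def expand_top_n(spec, gaussians_per_mixture):
    """Expand a -top_n style string into one value per mixture.

    spec                  : e.g. "8" or "8,0,16"
    gaussians_per_mixture : number of gaussians in each mixture
    (hypothetical helper, for illustration only)
    """
    values = [int(v) for v in spec.split(",")]
    if len(values) == 1:
        # A single value applies to all mixtures.
        values = values * len(gaussians_per_mixture)
    elif len(values) != len(gaussians_per_mixture):
        raise ValueError("expected one value or one value per mixture")
    # '0' means: use all gaussians in the mixture.
    return [n if v == 0 else v for v, n in zip(values, gaussians_per_mixture)]

print(expand_top_n("0", [64, 128]))
print(expand_top_n("8", [64, 128]))
print(expand_top_n("8,0", [64, 128]))
```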
-rmg rmg_params
The parameters for the quick selection of gaussians. If one value is given, it is used for all mixtures. Else a value per mixture must be given, separated by commas. Use 'no' if no quick selection is wanted. See rm_gauss.c for a description of the parameters.
-LMout Value
Floor the state likelihoods of an observation using a fraction of the unconditional likelihood of the observation (weighted sum of the state likelihoods). Practically necessary if only a few gaussians are evaluated (-top_n or -rmg options). The value given offsets an automatically determined log10(fraction). Use -100 to turn the flooring off, and 0.0 to use the automatically determined default.
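
In the log10 domain, flooring against a fraction of the unconditional likelihood amounts to a `max` against a shifted frame score. The sketch below only illustrates that idea. How SPRAAK determines the fraction is not described here, so `log10_fraction` is a hypothetical input, and treating -100 as an explicit "off" switch is an assumption.

```python
def floor_state_scores(log10_scores, log10_frame, log10_fraction, lmout=0.0):
    """Floor log10 state likelihoods for one frame.

    log10_scores   : unnormalized log10 state likelihoods
    log10_frame    : log10 of the unconditional frame likelihood
    log10_fraction : log10 of the flooring fraction (hypothetical input;
                     the tool determines its own value automatically)
    lmout          : offset added to log10_fraction, as with -LMout
    """
    if lmout <= -100:
        # ASSUMPTION: -100 disables flooring altogether.
        return list(log10_scores)
    floor = log10_frame + log10_fraction + lmout
    return [max(s, floor) for s in log10_scores]

# Made-up numbers: the floor is -31 + (-4) + 0 = -35.
print(floor_state_scores([-30.0, -50.0], -31.0, -4.0))
```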
-NOGS (flag: no gauss sel)
Forgo the (sentence-level) lexicon-based Gaussian selection. The lexicon-based Gaussian selection speeds up the decoding but may interfere with score normalization techniques that assume all Gaussians were evaluated.

Note
Specify '/' as corpus file to calculate the probabilities for a single file only; use the -obs option to specify the file.
Author
Kris Demuynck
Date
20/09/05