- Contents
The parameters describing an HMM, i.e. the transition and observation probabilities, are stored in a set of files.
The parameters for the observation probabilities are organized in two layers:
- a set for the basis functions (shared over all states, typically multivariate gaussians)
- a set for the weights linking the basis functions to the states (typically mixture weights)
Because SPRAAK by default shares all basis functions over all states, the weight matrix tends to be very sparse (i.e. most weights are 0). To accommodate a flexible sparse weight matrix, the full information is stored in three separate files (typically with the same basename):
- acmod.mvg : the parameters of the basis functions. For continuous density gaussians this is a set of multivariate gaussians. (For discrete densities it is a set of codebooks stored in a .cdb file.)
- acmod.hmm : the (non-zero) weights
- acmod.sel : the indices of the gaussians that the weights in the .hmm file refer to
This data concept easily accommodates the commonly used extremes of tying:
- untied gaussians, in which gaussians are private to a state (in which case the .sel file is not required, as long as the same number of gaussians is used for all states)
- fully tied gaussians, in which each state assigns a weight to every gaussian
Note, however, that while SPRAAK functions correctly for all types of tying, a number of its computational optimizations are designed from the viewpoint of a tied system and may not work as efficiently for an untied system.
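The two extremes can be pictured as two particular shapes of the per-state index lists. The following is only a minimal sketch: the function names are hypothetical, and the contiguous block layout used for the untied case is an assumption made for illustration, not something this page specifies.

```python
def untied_selection(nstate, per_state):
    """Untied case: each state owns a private block of gaussians.

    Assumption (not from the SPRAAK documentation): the blocks are
    contiguous, so state s uses gaussians s*per_state .. (s+1)*per_state-1.
    """
    return [list(range(s * per_state, (s + 1) * per_state))
            for s in range(nstate)]


def fully_tied_selection(nstate, nmvg):
    """Fully tied case: every state has a weight for every gaussian."""
    return [list(range(nmvg)) for _ in range(nstate)]


untied = untied_selection(nstate=3, per_state=2)
tied = fully_tied_selection(nstate=3, nmvg=4)
```

Anything between these two extremes is an arbitrary list of gaussian indices per state, which is what the sparse storage in three files is meant to support.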
- Keys
Many of the keys are identical for the three types of files:
- NMVG : number of multivariate gaussians
- VLEN : dimension of the feature vector
- DATA : {MVGAUSS, HMM, SELECT} respectively for multivariate gaussians, mixture weights and indices
- NSTATE : number of states in an HMM
- DENSTYPE : {SC_HMM, CD_HMM, DD_HMM} for respectively semi-continuous, continuous and discrete densities
- Data
The MVGAUSS file contains one vector for each gaussian containing:
<COUNT> <MEANS> <SIGMA's>
- <COUNT> : the observation mass for the current gaussian during the last instance of training
- <MEANS> : an array with the mean values of the current gaussian
- <SIGMA's> : an array with the sigma values of the current gaussian
The HMM file contains, for each unique state, a variable-length vector with the following information:
<K> <COUNT> <UNDEF> <TR_PROB_0> <TR_PROB_1> <WGHT_0> <WGHT_1> ... <WGHT_K-1>
- <K> : the number of non-zero mixture weights for the current state
- <COUNT> : the observation mass for the current state during the last phase of training
- <UNDEF> : undefined, free field
- <TR_PROB_0/1> : transition probability to current/next state
- <WGHT_#> : mixture weights for the current state
The SELECT file contains the indices of the gaussians for the weights in a .hmm file.
<K> <INDX_0> <INDX_1> <INDX_2> ...
- <K> : the number of non-zero mixture weights for the current state
- <INDX_#> : the mixture indices
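To make the pairing of the two layouts concrete, the sketch below walks a flat HMM vector and a flat SELECT vector in parallel and rebuilds a sparse weight table per state. It assumes the data sections have already been read into plain Python sequences (header parsing and binary decoding are not covered), and `StateRecord` and `split_states` are hypothetical names, not part of SPRAAK. The toy values are made up for illustration.

```python
from typing import NamedTuple, Sequence


class StateRecord(NamedTuple):
    count: float     # <COUNT>
    tr_probs: tuple  # (<TR_PROB_0>, <TR_PROB_1>)
    weights: dict    # gaussian index -> mixture weight


def split_states(hmm: Sequence[float], sel: Sequence[int], nstate: int):
    """Split flat HMM and SELECT vectors into one record per state."""
    states = []
    h = s = 0  # read positions in hmm and sel
    for _ in range(nstate):
        k = int(hmm[h])
        if int(sel[s]) != k:
            raise ValueError("HMM and SELECT disagree on <K>")
        # Fixed fields following <K>: <COUNT> <UNDEF> <TR_PROB_0> <TR_PROB_1>
        count, _undef, tr0, tr1 = hmm[h + 1:h + 5]
        weights = hmm[h + 5:h + 5 + k]
        indices = sel[s + 1:s + 1 + k]
        states.append(StateRecord(count, (tr0, tr1), dict(zip(indices, weights))))
        h += 5 + k  # 5 fixed fields plus k weights
        s += 1 + k  # <K> plus k indices
    return states


# Toy data: two states with 2 and 3 non-zero weights.
hmm = [2, 10.0, 0.0, 0.3, 0.7, 0.6, 0.4,
       3, 5.0, 0.0, 0.5, 0.5, 0.2, 0.3, 0.5]
sel = [2, 4, 1,
       3, 0, 4, 2]
states = split_states(hmm, sel, nstate=2)
```

The check on `<K>` is the only consistency test performed here; a real reader would presumably also validate the indices against NMVG.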
Note that:
- The <COUNT>'s sum to the number of observations seen during training (summed over the gaussians in MVGAUSS, over the states in HMM).
- The <COUNT>'s are easily converted to weights by dividing by the sum of the <COUNT>'s.
- The raw <COUNT> is stored instead of a normalized count because it retains more information, which is useful for adaptation.
- The content of a SELECT file must be in strict correspondence with an HMM file.
- Indices and weights are unsorted.
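The conversion from counts to weights mentioned above is a plain normalization. A minimal sketch, with a hypothetical helper name and made-up counts:

```python
def counts_to_weights(counts):
    """Normalize raw <COUNT> values so that they sum to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must have a positive sum")
    return [c / total for c in counts]


counts = [2.0, 6.0, 12.0]  # illustrative values, not taken from a real model
weights = counts_to_weights(counts)
```

The normalization discards the total observation mass, which is why the files keep the raw counts.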
- Example
The example files, of which small parts are printed below, constitute a moderately complex model with:
- a feature vector of size 39
- 10,365 gaussians
- 576 states sharing the 10,365 gaussians
- 2131 context dependent phones sharing the 576 states
- 68,560 non-zero weights out of roughly 6 million (= 576 x 10,365) possible weights, i.e. roughly 1%
- on average 119 (= 68,560 / 576) weights per state (183, 65, 189, ...)
.spr
DATA MVGAUSS
TYPE F32
LAYOUT MATRIX
DIM1 10365
DIM2 79
VLEN 39
NMVG 10365
EXTENDED PARAMSET
#
165.852 1.39298 -4.01403 -0.498516 2.56224 -1.63849 ...
136.745 0.387589 -2.55462 0.00217978 3.6786 -2.0045 ...
251.085 -0.175839 -1.89165 -0.048543 3.61929 ...
...
[File: acmod.mvg]
.spr
DATA HMM
DENSTYPE SC_HMM
TYPE F32
LAYOUT MATRIX
DIM1 71440
DIM2 1
NSTATE 576
NMVG 10365
NUNIT 2131
UNIT_FILE acmod.arcd
MVG_FILE acmod.mvg
SELECT_FILE acmod.sel
#
183.0 5237.0 0.0 -0.720823 -0.0916143 -1.23773 -1.31281 -1.34646 -1.43873 -1.41509 -1.402 ...
65.0 2179.0 0.0 -0.980322 -0.048 -1.01605 -1.05985 -1.03075 -1.07108 -1.0951 -1.11228 ..
189.0 4746.0 0.0 -0.612619 -0.121475 -1.32614 -1.32412 -1.38805 -1.44201 ...
...
[File: acmod.hmm]
.spr
DATA SELECT
DENSTYPE SC_HMM
TOTAL_MIXDIM 68560
TYPE U16
DIM1 69136
DIM2 1
NSTATE 576
NMVG 10365
#
183 15 19 6 12 17 7 2 14 20 11 25 ... 9385 3057 9384 83 4070
65 36 30 33 35 29 32 ... 9 170 1720 141 7538 212 37 379
189 54 47 52 53 45 40 ...
...
[File: acmod.sel]
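The header values of the three example files can be cross-checked against the record layouts described in the Data section. The sketch below does this arithmetic with numbers copied from the headers above. The last step rests on an assumption that this page does not state: the negative values in acmod.hmm look like base-10 logarithms of probabilities, which is at least consistent with the two transition values of the first state summing to about 1 after exponentiation.

```python
# Values copied from the example headers.
VLEN, NMVG, NSTATE, TOTAL_MIXDIM = 39, 10365, 576, 68560

# acmod.mvg: one <COUNT>, VLEN means and VLEN sigmas per gaussian.
mvg_dim2 = 1 + 2 * VLEN                   # header says DIM2 79

# acmod.hmm: 5 fixed fields per state (<K> <COUNT> <UNDEF> <TR_PROB_0>
# <TR_PROB_1>) plus all non-zero weights.
hmm_dim1 = 5 * NSTATE + TOTAL_MIXDIM      # header says DIM1 71440

# acmod.sel: one <K> per state plus one index per non-zero weight.
sel_dim1 = NSTATE + TOTAL_MIXDIM          # header says DIM1 69136

# Fraction of the full weight matrix that is actually stored.
density = TOTAL_MIXDIM / (NSTATE * NMVG)

# Assumption: the stored probabilities are log10 values.
tr0, tr1 = -0.720823, -0.0916143          # first state in the acmod.hmm excerpt
tr_sum = 10 ** tr0 + 10 ** tr1

print(mvg_dim2, hmm_dim1, sel_dim1)       # prints: 79 71440 69136
```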
- Tools
A whole set of tools is available for initialization, estimation and modification of HMM parameters.
HMM parameter estimation tools:
Postprocessing tools:
- spr_hmmsmooth.c : smoothing with uniform statistics
- spr_hmmcheck.c : verifies consistency of hmm probabilities
- spr_hmmnormalize.c : re-normalizes parameters
- spr_setminprob.c : sets a floor on the probabilities
- spr_hmmshake.c : changes the sequence of storage
- spr_mktopology.c : changes topology
- spr_mkgarbage.c : creates a garbage model
- spr_initph2wm.c : initializes a word model from phone models
- spr_hmmmerge.c : combines 2 HMMs containing different units
- spr_hmmsum.c : combines 2 HMMs trained on subsections of data
- spr_selgauss.c : reduces the number of gaussians being used
- spr_addgauss.c : creates fully tied gaussians
- spr_hmmprint.c : prints an HMM file
- spr_hmmd2c.c : converts from discrete to continuous densities
- spr_hmmc2d.c : converts from continuous to discrete densities
A separate tool, spr_htk2key.c, is provided to convert HTK files to the SPRAAK format.
- Internal Data Structures
The above files are close mappings of the data structures used in the Acoustic Model Object.
- Bugs and Limitations
- Observation Probabilities: The file formats are not limited to the storage of a mixture of gaussians and can easily accommodate other types of basis functions or mixing paradigms.
- However, the functionality of the programs working with these files has only been extensively tested for sparse mixtures of gaussians.
- Discrete density models and all related programs and program options are provided 'as is'. These worked fine in older versions of the HMM package but were not rigorously tested on a recent version of SPRAAK.
- Topologies: The HMM system was originally developed to support a number of different topologies, including LEFT_TO_RIGHT, BAKIS, FENONIC (includes null arcs), ... This is reflected in the keys TOPOLOGY (which topology) and TPDIM (number of parameters required to describe the transition probabilities). However, in the current version only the LEFT_TO_RIGHT topology is guaranteed to function as expected.