Contents

The parameters describing an HMM, i.e. the transition and observation probabilities, are stored in a set of files.

The parameters for the observation probabilities are organized in two layers:

a set for the basis functions (shared over all states, typically multivariate gaussians)
a set for the weights linking the basis functions to the states (typically mixture weights)

As SPRAAK by default shares all basis functions over all states, the weight matrix tends to be very sparse (i.e. most weights are zero). In order to accommodate a flexible sparse weight matrix, the full information is stored in 3 separate files (typically with the same basename):

acmod.mvg The parameters of the basis functions. For continuous densities this is a set of multivariate gaussians; for discrete densities it is a set of codebooks stored in a .cdb file.
acmod.hmm The (non-zero) weights
acmod.sel The indices of the weights in the .hmm file

This storage layout easily accommodates the commonly used extremes of tying:

untied gaussians in which gaussians are private to a state (in which case the .sel file is not required, as long as the same number of gaussians is used for all states)
fully tied gaussians in which for each state a weight is assigned to each gaussian

Note, however, that while SPRAAK functions correctly for all types of tying, a number of its computational optimizations were designed from the viewpoint of a tied system and may not work as efficiently for an untied system.

Keys

Many of the keys are identical for the three types of files:

NMVG : number of multivariate gaussians
VLEN : dimension of the feature vector
DATA : {MVGAUSS, HMM, SELECT} respectively for multivariate gaussians, mixture weights and indices
NSTATE : number of states in an HMM
DENSTYPE: {SC_HMM, CD_HMM, DD_HMM} for respectively semi-continuous, continuous and discrete densities

Data

The MVGAUSS file contains one vector for each gaussian containing:

<COUNT>  <MEANS> <SIGMA's>

<COUNT> : the observation mass for the current gaussian during the last instance of training
<MEANS> : an array with the mean values of the current gaussian
<SIGMA's>: an array with the sigma values of the current gaussian
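
The row layout above can be sketched in Python as follows. This is only an illustration, not SPRAAK code: the function name split_mvgauss_row is hypothetical, and it assumes one sigma value per feature dimension, which is consistent with DIM2 = 79 = 1 + 2 x 39 in the example header further down.

```python
def split_mvgauss_row(row, vlen):
    """Split one MVGAUSS row into (count, means, sigmas).

    Assumes the row holds <COUNT>, then vlen means, then vlen sigmas
    (one sigma per feature dimension; an assumption, see the lead-in).
    """
    expected = 1 + 2 * vlen
    if len(row) != expected:
        raise ValueError(f"expected {expected} values, got {len(row)}")
    count = row[0]
    means = row[1:1 + vlen]
    sigmas = row[1 + vlen:]
    return count, means, sigmas


# Toy row with VLEN = 2 (values are made up for the illustration).
count, means, sigmas = split_mvgauss_row([5.0, 1.0, 2.0, 0.5, 0.25], vlen=2)
print(count, means, sigmas)
```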

The HMM file contains for each unique state a variable length vector with the following information:

<K> <COUNT> <UNDEF> <TR_PROB_0> <TR_PROB_1> <WGHT_0> <WGHT_1> ... <WGHT_K-1>

<K> the number of non-zero mixture weights for the current state
<COUNT> the observation mass for the current state during the last phase of training
<UNDEF> undefined, free field
<TR_PROB_0/1> transition probability to current/next state
<WGHT_#> mixture weights for the current state
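
Because each record carries its own length <K>, a flat HMM data vector can be walked record by record. A minimal sketch, assuming the data has already been read into a Python list of floats and that every record has exactly the five fixed fields shown above (this may not hold for other transition layouts); split_hmm_records is a hypothetical helper, not part of SPRAAK:

```python
def split_hmm_records(values, fixed_fields=5):
    """Split a flat HMM data vector into one dict per state.

    Assumed record layout:
    <K> <COUNT> <UNDEF> <TR_PROB_0> <TR_PROB_1> <WGHT_0> ... <WGHT_K-1>
    """
    records = []
    pos = 0
    while pos < len(values):
        k = int(values[pos])              # number of non-zero weights
        end = pos + fixed_fields + k
        if end > len(values):
            raise ValueError("truncated record at position %d" % pos)
        records.append({
            "count": values[pos + 1],
            "tr_prob": (values[pos + 3], values[pos + 4]),
            "weights": values[pos + fixed_fields:end],
        })
        pos = end
    return records


# Two toy states with 2 and 1 weights (values are made up).
toy = [2.0, 10.0, 0.0, -0.5, -0.2, -0.3, -0.4,
       1.0, 5.0, 0.0, -0.6, -0.1, 0.0]
for rec in split_hmm_records(toy):
    print(rec)
```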

The SELECT file contains the indices of the gaussians for the weights in a .hmm file.

<K> <INDX_0> <INDX_1> <INDX_2> ...

<K> the number of non-zero mixture weights for the current state
<INDX_#> the mixture indices
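
The pairing between the two files can be sketched as follows: both vectors are walked in parallel, and the k-th index of a SELECT record names the gaussian for the k-th weight of the matching HMM record. The helper name pair_indices_with_weights is hypothetical, and the sketch assumes both files have already been read into flat Python lists and that HMM records have five fixed fields before the weights.

```python
def pair_indices_with_weights(sel_values, hmm_values, fixed_fields=5):
    """Return one {gaussian_index: weight} dict per state."""
    states = []
    s = h = 0                                  # cursors in SELECT / HMM
    while s < len(sel_values):
        k = int(sel_values[s])
        if h >= len(hmm_values) or int(hmm_values[h]) != k:
            raise ValueError("SELECT and HMM records disagree on <K>")
        indices = [int(i) for i in sel_values[s + 1:s + 1 + k]]
        weights = hmm_values[h + fixed_fields:h + fixed_fields + k]
        if len(indices) != k or len(weights) != k:
            raise ValueError("truncated record")
        states.append(dict(zip(indices, weights)))
        s += 1 + k
        h += fixed_fields + k
    return states


# Toy data: state 0 uses gaussians 7 and 3, state 1 uses gaussian 4.
sel = [2, 7, 3, 1, 4]
hmm = [2.0, 10.0, 0.0, -0.5, -0.2, -0.3, -0.4,
       1.0, 5.0, 0.0, -0.6, -0.1, 0.0]
print(pair_indices_with_weights(sel, hmm))
```

Every gaussian that does not appear in a state's dict has an implicit weight of zero for that state.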

Note that:

The <COUNT> values sum to the number of observations seen during training (summed over the gaussians in MVGAUSS, over the states in HMM).
The <COUNT> values are easily converted to weights by dividing each by the sum of all <COUNT> values.
<COUNT> is stored instead of a normalized count because the raw value retains more information, which is useful for adaptation.
The content of a SELECT file must be in strict correspondence with the HMM file it belongs to.
Indices and weights are unsorted.
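
The conversion from counts to weights mentioned in the notes is a plain normalization. A minimal sketch, with a hypothetical function name and made-up counts:

```python
def counts_to_weights(counts):
    """Normalize raw <COUNT> values so that they sum to one."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must have a positive sum")
    return [c / total for c in counts]


print(counts_to_weights([1.0, 3.0]))  # [0.25, 0.75]
```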

Example

The example files, of which small parts are printed, constitute a moderately complex model with

a feature vector of size 39
10,365 gaussians
576 states sharing the 10,365 gaussians
2131 context dependent phones sharing the 576 states
68,560 non-zero weights out of a possible 6 million (= 576 x 10,365), i.e. roughly 1%
on average 119 (= 68,560/576) weights per state (183, 65, 189, ...)

.spr
DATA            MVGAUSS
TYPE            F32
LAYOUT          MATRIX
DIM1            10365
DIM2            79
VLEN            39
NMVG            10365
EXTENDED        PARAMSET
#
165.852 1.39298 -4.01403 -0.498516 2.56224 -1.63849 ...
136.745 0.387589 -2.55462 0.00217978 3.6786 -2.0045 ...
251.085 -0.175839 -1.89165 -0.048543 3.61929 ...
...
[File:  acmod.mvg]

.spr
DATA            HMM
DENSTYPE        SC_HMM
TYPE            F32
LAYOUT          MATRIX
DIM1            71440
DIM2            1
NSTATE          576
NMVG            10365
NUNIT           2131
UNIT_FILE       acmod.arcd
MVG_FILE        acmod.mvg
SELECT_FILE     acmod.sel
#
183.0 5237.0 0.0 -0.720823 -0.0916143 -1.23773 -1.31281 -1.34646 -1.43873 -1.41509 -1.402 ...
65.0 2179.0 0.0 -0.980322 -0.048 -1.01605 -1.05985 -1.03075 -1.07108 -1.0951 -1.11228 ...
189.0 4746.0 0.0 -0.612619 -0.121475 -1.32614 -1.32412 -1.38805 -1.44201 ...
...
[File:  acmod.hmm]
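
The transition fields in the rows printed above are negative, so they cannot be plain probabilities. Read as base-10 logarithms, each printed pair sums to one, as the check below shows. This interpretation is an inference from the sample values only; the format description does not state in which domain the values are stored.

```python
# (TR_PROB_0, TR_PROB_1) pairs copied from the three printed rows of acmod.hmm
pairs = [
    (-0.720823, -0.0916143),
    (-0.980322, -0.048),
    (-0.612619, -0.121475),
]


def pair_mass(log10_pair):
    """Sum of the pair after undoing an assumed base-10 logarithm."""
    return sum(10.0 ** v for v in log10_pair)


masses = [pair_mass(p) for p in pairs]
for p, m in zip(pairs, masses):
    print(p, round(m, 3))
```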

.spr
DATA            SELECT
DENSTYPE        SC_HMM
TOTAL_MIXDIM    68560
TYPE            U16
DIM1            69136
DIM2            1
NSTATE          576
NMVG            10365
#
183 15 19 6 12 17 7 2 14 20 11 25 ...  9385 3057 9384 83 4070
65 36 30 33 35 29 32 ...  9 170 1720 141 7538 212 37 379
189 54 47 52 53 45 40 ...
...
[File:  acmod.sel]
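
The DIM1 and DIM2 values in the three headers follow from the record layouts described in the Data section. The arithmetic below reproduces them from NSTATE, VLEN, NMVG and TOTAL_MIXDIM; the constant names for the fixed field counts are introduced here for illustration only.

```python
# Values taken from the example headers
NSTATE = 576
NMVG = 10365
VLEN = 39
TOTAL_MIXDIM = 68560          # total number of non-zero weights

# Fixed fields per record (names are illustrative, not SPRAAK keys)
HMM_FIXED_FIELDS = 5          # <K> <COUNT> <UNDEF> <TR_PROB_0> <TR_PROB_1>
SEL_FIXED_FIELDS = 1          # <K>

mvg_dim2 = 1 + 2 * VLEN                                  # <COUNT> + means + sigmas
hmm_dim1 = TOTAL_MIXDIM + HMM_FIXED_FIELDS * NSTATE
sel_dim1 = TOTAL_MIXDIM + SEL_FIXED_FIELDS * NSTATE
density = TOTAL_MIXDIM / (NSTATE * NMVG)                 # fraction of non-zero weights

print(mvg_dim2, hmm_dim1, sel_dim1)   # 79 71440 69136
print(round(100 * density, 2), "%")
```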

Tools

A whole set of tools is available for initialization, estimation and modification of HMM parameters.

HMM parameter estimation tools:

spr_segpass.c : estimating counts or a model from a segmentation
spr_vitpass.c : runs a single Viterbi pass, outputting counts or new models
spr_ct2hmm.c : compute HMM parameters from counts
spr_calc_oprobs.c : calculate observation probabilities

Postprocessing tools:

spr_hmmsmooth.c : smoothing with uniform statistics
spr_hmmcheck.c : verifies consistency of hmm probabilities
spr_hmmnormalize.c : re-normalizes parameters
spr_setminprob.c : sets a floor on the probabilities
spr_hmmshake.c : changes the sequence of storage
spr_mktopology.c : changes topology
spr_mkgarbage.c : creates a garbage model
spr_initph2wm.c : initialize a word model from phone models
spr_hmmmerge.c : combine 2 HMMs containing different units
spr_hmmsum.c : combine 2 HMMs trained on subsections of data
spr_selgauss.c : reduce the number of gaussians being used
spr_addgauss.c : create fully tied gaussians
spr_hmmprint.c : print an HMM file
spr_hmmd2c.c : convert from discrete to continuous densities
spr_hmmc2d.c : convert from continuous to discrete densities

A separate tool, spr_htk2key.c, is provided to convert HTK files to the SPRAAK format.

Internal Data Structures

The above files are close mappings of the data structures used in the Acoustic Model Object.

Bugs and Limitations

Observation Probabilities: The file formats are not limited to storing mixtures of gaussians and can easily accommodate other types of basis functions or mixing paradigms.
- However, the programs working with these files have only been tested extensively for sparse mixtures of gaussians.
- Discrete density models and all related programs and program options are provided 'as is'. These worked fine in older versions of the HMM package but have not been rigorously tested on a recent version of SPRAAK.
Topologies: The HMM system was originally developed to support a number of different topologies, including LEFT_TO_RIGHT, BAKIS, FENONIC (includes null arcs), ... This is reflected in the keys TOPOLOGY (which topology) and TPDIM (number of parameters required to describe the transition probabilities). However, in the current version only the LEFT_TO_RIGHT topology is guaranteed to function as expected.