SPRAAK
Acoustic Unit File

Unit files contain lists of 'acoustic units' together with additional information such as the number of states, the state numbers and the context dependency. In the SPRAAK hierarchy, units are the layer between words and states.

'Acoustic units' are hence not necessarily phones, but can be any subword unit.

Unit files of different complexity appear at different stages in the design of a speech recognition system. Typically two types are distinguished: context-independent unit files ('.ci') and context-dependent unit files ('.cd').

The file extensions '.ci' and '.cd' may be a bit confusing, as the '.cd' file is needed with context-independent phone models as well. Moreover, versions up to SPRAAK v0.9 also required an '.arcd' file; see Legacy for more details.

Data

Each line in a Context Dependent UNIT file contains the following information:

<unitname> <transcription> <nstates> [state numbers]

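in which <unitname> is the allophonic identifier, <transcription> specifies the context classes of the unit, <nstates> is the number of states and [state numbers] lists the absolute state numbers. Each field is described in more detail after Example 2.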

Example 1: Context Independent Unit File

The first example shows a context-independent unit file, which typically defines the phonetic alphabet.

.spr
DATA    UNIT
DIM1    43
UNIT_TYPE       CI phonemes
LANGUAGE        English
#
i
I
I!
E
{
@
@!
}
}!
u
U
O
A
e+I
a+U
a+I
...
d+Z
#
Example 2: Context Dependent Unit File
.spr
DIM1    2131
DATA    UNIT
#
@0      [d]-@-[sz]              3       0 118 198
@1      [d]-@-[r]               3       0 120 197
@2      [d]-@-[l]               3       0 122 196
.........
s1572   [mN]-s-[SZ]             3       443 461 433
s1573   [mN]-s-[n]              3       443 461 434
s1574   [mN]-s-[l]              3       443 461 436
s1575   [mN]-s-[szr*]           3       443 461 437
s1576   [td]-s-[#]              3       444 455 432
s1577   [td]-s-[d]              3       444 456 430
s1578   [td]-s-[t]              3       444 457 431
s1579   [td]-s-[w]              3       444 458 429
s1580   [td]-s-[m]              3       444 458 434
s1581   [td]-s-[pb]             3       444 458 435
s1582   [td]-s-[fv]             3       444 458 437
s1583   [td]-s-[hj]             3       444 459 429
s1584   [td]-s-[kgxGN]          3       444 459 433
.......
r2127   [pbkgfvxG*#]-r-[gxG]    3 552 573 564
r2128   [pbkgfvxG*#]-r-[p]      3 552 574 563
r2129   [pbkgfvxG*#]-r-[bfv*]   3 552 574 564
2130   -*-                     1       575
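
Both examples start with a '.spr' line, followed by key/value header lines and a line holding a single '#', after which the unit entries follow. The sketch below splits such a file into header and data. It is only an illustration in Python, not part of SPRAAK: the function name split_unit_file is hypothetical, and the assumption that the first '#'-only line closes the header (so that a later '#' line is an ordinary unit entry) is inferred from the two examples.

```python
def split_unit_file(text):
    """Split unit-file text into (header dict, list of data lines).

    Assumption (inferred from the examples, not from a format
    specification): the file starts with '.spr' and the header ends
    at the first line that consists of '#' only.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != ".spr":
        raise ValueError("expected '.spr' on the first line")

    # Locate the first '#'-only line after the '.spr' line.
    end = next((i for i in range(1, len(lines))
                if lines[i].strip() == "#"), None)
    if end is None:
        raise ValueError("no '#' line closing the header")

    header = {}
    for line in lines[1:end]:
        parts = line.split(None, 1)  # key, then the rest of the line
        if parts:
            header[parts[0]] = parts[1].strip() if len(parts) > 1 else ""

    # Everything after the header is data; blank lines are skipped.
    data = [line.strip() for line in lines[end + 1:] if line.strip()]
    return header, data


# Shortened variant of Example 2 (DIM1 adapted to the three lines kept).
sample = """.spr
DIM1    3
DATA    UNIT
#
@0      [d]-@-[sz]              3       0 118 198
@1      [d]-@-[r]               3       0 120 197
@2      [d]-@-[l]               3       0 122 196
"""
header, data = split_unit_file(sample)
print(header["DATA"], len(data))
```

In both examples the DIM1 value appears to match the number of unit entries (2131 entries numbered 0 to 2130 in Example 2), so comparing int(header["DIM1"]) with len(data) may serve as a simple sanity check.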

Each line in a .cd-file describes a single allophonic unit.

The allophonic identifier is the concatenation of the 'context-independent' phone and a unique numerical identifier. The numerical identifier is unique within the set of allophones of all phones; it is not the n'th variant of a phone.

The second field gives the context classes that define the allophonic variant. The center phone is written between '-'s and the clustered contexts are written in square brackets. The example above shows triphone models, i.e. both a left and a right context are defined. Quinphones are written as [L2][L1]-ph-[R1][R2], and so on for wider contexts. Each context list is to be interpreted as an 'OR' list. If the left context, the right context or both are omitted, there is no dependency on that context.
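
The structure of the context field can be illustrated with a small parser. This is a sketch in Python under stated assumptions, not the SPRAAK implementation: the name parse_context is hypothetical, the center phone is assumed to contain no '-', and the content of each bracket is returned as a raw string because the page does not say how a context list is split into individual phones.

```python
import re

# Optional bracketed lists, '-', the center phone, '-', optional bracketed lists.
_CONTEXT_RE = re.compile(r"^((?:\[[^\]]*\])*)-(.+?)-((?:\[[^\]]*\])*)$")
_BRACKET_RE = re.compile(r"\[([^\]]*)\]")


def parse_context(spec):
    """Return (left_lists, center, right_lists) for a context field.

    Each element of left_lists / right_lists is the raw text found
    between one pair of square brackets (an 'OR' list of contexts).
    """
    match = _CONTEXT_RE.match(spec)
    if match is None:
        raise ValueError(f"not a context specification: {spec!r}")
    left = _BRACKET_RE.findall(match.group(1))
    right = _BRACKET_RE.findall(match.group(3))
    return left, match.group(2), right


print(parse_context("[d]-@-[sz]"))   # triphone from Example 2
print(parse_context("-*-"))          # no context dependency
```

An empty list on either side corresponds to the case in which that context is not specified.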

The last few columns give the absolute state numbers. There is no file that gives a list of states. The software will infer the list from the .cd file and assume consistency with the HMM Files which contain no names of states and are mere lists of numbers.
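
How a list of states could be inferred from the unit entries is sketched below. This is an illustrative Python fragment, not the SPRAAK code: the function names parse_unit_line and infer_states are hypothetical, and the check that the number of listed states equals the <nstates> field is an assumption about what 'consistency' would minimally require.

```python
def parse_unit_line(line):
    """Split one '.cd' entry into (unit name, context field, state numbers)."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"too few fields: {line!r}")
    name, context = fields[0], fields[1]
    nstates = int(fields[2])
    states = [int(f) for f in fields[3:]]
    if len(states) != nstates:
        raise ValueError(f"{name}: expected {nstates} states, got {len(states)}")
    return name, context, states


def infer_states(lines):
    """Collect the sorted set of absolute state numbers used by all units."""
    seen = set()
    for line in lines:
        _, _, states = parse_unit_line(line)
        seen.update(states)
    return sorted(seen)


# Two entries taken from Example 2; they share state 0.
entries = [
    "@0      [d]-@-[sz]              3       0 118 198",
    "@1      [d]-@-[r]               3       0 120 197",
]
print(infer_states(entries))
```

Because several units may share a state, the number of distinct states is generally smaller than the sum of the <nstates> fields.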

In some programs it is more convenient to reference states relative to the allophone. In that case the notation <phone><identifier>#<relative_state_number> is used, in which the relative state numbers are 0, 1, 2, ... Thus "r2129#1" refers to absolute state number 574 in the example above.
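
The mapping from a relative state reference to an absolute state number can be sketched as follows. The Python function resolve_state and the dictionary layout are hypothetical and serve only as an illustration. The reference is split at the last '#', on the assumption that a unit name may itself contain a '#'.

```python
def resolve_state(ref, units):
    """Map '<unit>#<relative_state_number>' to an absolute state number.

    `units` maps a unit name to its list of absolute state numbers.
    """
    name, sep, rel = ref.rpartition("#")  # split at the last '#'
    if not sep or not name:
        raise ValueError(f"not a relative state reference: {ref!r}")
    states = units[name]
    index = int(rel)
    if not 0 <= index < len(states):
        raise IndexError(f"{name} has no relative state {index}")
    return states[index]


# Entry r2129 from Example 2.
units = {"r2129": [552, 574, 564]}
print(resolve_state("r2129#1", units))  # 574
```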

Remarks

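The '#' symbol has several different meanings in the SPRAAK package. While this never leads to parsing problems, it may somewhat hamper readability. On this page alone, '#' appears as the line that closes the file header, as a symbol in context lists (e.g. [td]-s-[#]) and as the separator in the relative state notation.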

Legacy

The '.arcd' file looks very much like the '.cd' file, except that it does not contain the field specifying the context dependency of the unit; thus, it contains: unit_name, number_of_states and state_numbers. Hence it was only suited for CI topology specifications. This role is now fully taken by the '.cd' file.