SPRAAK
A sentence or paragraph is transcribed as a sequence of words:
Sentence this_is_a_sentence
SPRAAK uses the "_" symbol as an explicit word separator to make parsing unambiguous and to indicate that something might happen at the word boundary (optional silence, cross-word assimilation, ...). This also allows the transcription to be written as a single STRING without quotes, which simplifies parsing throughout the package.
A number of words have a reserved or at least recommended usage.
A lexicon (Lexicon File) contains the canonical transcription of words in terms of phones or, more generally, in terms of acoustic units, since SPRAAK can use any user-defined subword unit such as phones, syllables, morphs, or even full words.
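Because the separator is explicit, recovering the word sequence from a transcription needs no quoting or escaping logic. The sketch below is only an illustration in Python, not SPRAAK code; the function name `split_transcription` is hypothetical, and dropping empty fields (e.g. from a doubled separator) is an assumption.

```python
def split_transcription(transcription: str) -> list[str]:
    """Split a '_'-separated transcription into its words.

    Assumption: empty fields (leading, trailing or doubled '_') carry
    no word and are dropped.
    """
    return [word for word in transcription.split("_") if word]


words = split_transcription("this_is_a_sentence")
print(words)
```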
Example:
hij [i/I/hE+[j/]]
hij [i]/[I]/[hE+]/[hE+j]
The above example shows 2 different ways of representing the 4 pronunciation variants of the Dutch word 'hij'.
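To see that both notations describe the same four variants, the bracket syntax can be expanded mechanically. The following Python sketch is not the SPRAAK lexicon parser. It assumes a grammar inferred from the example only: "/" separates alternatives, "[...]" groups them, an empty alternative means "nothing", and every other character is copied literally (no attempt is made to split the string into phones). The function name `expand_variants` is hypothetical.

```python
def expand_variants(pattern: str) -> list[str]:
    """Expand a bracketed pronunciation pattern into plain variants."""
    pos = 0  # current read position in `pattern`

    def parse_alternatives() -> list[str]:
        # alternatives := sequence ('/' sequence)*
        nonlocal pos
        variants = parse_sequence()
        while pos < len(pattern) and pattern[pos] == "/":
            pos += 1
            variants += parse_sequence()
        return variants

    def parse_sequence() -> list[str]:
        # sequence := (literal | '[' alternatives ']')*
        nonlocal pos
        results = [""]
        while pos < len(pattern) and pattern[pos] not in "/]":
            if pattern[pos] == "[":
                pos += 1
                options = parse_alternatives()
                if pos >= len(pattern) or pattern[pos] != "]":
                    raise ValueError("unbalanced '[' in pattern")
                pos += 1
            else:
                options = [pattern[pos]]
                pos += 1
            # Append every option to every partial result built so far.
            results = [head + tail for head in results for tail in options]
        return results

    variants = parse_alternatives()
    if pos != len(pattern):
        raise ValueError(f"unexpected {pattern[pos]!r} at position {pos}")
    # Remove duplicates while keeping first-seen order.
    return list(dict.fromkeys(variants))


compact = expand_variants("[i/I/hE+[j/]]")
explicit = expand_variants("[i]/[I]/[hE+]/[hE+j]")
print(sorted(compact) == sorted(explicit))
```

Under these assumptions both entries expand to the same set of four strings, which is why the two lexicon lines are interchangeable.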
A[(.5)B/(.8)C(.5)D/(.1)[]]E
Assimilation rules may optionally be added to a lexicon. They are described together with rules applying to word concatenations in Word Concatenation and Assimilation Rules.
Today's systems often assign different acoustic models to a phone depending on its context. Context-dependent phones are written as the concatenation of the context-independent phone and a unique numerical identifier. This identifier is an absolute number spanning ALL phones; hence it gives the n'th allophone of the alphabet (not the n'th allophone of the specified ci_phone).
mist m245 i27 s1345 t4378
The above example gives a context-dependent transcription of the phoneme string mist using cd units 245, 27, 1345, 4378.
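The naming convention can be illustrated by splitting each context-dependent unit back into its context-independent phone and its absolute identifier. This is a Python sketch, not part of SPRAAK. The helper names `split_cd_unit` and `parse_cd_entry` are hypothetical, and the sketch assumes that phone names do not themselves end in a digit and that fields are separated by whitespace.

```python
import re

# Assumption: the trailing run of digits is the allophone identifier,
# everything before it is the context-independent phone.
CD_UNIT = re.compile(r"^(.+?)(\d+)$")


def split_cd_unit(unit: str) -> tuple[str, int]:
    """Split e.g. 'm245' into ('m', 245)."""
    match = CD_UNIT.match(unit)
    if match is None:
        raise ValueError(f"not a context-dependent unit: {unit!r}")
    return match.group(1), int(match.group(2))


def parse_cd_entry(line: str) -> tuple[str, list[tuple[str, int]]]:
    """Parse a line made of a word followed by its cd units."""
    word, *units = line.split()
    return word, [split_cd_unit(unit) for unit in units]


word, units = parse_cd_entry("mist m245 i27 s1345 t4378")
print(word, units)
```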
The context corresponding to a given allophone is specified in the .cd file (Acoustic Unit File).
r2128 [pbkgfvxG*#]-r-[p]
The right-hand side of this definition shows the left and right context for a triphone model. Quinphones are represented by [L2][L1]-ph-[R1][R2], and so on. In all cases the context lists are to be interpreted as 'OR' lists.
The states belonging to a phone are specified in the Acoustic Unit File. States are indicated by numbers (counting from 0) and are entities in their own right, i.e. they are not private to an acoustic unit but can be shared across as many units as wanted.
Most often states will be referenced by their numerical identifier. On certain occasions, however, it may be handier not to use the absolute numerical identifier but a reference that involves the allophonic identity, e.g. i27#0, which refers to:
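The 'OR' interpretation of the context lists can be made concrete with a small matcher for the triphone line shown above. This Python sketch is not the SPRAAK .cd file reader. It assumes that every character inside the brackets is one phone symbol, it does not assign any special meaning to symbols such as '*' or '#', and it handles the triphone form only. The names `parse_triphone` and `context_matches` are hypothetical.

```python
import re

# Assumed layout: <unit> [<left list>]-<phone>-[<right list>]
TRIPHONE = re.compile(r"^(\S+)\s+\[([^\]]*)\]-(.+?)-\[([^\]]*)\]$")


def parse_triphone(line: str) -> dict:
    """Parse a triphone definition into unit, center and context sets."""
    match = TRIPHONE.match(line.strip())
    if match is None:
        raise ValueError(f"not a triphone definition: {line!r}")
    unit, left, center, right = match.groups()
    # Each list is an 'OR' list: any one of its symbols satisfies it.
    return {"unit": unit, "left": set(left), "center": center, "right": set(right)}


def context_matches(entry: dict, left: str, center: str, right: str) -> bool:
    """True if the (left, center, right) phones fall within the entry's context."""
    return (
        center == entry["center"]
        and left in entry["left"]
        and right in entry["right"]
    )


entry = parse_triphone("r2128 [pbkgfvxG*#]-r-[p]")
print(context_matches(entry, "k", "r", "p"))
```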
i
The '#' symbol is used with several different meanings in the SPRAAK package. While this never leads to parsing problems, it may somewhat hamper readability:
Assimilation Rules still use the HMM7.5 implementation
Probabilistic Pronunciation Variants are NOT IMPLEMENTED YET