SPRAAK
The distance matrix can, if desired, be specified in terms of features, which allows the costs to be decomposed into per-feature contributions.
A simple distance matrix has the following format:
    [distance]
    /    /    it1  it2  it3  ...
    /    C00  C01  C02  C03  ...
    it1  C10  C11  C12  C13  ...
    it2  C20  C21  C22  C23  ...
    it3  C30  C31  C32  C33  ...
    ...  ...  ...  ...  ...  ...
The elements in the example above have the following function:
For a feature-based matrix, one first has to list the features and the mapping between items and features:
    [features]
    % some comment
    feat1 feat2 ...
    [feature_map]
    it1 featX
    it2 featY
    ...
    [distance]
    ...
The elements in the example above have the following function:
Multiple (sub) distance matrices may be defined. The cost assigned to an operation (substitution, insertion or deletion) is the sum of the costs defined for that operation in the various (sub) distance matrices. This allows for a simple feature-based decomposition of the costs.
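The summation rule can be illustrated with a minimal Python sketch. This is not SPRAAK code: the `SUB_MATRICES` structure and the `substitution_cost` helper are hypothetical names introduced here, and the rule that a sub matrix which does not map both items contributes nothing is an assumption. The cost values are taken from the example on this page.

```python
# Hypothetical in-memory form of three (sub) distance matrices.
# Each entry is (feature map: item -> feature, costs: (feature, feature) -> cost).
# The numbers are copied from the example on this page; the layout is an assumption.
SUB_MATRICES = [
    # major phoneme class
    ({"p": "Cp", "b": "Cp", "s": "Cf"},
     {("Cp", "Cp"): 0.0, ("Cp", "Cf"): 0.3, ("Cf", "Cp"): 0.3, ("Cf", "Cf"): 0.0}),
    # bilabial / alveolar
    ({"p": "bl", "b": "bl", "s": "al"},
     {("bl", "bl"): 0.0, ("bl", "al"): 0.4, ("al", "bl"): 0.4, ("al", "al"): 0.0}),
    # unvoiced / voiced
    ({"p": "v0", "b": "v1", "s": "v0"},
     {("v0", "v0"): 0.0, ("v0", "v1"): 0.1, ("v1", "v0"): 0.1, ("v1", "v1"): 0.0}),
]


def substitution_cost(a, b, sub_matrices=SUB_MATRICES):
    """Sum the cost of substituting item a by item b over all sub matrices."""
    total = 0.0
    for feature_map, costs in sub_matrices:
        # Assumption: a sub matrix that does not map both items adds nothing.
        if a in feature_map and b in feature_map:
            total += costs[(feature_map[a], feature_map[b])]
    return total


print(substitution_cost("p", "b"))  # only the voicing differs
print(substitution_cost("p", "s"))  # class and place differ, voicing is equal
```

With these values, substituting `p` by `b` costs 0.1 (only the voicing matrix contributes), while substituting `p` by `s` costs about 0.7 (0.3 + 0.4 + 0.0).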
An extensive example is given below; all costs were assigned ad hoc:
    % phoneme properties
    %   =   can be considered to consist of the given two parts
    %   X   specials: silence / speaker noises
    %   C   consonant
    %     p   plosive (occlusive)
    %     f   fricative
    %     n   nasal
    %     l   lateral+trill
    %     g   glide (approximant)
    %     bl  bilabial
    %     ld  labiodental
    %     al  alveolar
    %     rf  retroflex
    %     pt  palatal
    %     vl  velar
    %     gt  glottal
    %     v0  unvoiced
    %     v1  voiced
    %   V   vowel
    %     o1  close
    %     o2  close-mid
    %     o3  open-mid
    %     o4  open
    %     p1  front
    %     p2  front-central
    %     p3  central
    %     p4  back
    %     s0  long
    %     s1  short
    %     !   stressed
    %
    % Phoneme table
    %   #     X (f-v0)
    %   </s>  = #
    %   p     C-p-bl-v0
    %   b     C-p-bl-v1
    %   t     C-p-al-v0
    %   d     C-p-al-v1
    %   T     C-p-rf-v0
    %   D     C-p-rf-v1
    %   k     C-p-vl-v0
    %   g     C-p-vl-v1
    %   f     C-f-ld-v0
    %   v     C-f-ld-v1
    %   s     C-f-al-v0
    %   z     C-f-al-v1
    %   S     C-f-pt-v0         % S,Z were made palatal instead of postalveolar (closer to the /j/ + postalveolar was not in use)
    %   Z     C-f-pt-v1
    %   h     C-f-gt-v0
    %   m     C-n-bl-v1
    %   n     C-n-al-v1
    %   N     C-n-vl-v1
    %   l     C-l-al-v1-t0      % t0/t1 == trill
    %   r     C-g-al-v1-t1      % classification based on the Chinese pronunciation of the /r/ :-)
    %   j     C-g-pt-v1
    %   w     C-g-bl-v1
    %   t+S   = t S
    %   d+Z   = d Z
    % Note: last column indicates the linking phoneme the vowel induces (j/w/?)
    %   i     V-o1-p1-s0-pt     % worry-j-above
    %   E     V-o3-p1-s1-pt
    %   {     V-o4-p1-s1-pt
    %   I     V-o1-p2-s1-pt
    %   I!    V-o1-p2-s1-pt!
    %   @     V-o2-p3-s1-gt     % extra-?-above
    %   @!    V-o2-p3-s1-gt!
    %   u     V-o1-p4-s0-bl     % flew-w-above
    %   U     V-o1-p4-s1-bl
    %   @+U   V-o2-p4-s0-bl     % yellow-w-above
    %   O     V-o3-p4-s1-bl     % outlaw-w-above
    %   A     V-o4-p4-s1-gt     % panama-?-above
    %   }     = @ r
    %   }!    = @! r
    %   e+I   = E j
    %   a+U   = A w
    %   a+I   = a j
    %   O+I   = O j

    [features]
    % major distinction between all 'phonemes'
    # V Cp CP Cf Cn Cl Cg
    [feature_map]
    # #
    </s> #
    p Cp
    b Cp
    t Cp
    d Cp
    T CP
    D CP
    k Cp
    g Cp
    f Cf
    v Cf
    s Cf
    z Cf
    S Cf
    Z Cf
    h Cf
    m Cn
    n Cn
    N Cn
    l Cl
    r Cl
    j Cg
    w Cg
    t+S CP
    d+Z CP
    i V
    E V
    { V
    I V
    I! V
    @ V
    @! V
    u V
    U V
    @+U V
    O V
    A V
    } Cl
    }! Cl
    e+I V
    a+U V
    a+I V
    O+I V
    [distance_matrix]
    /    #    V    Cp   CP   Cf   Cn   Cl   Cg   /
    #    0.0  1.0  0.5  0.5  0.5  1.0  1.0  1.0  0.5
    V    1.0  0.0  1.0  1.0  1.0  0.5  0.7  0.5  1.0
    Cp   0.5  1.0  0.0  0.1  0.3  1.0  1.0  1.0  0.5
    CP   0.5  1.0  0.1  0.0  0.2  1.0  1.0  1.0  0.5
    Cf   0.5  1.0  0.3  0.2  0.0  1.0  1.0  1.0  0.5
    Cn   1.0  0.5  1.0  1.0  1.0  0.0  0.7  0.5  1.0
    Cl   1.0  0.7  1.0  1.0  1.0  0.7  0.0  0.7  1.0
    Cg   1.0  0.5  1.0  1.0  1.0  0.5  0.7  0.0  0.7
    /    0.5  1.0  0.5  0.5  0.5  1.0  1.0  0.7  0.0

    [features]
    % further distinction between the consonants w.r.t. place of articulation
    bl ld al rf pt vl gt
    [feature_map]
    p bl
    b bl
    t al
    d al
    T rf
    D rf
    k vl
    g vl
    f ld
    v ld
    s al
    z al
    S pt
    Z pt
    h gt
    m bl
    n al
    N vl
    l al
    r al
    j pt
    w bl
    t+S al
    d+Z al
    } al
    }! al
    [distance_matrix]
    /    bl   ld   al   rf   pt   vl   gt
    bl   0.0  0.2  0.4  0.4  0.4  0.4  0.4
    ld   0.2  0.0  0.2  0.4  0.4  0.4  0.4
    al   0.4  0.2  0.0  0.2  0.2  0.4  0.4
    rf   0.4  0.4  0.2  0.0  0.2  0.2  0.4
    pt   0.4  0.4  0.2  0.2  0.0  0.2  0.4
    vl   0.4  0.4  0.4  0.2  0.2  0.0  0.2
    gt   0.4  0.4  0.4  0.4  0.4  0.2  0.0

    [features]
    % further distinction between the plosives and fricatives w.r.t. voiced/unvoiced
    v0 v1
    [feature_map]
    p v0
    b v1
    t v0
    d v1
    T v0
    D v1
    k v0
    g v1
    f v0
    v v1
    s v0
    z v1
    S v0
    Z v1
    h v0
    t+S v0
    d+Z v1
    } v1
    }! v1
    [distance_matrix]
    /    v0   v1
    v0   0.0  0.1
    v1   0.1  0.0

    [features]
    % further distinction between trill and others
    t0 t1 tx
    [feature_map]
    * tx
    l t0
    r t1
    } t1
    }! t1
    [distance_matrix]
    /    t0   t1   tx
    t0   0.0  0.3  0.0
    t1   0.3  0.0  0.1
    tx   0.0  0.1  0.0

    [features]
    % further distinction between the vowels w.r.t. openness
    o1 o2 o3 o4
    [feature_map]
    i o1
    E o3
    { o4
    I o1
    I! o1
    @ o2
    @! o2
    u o1
    U o1
    @+U o2
    O o3
    A o4
    } o2
    }! o2
    e+I o3
    a+U o4
    a+I o4
    O+I o3
    [distance_matrix]
    /    o1   o2   o3   o4
    o1   0.0  0.1  0.2  0.2
    o2   0.1  0.0  0.1  0.2
    o3   0.2  0.1  0.0  0.1
    o4   0.2  0.2  0.1  0.0

    [features]
    % further distinction between the vowels w.r.t. place (front/back)
    p1 p2 p3 p4
    [feature_map]
    i p1
    E p1
    { p1
    I p2
    I! p2
    @ p3
    @! p3
    u p4
    U p4
    @+U p4
    O p4
    A p4
    } p3
    }! p3
    e+I p1
    a+U p4
    a+I p1
    O+I p4
    [distance_matrix]
    /    p1   p2   p3   p4
    p1   0.0  0.1  0.2  0.2
    p2   0.1  0.0  0.1  0.2
    p3   0.2  0.1  0.0  0.1
    p4   0.2  0.2  0.1  0.0

    [features]
    % further distinction between the vowels w.r.t. length
    s0 s1
    [feature_map]
    i s0
    E s1
    { s1
    I s1
    I! s1
    @ s1
    @! s1
    u s0
    U s1
    @+U s0
    O s1
    A s1
    } s1
    }! s1
    e+I s0
    a+U s0
    a+I s0
    O+I s0
    [distance_matrix]
    /    s0   s1
    s0   0.0  0.1
    s1   0.1  0.0

    [features]
    % further distinction between the vowels w.r.t. stress
    s0 s1
    [feature_map]
    i s1
    E s1
    { s1
    I s0
    I! s1
    @ s0
    @! s1
    u s1
    U s1
    @+U s1
    O s1
    A s1
    } s0
    }! s1
    e+I s1
    a+U s1
    a+I s1
    O+I s1
    [distance_matrix]
    /    s0   s1
    s0   0.0  0.1
    s1   0.1  0.0

    [features]
    % further distinction between the vowels w.r.t. the linking phoneme they induce
    pt gt bl
    [feature_map]
    i pt
    E pt
    { pt
    I pt
    I! pt
    @ gt
    @! gt
    u bl
    U bl
    @+U bl
    O bl
    A gt
    } gt
    }! gt
    e+I pt
    a+U bl
    a+I pt
    O+I pt
    [distance_matrix]
    /    pt   gt   bl
    pt   0.0  0.2  0.2
    gt   0.2  0.0  0.2
    bl   0.2  0.2  0.0

    [features]
    % difference between diphthongs/monophthongs/syllabics
    1 ^ ,
    [feature_map]
    i 1
    E 1
    { 1
    I 1
    I! 1
    @ 1
    @! 1
    u 1
    U 1
    @+U 1
    O 1
    A 1
    } ,
    }! ,
    e+I ^
    a+U ^
    a+I ^
    O+I ^
    [distance_matrix]
    /    1    ^    ,
    1    0.0  0.1  0.1
    ^    0.1  0.0  0.3
    ,    0.1  0.3  0.0

    [features]
    % correction for the diphthongs/syllabics
    2j 2w 2r n j w lr @
    [feature_map]
    S 2j
    Z 2j
    t+S 2j
    d+Z 2j
    j j
    w w
    l lr
    r lr
    i 2j
    u 2w
    U 2w
    @+U 2w
    e+I 2j
    a+U 2w
    a+I 2j
    O+I 2j
    } 2r
    }! 2r
    @ @
    @! @
    I @
    I! @
    E @
    { @
    [distance_matrix]
    /    @    j    w    lr   2j   2w   2r
    @    0.0  0.0  0.0  0.0  0.0  0.0  -.6
    j    0.0  0.0  0.0  0.0  -.1  0.0  0.0
    w    0.0  0.0  0.0  0.0  0.0  -.1  0.0
    lr   0.0  0.0  0.0  0.0  0.0  0.0  0.1
    2j   0.0  -.1  0.0  0.0  0.0  0.0  0.0
    2w   0.0  0.0  -.1  0.0  0.0  0.0  0.0
    2r   -.6  0.0  0.0  0.1  0.0  0.0  0.0
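To experiment with such a file outside SPRAAK, a small reader along the following lines may be useful. It is only a sketch and does not reflect how SPRAAK itself parses the file. It assumes a line-oriented layout (one section header, one item/feature pair or one matrix row per line), that lines starting with `%` are comments, and that the first token of the matrix header row is a corner placeholder. None of these points is stated explicitly on this page. The function name `parse_feature_matrices` and the returned dictionary layout are hypothetical.

```python
SAMPLE = """\
[features]
% further distinction between the plosives and fricatives w.r.t. voiced/unvoiced
v0 v1
[feature_map]
p v0
b v1
[distance_matrix]
/ v0 v1
v0 0.0 0.1
v1 0.1 0.0
[features]
% further distinction between trill and others
t0 t1 tx
[feature_map]
* tx
l t0
r t1
[distance_matrix]
/ t0 t1 tx
t0 0.0 0.3 0.0
t1 0.3 0.0 0.1
tx 0.0 0.1 0.0
"""


def parse_feature_matrices(text):
    """Return one dict per [features]/[feature_map]/[distance_matrix] group."""
    groups = []
    section = None
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # assumed comment syntax: '%' at the start of a line
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            if section == "features":
                # every [features] header opens a new (sub) distance matrix
                current = {"features": [], "feature_map": {},
                           "columns": None, "matrix": {}}
                groups.append(current)
            continue
        if current is None:
            raise ValueError("data found before the first [features] section")
        tokens = line.split()
        if section == "features":
            current["features"].extend(tokens)
        elif section == "feature_map":
            item, feature = tokens
            current["feature_map"][item] = feature  # '*' is stored literally
        elif section == "distance_matrix":
            if current["columns"] is None:
                current["columns"] = tokens[1:]  # skip the corner token
            else:
                row = tokens[0]
                for col, value in zip(current["columns"], tokens[1:]):
                    current["matrix"][(row, col)] = float(value)
    return groups


groups = parse_feature_matrices(SAMPLE)
print(len(groups), groups[0]["feature_map"], groups[1]["matrix"][("t0", "t1")])
```

The reader keeps the `*` entry as an ordinary key. Whether SPRAAK treats it as a default for all unlisted items is a guess and is left to the caller.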