SPRAAK
Distance matrices

If desired, the distance matrix can be specified in terms of features, which allows the problem to be decomposed.

A simple distance matrix has the following format:

[distance]
/       /       it1     it2     it3     ...
/       C00     C01     C02     C03     ...
it1     C10     C11     C12     C13     ...
it2     C20     C21     C22     C23     ...
it3     C30     C31     C32     C33     ...
...     ...     ...     ...     ...     ...

The elements in the example above have the following function:

For a feature-based matrix, one first has to list the features and the mapping between items and features:

[features]
% some comment
 feat1
 feat2
 ...
[feature_map]
 it1 featX
 it2 featY
 ...
[distance]
 ...

The elements in the example above have the following function:

Multiple (sub) distance matrices may be defined. The cost assigned to an operation (sub/ins/del) is the sum of the costs defined in the various (sub) distance matrices. This allows for a simple feature-based decomposition of the costs.
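
The summation rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SPRAAK's implementation: the function name `substitution_cost`, the dictionary layout, and the rule that an item missing from a feature map adds nothing for that sub-matrix are all hypothetical. The numbers are copied from the place and voicing sub-matrices of the extensive example on this page; both are symmetric, so the sketch does not need to know whether rows or columns denote the source item.

```python
# Hypothetical in-memory form of two sub-matrices: each pairs a
# feature map (item -> feature) with a feature-to-feature cost table.
PLACE = (
    {"p": "bl", "b": "bl", "t": "al", "d": "al"},
    {("bl", "bl"): 0.0, ("bl", "al"): 0.4,
     ("al", "bl"): 0.4, ("al", "al"): 0.0},
)
VOICING = (
    {"p": "v0", "b": "v1", "t": "v0", "d": "v1"},
    {("v0", "v0"): 0.0, ("v0", "v1"): 0.1,
     ("v1", "v0"): 0.1, ("v1", "v1"): 0.0},
)


def substitution_cost(a, b, sub_matrices):
    """Sum the cost of substituting item a by item b over all sub-matrices."""
    total = 0.0
    for feature_map, costs in sub_matrices:
        # Assumption (not confirmed by the source): an item that is absent
        # from a feature map contributes no cost for that sub-matrix.
        if a not in feature_map or b not in feature_map:
            continue
        total += costs[(feature_map[a], feature_map[b])]
    return total


if __name__ == "__main__":
    for a, b in [("p", "b"), ("p", "t"), ("p", "d")]:
        print(a, b, round(substitution_cost(a, b, [PLACE, VOICING]), 3))
```

With these two sub-matrices alone, `p`/`b` differ only in voicing, `p`/`t` only in place, and `p`/`d` in both, so the third cost is the sum of the first two.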

An extensive example is given below; all costs were assigned ad hoc:

% phoneme properties
%  =    can be considered to consist of the two given parts
% X     specials: silence / speaker noises
% C     consonant
%  p    plosive (occlusive)
%  f    fricative
%  n    nasal
%  l    lateral+trill
%  g    glide (approximant)
%  bl   bilabial
%  ld   labiodental
%  al   alveolar
%  rf   retroflex
%  pt   palatal
%  vl   velar
%  gt   glottal
%  v0   unvoiced
%  v1   voiced
% V     vowel
%  o1   close
%  o2   close-mid
%  o3   open-mid
%  o4   open
%  p1   front
%  p2   front-central
%  p3   central
%  p4   back
%  s0   long
%  s1   short
%  !    stressed
%
% Phoneme table
%  #    X       (f-v0)
%  </s> =       #
%  p    C-p-bl-v0
%  b    C-p-bl-v1
%  t    C-p-al-v0
%  d    C-p-al-v1
%  T    C-p-rf-v0
%  D    C-p-rf-v1
%  k    C-p-vl-v0
%  g    C-p-vl-v1
%  f    C-f-ld-v0
%  v    C-f-ld-v1
%  s    C-f-al-v0
%  z    C-f-al-v1
%  S    C-f-pt-v0       % S,Z were made palatal instead of postalveolar (closer to the /j/ + postalveolar was not in use)
%  Z    C-f-pt-v1
%  h    C-f-gt-v0
%  m    C-n-bl-v1
%  n    C-n-al-v1
%  N    C-n-vl-v1
%  l    C-l-al-v1-t0    % t0/t1 == trill
%  r    C-g-al-v1-t1    % classification based on the Chinese pronunciation of the /r/ :-)
%  j    C-g-pt-v1
%  w    C-g-bl-v1
%  t+S  =       t       S
%  d+Z  =       d       Z
% Note: last column indicates the linking phoneme the vowel induces (j/w/?)
%  i    V-o1-p1-s0-pt   % worry-j-above
%  E    V-o3-p1-s1-pt   % 
%  {    V-o4-p1-s1-pt   %
%  I    V-o1-p2-s1-pt   % 
%  I!   V-o1-p2-s1-pt!  %
%  @    V-o2-p3-s1-gt   % extra-?-above         
%  @!   V-o2-p3-s1-gt!  %
%  u    V-o1-p4-s0-bl   % flew-w-above
%  U    V-o1-p4-s1-bl   %
%  @+U  V-o2-p4-s0-bl   % yellow-w-above
%  O    V-o3-p4-s1-bl   % outlaw-w-above
%  A    V-o4-p4-s1-gt   % panama-?-above
%  }    =       @       r
%  }!   =       @!      r
%  e+I  =       E       j
%  a+U  =       A       w
%  a+I  =       a       j
%  O+I  =       O       j

[features]
% major distinction between all 'phonemes'
  #
  V
  Cp
  CP
  Cf
  Cn
  Cl
  Cg
[feature_map]
  #     #
  </s>  #
  p     Cp
  b     Cp
  t     Cp
  d     Cp
  T     CP
  D     CP
  k     Cp
  g     Cp
  f     Cf
  v     Cf
  s     Cf
  z     Cf
  S     Cf
  Z     Cf
  h     Cf
  m     Cn
  n     Cn
  N     Cn
  l     Cl
  r     Cl
  j     Cg
  w     Cg
  t+S   CP
  d+Z   CP
  i     V
  E     V
  {     V
  I     V
  I!    V
  @     V
  @!    V
  u     V
  U     V
  @+U   V
  O     V
  A     V
  }     Cl
  }!    Cl
  e+I   V
  a+U   V
  a+I   V
  O+I   V
[distance_matrix]
  /    #   V  Cp  CP  Cf  Cn  Cl  Cg   /
  #  0.0 1.0 0.5 0.5 0.5 1.0 1.0 1.0 0.5
  V  1.0 0.0 1.0 1.0 1.0 0.5 0.7 0.5 1.0
  Cp 0.5 1.0 0.0 0.1 0.3 1.0 1.0 1.0 0.5
  CP 0.5 1.0 0.1 0.0 0.2 1.0 1.0 1.0 0.5
  Cf 0.5 1.0 0.3 0.2 0.0 1.0 1.0 1.0 0.5
  Cn 1.0 0.5 1.0 1.0 1.0 0.0 0.7 0.5 1.0
  Cl 1.0 0.7 1.0 1.0 1.0 0.7 0.0 0.7 1.0
  Cg 1.0 0.5 1.0 1.0 1.0 0.5 0.7 0.0 0.7
  /  0.5 1.0 0.5 0.5 0.5 1.0 1.0 0.7 0.0

[features]
% further distinction between the consonants w.r.t. place of articulation
  bl
  ld
  al
  rf
  pt
  vl
  gt
[feature_map]
  p     bl
  b     bl
  t     al
  d     al
  T     rf
  D     rf
  k     vl
  g     vl
  f     ld
  v     ld
  s     al
  z     al
  S     pt
  Z     pt
  h     gt
  m     bl
  n     al
  N     vl
  l     al
  r     al
  j     pt
  w     bl
  t+S   al
  d+Z   al
  }     al
  }!    al
[distance_matrix]
  /   bl  ld  al  rf  pt  vl  gt
  bl 0.0 0.2 0.4 0.4 0.4 0.4 0.4
  ld 0.2 0.0 0.2 0.4 0.4 0.4 0.4
  al 0.4 0.2 0.0 0.2 0.2 0.4 0.4
  rf 0.4 0.4 0.2 0.0 0.2 0.2 0.4
  pt 0.4 0.4 0.2 0.2 0.0 0.2 0.4
  vl 0.4 0.4 0.4 0.2 0.2 0.0 0.2
  gt 0.4 0.4 0.4 0.4 0.4 0.2 0.0

[features]
% further distinction between the plosives and fricatives w.r.t. voiced/unvoiced
  v0
  v1
[feature_map]
  p     v0
  b     v1
  t     v0
  d     v1
  T     v0
  D     v1
  k     v0
  g     v1
  f     v0
  v     v1
  s     v0
  z     v1
  S     v0
  Z     v1
  h     v0
  t+S   v0
  d+Z   v1
  }     v1
  }!    v1
[distance_matrix]
  /   v0  v1
  v0 0.0 0.1
  v1 0.1 0.0

[features]
% further distinction between trill and others
  t0
  t1
  tx
[feature_map]
  *     tx
  l     t0
  r     t1
  }     t1
  }!    t1
[distance_matrix]
  /   t0  t1  tx
  t0 0.0 0.3 0.0
  t1 0.3 0.0 0.1
  tx 0.0 0.1 0.0

[features]
% further distinction between the vowels w.r.t. openness
  o1
  o2
  o3
  o4
[feature_map]
  i     o1
  E     o3
  {     o4
  I     o1
  I!    o1
  @     o2
  @!    o2
  u     o1
  U     o1
  @+U   o2
  O     o3
  A     o4
  }     o2
  }!    o2
  e+I   o3
  a+U   o4
  a+I   o4
  O+I   o3
[distance_matrix]
  /   o1  o2  o3  o4
  o1 0.0 0.1 0.2 0.2
  o2 0.1 0.0 0.1 0.2
  o3 0.2 0.1 0.0 0.1
  o4 0.2 0.2 0.1 0.0

[features]
% further distinction between the vowels w.r.t. place (front/back)
  p1
  p2
  p3
  p4
[feature_map]
  i     p1
  E     p1
  {     p1
  I     p2
  I!    p2
  @     p3
  @!    p3
  u     p4
  U     p4
  @+U   p4
  O     p4
  A     p4
  }     p3
  }!    p3
  e+I   p1
  a+U   p4
  a+I   p1
  O+I   p4
[distance_matrix]
  /   p1  p2  p3  p4
  p1 0.0 0.1 0.2 0.2
  p2 0.1 0.0 0.1 0.2
  p3 0.2 0.1 0.0 0.1
  p4 0.2 0.2 0.1 0.0

[features]
% further distinction between the vowels w.r.t. length
  s0
  s1
[feature_map]
  i     s0
  E     s1
  {     s1
  I     s1
  I!    s1
  @     s1
  @!    s1
  u     s0
  U     s1
  @+U   s0
  O     s1
  A     s1
  }     s1
  }!    s1
  e+I   s0
  a+U   s0
  a+I   s0
  O+I   s0
[distance_matrix]
  /   s0  s1
  s0 0.0 0.1
  s1 0.1 0.0

[features]
% further distinction between the vowels w.r.t. stress
  s0
  s1
[feature_map]
  i     s1
  E     s1
  {     s1
  I     s0
  I!    s1
  @     s0
  @!    s1
  u     s1
  U     s1
  @+U   s1
  O     s1
  A     s1
  }     s0
  }!    s1
  e+I   s1
  a+U   s1
  a+I   s1
  O+I   s1
[distance_matrix]
  /   s0  s1
  s0 0.0 0.1
  s1 0.1 0.0

[features]
% further distinction between the vowels w.r.t. the linking phoneme they induce (j/?/w)
  pt
  gt
  bl
[feature_map]
  i     pt
  E     pt
  {     pt
  I     pt
  I!    pt
  @     gt
  @!    gt
  u     bl
  U     bl
  @+U   bl
  O     bl
  A     gt
  }     gt
  }!    gt
  e+I   pt
  a+U   bl
  a+I   pt
  O+I   pt
[distance_matrix]
  /   pt  gt  bl
  pt 0.0 0.2 0.2
  gt 0.2 0.0 0.2
  bl 0.2 0.2 0.0

[features]
% difference between diphthongs/monophthongs/syllabics
  1
  ^
  ,
[feature_map]
  i     1
  E     1
  {     1
  I     1
  I!    1
  @     1
  @!    1
  u     1
  U     1
  @+U   1
  O     1
  A     1
  }     ,
  }!    ,
  e+I   ^
  a+U   ^
  a+I   ^
  O+I   ^
[distance_matrix]
  /    1   ^   ,
  1  0.0 0.1 0.1
  ^  0.1 0.0 0.3
  ,  0.1 0.3 0.0

[features]
% correction for the diphthongs/syllabics
  2j
  2w
  2r
  n
  j
  w
  lr
  @
[feature_map]
  S     2j
  Z     2j
  t+S   2j
  d+Z   2j
  j     j
  w     w
  l     lr
  r     lr
  i     2j
  u     2w
  U     2w
  @+U   2w
  e+I   2j
  a+U   2w
  a+I   2j
  O+I   2j
  }     2r
  }!    2r
  @     @
  @!    @
  I     @
  I!    @
  E     @
  {     @
[distance_matrix]
  /    @   j   w  lr  2j  2w  2r
  @  0.0 0.0 0.0 0.0 0.0 0.0 -.6
  j  0.0 0.0 0.0 0.0 -.1 0.0 0.0
  w  0.0 0.0 0.0 0.0 0.0 -.1 0.0
  lr 0.0 0.0 0.0 0.0 0.0 0.0 0.1
  2j 0.0 -.1 0.0 0.0 0.0 0.0 0.0
  2w 0.0 0.0 -.1 0.0 0.0 0.0 0.0
  2r -.6 0.0 0.0 0.1 0.0 0.0 0.0
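
Every block in the example follows the same three-section layout, so a reader for it can be short. The Python sketch below is an illustration under stated assumptions rather than SPRAAK's own reader: the function name `parse_blocks` and the returned dictionary layout are hypothetical, `%` is assumed to start a comment that runs to the end of the line, and both `[distance]` and `[distance_matrix]` are accepted because both spellings appear on this page. A wildcard entry such as `*` would be stored as an ordinary key; how SPRAAK resolves it is not covered here. The sample input is a reduced excerpt of the voicing block and of the last block above.

```python
SAMPLE = """\
[features]
% voiced/unvoiced
  v0
  v1
[feature_map]
  p     v0
  b     v1
[distance_matrix]
  /   v0  v1
  v0 0.0 0.1
  v1 0.1 0.0

[features]
  2r
  @
[feature_map]
  }     2r
  @     @
[distance_matrix]
  /    @  2r
  @  0.0 -.6
  2r -.6 0.0
"""


def parse_blocks(text):
    """Return one dict per [features] block found in text."""
    blocks, section, cols = [], None, None
    for raw in text.splitlines():
        # Assumption: '%' starts a comment that runs to the end of the line.
        line = raw.split("%", 1)[0].strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            section, cols = line[1:-1], None
            if section == "features":  # a new sub-matrix begins here
                blocks.append({"features": [], "feature_map": {}, "distance": {}})
            continue
        if not blocks:
            continue  # this sketch ignores anything before the first [features]
        block, fields = blocks[-1], line.split()
        if section == "features":
            block["features"].append(fields[0])
        elif section == "feature_map":
            block["feature_map"][fields[0]] = fields[1]
        elif section in ("distance", "distance_matrix"):
            if cols is None:
                cols = fields[1:]  # header row: leading '/' then column labels
            else:
                for col, value in zip(cols, fields[1:]):
                    block["distance"][(fields[0], col)] = float(value)
    return blocks


if __name__ == "__main__":
    for block in parse_blocks(SAMPLE):
        print(block["features"], block["feature_map"], block["distance"])
```

Keying the distances by `(row, column)` label pairs keeps the sketch independent of the order in which features are listed, and it leaves negative corrections such as `-.6` as plain floats.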