SPRAAK
Lattice File

A lattice file contains a set of lattices in FST format. It is an ASCII file that is in many respects similar to corpus and segmentation files. Lattices may contain information of any kind (word, phone, template, ...).

Keys
  • DATA should be set to LATTICE
  • TYPE is normally STRING
  • DIM1 shows the number of lines in the data section
  • DIM2 shows the number of lattices in the file
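As an illustration of how these keys might be read, here is a minimal Python sketch. It assumes that the header consists of whitespace-separated key/value lines, that it may start with a `.spr` line, and that a lone `#` line ends it, as in the example further down this page. The function name `parse_spr_header` is hypothetical and not part of SPRAAK.

```python
def parse_spr_header(lines):
    """Split an iterable of text lines into (header keys, remaining data lines).

    Assumption: the header ends at a line containing only '#'.
    """
    keys = {}
    it = iter(lines)
    for raw in it:
        line = raw.strip()
        if line == "#":                 # assumed header terminator
            break
        if not line or line == ".spr":  # skip blank lines and the leading marker
            continue
        parts = line.split(None, 1)     # key, then the rest of the line as value
        keys[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return keys, list(it)               # whatever follows '#' is the data section


header = [".spr", "DATA    LATTICE", "TYPE    STRING",
          "DIM1    1905", "DIM2    1 ", "#", "O -1"]
keys, data = parse_spr_header(header)
if keys.get("DATA") != "LATTICE":
    raise ValueError("not a lattice file")
n_lattices = int(keys["DIM2"])          # number of lattices announced by the header
```

All values are kept as strings; converting DIM1 and DIM2 to integers is left to the caller so that unexpected keys do not break the reader.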
Data

A file may contain multiple lattices. Each lattice starts with an information line:

File <fname> [btime] [etime] [extra_info ... ]

This initialization line is followed by lines describing the lattice in FST format. Each line takes one of the following forms:

O <node_nr> <iframe> [extra_info ... ]
A <bnode> <enode> <isymbol> [score] [osymbol] [options .. ]
C <node_nr>

in which

Nodes have the following properties:

More information on lattice files and their processing can be found in wlat_master.c.
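The record layout described above can be read with a short Python sketch such as the one below. It is an illustration under stated assumptions, not the SPRAAK reader: the names `Lattice` and `parse_lattices` are hypothetical, the `O` and `C` records are stored without interpreting them, the `File` keyword is accepted with or without a trailing colon, a missing `<iframe>` is tolerated, and the fifth field of an `A` line is assumed to be a numeric score whenever it is present.

```python
from dataclasses import dataclass, field


@dataclass
class Lattice:
    fname: str
    info: list                                   # btime, etime and extra info, as strings
    nodes: dict = field(default_factory=dict)    # node_nr -> iframe (None if absent)
    arcs: list = field(default_factory=list)     # (bnode, enode, isymbol, score, rest)
    closed: list = field(default_factory=list)   # node numbers taken from C lines


def parse_lattices(data_lines):
    """Group the O/A/C records of a data section by their File line."""
    lattices, cur = [], None
    for raw in data_lines:
        f = raw.split()
        if not f:
            continue
        tag = f[0]
        if tag in ("File", "File:"):
            cur = Lattice(fname=f[1], info=f[2:])
            lattices.append(cur)
        elif cur is None:
            raise ValueError("record before any File line: " + raw)
        elif tag == "O":
            cur.nodes[int(f[1])] = int(f[2]) if len(f) > 2 else None
        elif tag == "A":
            score = float(f[4]) if len(f) > 4 else None   # assumed numeric
            cur.arcs.append((int(f[1]), int(f[2]), f[3], score, f[5:]))
        elif tag == "C":
            cur.closed.append(int(f[1]))
        else:
            raise ValueError("unknown record type: " + raw)
    return lattices


sample = ["File: a.wv1 0 810", "O -1", "O 0 0", "O 33 71",
          "A 0 33 0 76.2348", "C 25"]
lats = parse_lattices(sample)
```

Input symbols are kept as strings because their interpretation depends on the header. Files written by tools that deviate from the layout above may need a more tolerant treatment of the arc fields.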

Example
.spr
DATA    LATTICE
TYPE    STRING
DIM1    1905
DIM2    1 
SYMBOLTYPE      LITERAL
#
File: /data/wsj/sam_16k/wsj0/si_dt_20/050/050c0301.wv1 0 810
O -1
O 0 0
O 33 71
A 0 33 0 76.2348
O 25 69
A 0 25 0 74.518
O 20 71
A 0 20 0 76.2348
O 16 72
A 0 16 0 75.8263
O 8 70
A 0 8 0 75.4815
O 6 70
A 0 6 0 75.4815
...
A 25 28 13 5.7582
O 26 77
A 25 26 13 5.04793
C 25
O 4 72
A 3 4 5 -7.97319
C 3
O 2 72
A 1 2 37 -13.2876
...
Tools

For lattice operations, see the Lattice processing modules.

SPRAAK lattices can be processed with external FST tools (e.g. the MIT toolbox); for conversion instructions see ...

Extensions

Due to demands from ongoing research projects, both the lattice tools and the corresponding file formats have been used in a 'loose manner'; i.e. the above standards are not strictly followed in all tools, especially w.r.t. the A(rc) descriptions:

Bugs and Limitations

Standardization is lacking in many ways: