SPRAAK
Conventions for Frames
Frames in Speech Processing

SPRAAK uses frame-synchronous signal processing to convert sampled data into a sequence of feature vectors.

The following naming conventions are used:

FSHIFT, FLENGTH, TIMEBASE are the relevant keys in the SPRAAK file headers.

FRAME Convention in SPRAAK

In SPRAAK, the absolute position of a frame is uniquely defined by the frame number (IFR) and the frame shift parameter (FSHIFT, here expressed in number of samples):

[ IFR*FSHIFT-FSHIFT/2 : IFR*FSHIFT+FSHIFT/2-1 ]

This is shown graphically in the diagram below:

      0123456789012345678901234567890123..      sample index (last digit shown only)
      ||||||||||||||||||||||||||||||||||||      sampled data

                                                FSHIFT=10, FLENGTH=20
 *****012345678901234                           frame 0 [-5:14]
           56789012345678901234                 frame 1 [5:24]
                     56789012345678901234       frame 2 [15:34]
 
                                                FSHIFT=10, FLENGTH=14
    **012345678901                              frame 0 [-2:11]
              89012345678901                    frame 1 [8:21]
                        89012345678901          frame 2 [18:31]
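
The sample ranges listed in the diagram can be reproduced with a few lines of Python. This is a minimal sketch, not SPRAAK code: the function name `frame_window` is hypothetical, and the centering rule (a window of FLENGTH samples centered on the FSHIFT samples that start at IFR*FSHIFT) is an assumption inferred from the example ranges in the diagram. It also assumes that FSHIFT-FLENGTH is even, as in both examples.

```python
def frame_window(ifr, fshift, flength):
    """Return (first, last) sample index of frame `ifr`, both inclusive.

    Assumption inferred from the diagram: the FLENGTH-sample window is
    centered on the FSHIFT samples starting at ifr * fshift.
    """
    first = ifr * fshift + (fshift - flength) // 2
    last = first + flength - 1
    return first, last


# Reproduce the two examples from the diagram (FSHIFT=10).
for flength in (20, 14):
    for ifr in range(3):
        first, last = frame_window(ifr, 10, flength)
        print(f"FSHIFT=10, FLENGTH={flength}: frame {ifr} [{first}:{last}]")
```

A negative first index, as for frame 0 in both examples, corresponds to the samples marked with `*` in the diagram, which lie before the start of the sampled data.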
Motivation

This approach has the following significant advantages:

This approach also has one minor drawback: computations involving the initial and final frames normally require data that extends beyond the file boundaries. The following solutions are offered for this missing-data problem:

Note
This rarely leads to problems as more often than not the initial and final data is (stationary) noise.
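
To make the missing-data problem concrete, the sketch below counts how many samples of a frame fall outside a file. It is only an illustration, not SPRAAK code and not one of the solutions SPRAAK offers: the function name `missing_samples` and the parameter `nsamples` (the file length in samples) are hypothetical, and the frame window is assumed to be centered as in the diagram above.

```python
def missing_samples(ifr, fshift, flength, nsamples):
    """Count the samples of frame `ifr` that lie outside [0, nsamples-1].

    Returns (missing_before, missing_after).
    Assumption: the window is centered as in the diagram, i.e. it starts at
    ifr * fshift + (fshift - flength) // 2.
    """
    first = ifr * fshift + (fshift - flength) // 2
    last = first + flength - 1
    before = min(flength, max(0, -first))                 # samples before index 0
    after = min(flength, max(0, last - (nsamples - 1)))   # samples past the end
    return before, after


# Hypothetical file of 30 samples, FSHIFT=10, FLENGTH=20.
for ifr in range(3):
    print(ifr, missing_samples(ifr, 10, 20, 30))
```

With these values, frame 0 lacks 5 samples at the start, frame 1 is complete, and frame 2 lacks 5 samples at the end.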