SPRAAK
spr_space_cake.c File Reference

Continue Automatische Klank-Evaluatie (Dutch: continuous automatic sound evaluation).

Detailed Description

spr_space_cake [-help](flag: ) [-V](flag: ) <-c configuration file> <-p parser>
    [-s communication socket](localhost:20100) [-e error stream] [-a audio dump]
Parameters
-help flag
Give help on the commands that can be sent over the command socket.
-V flag
Be verbose, i.e. echo all output to stdout as well.
-c configuration file
File containing the configuration options.
-p parser
Program that parses the description of an exercise and makes the necessary lexicon and LM files.
-s communication socket
The TCP communication socket, specified as <host>:<port_nr> (client mode) or localhost:<port_nr> (server mode).
-e error stream
File/stream to write error messages and debug information to.
-a audio dump
File/stream to dump the audio to (raw format).
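
As an illustration of the synopsis above, a launcher script could assemble the argument list as follows. This is a minimal sketch: the helper name build_command and all file names (space.ini, ./parse_exercise, session.raw) are hypothetical, and only the options listed above are used.

```python
import shlex
from typing import List, Optional


def build_command(config: str, parser: str,
                  socket_addr: str = "localhost:20100",
                  error_stream: Optional[str] = None,
                  audio_dump: Optional[str] = None,
                  verbose: bool = False) -> List[str]:
    """Assemble an argument list for spr_space_cake (hypothetical helper)."""
    # -c and -p are mandatory in the synopsis; -s defaults to localhost:20100.
    argv = ["spr_space_cake", "-c", config, "-p", parser, "-s", socket_addr]
    if verbose:
        argv.append("-V")             # echo all output to stdout as well
    if error_stream is not None:
        argv += ["-e", error_stream]  # error messages and debug information
    if audio_dump is not None:
        argv += ["-a", audio_dump]    # raw audio dump
    return argv


# File names below are placeholders, not files shipped with SPRAAK.
argv = build_command("space.ini", "./parse_exercise",
                     verbose=True, audio_dump="session.raw")
print(shlex.join(argv))
```

The resulting list can be handed to subprocess.Popen unchanged; keeping the arguments as a list avoids any shell quoting problems.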

Continue Automatische Klank-Evaluatie. A configurable speech interface using sockets for communication with other processes. Designed for the SPACE project.

The monitor output starts with an opening square bracket '[' on a line of its own and ends with a closing square bracket ']' on a line of its own. Between the brackets, the recognition progress is reported whenever the recognizer is certain about a new word in the single best output string or when a timer expires (the timer value can be set in the .ini file).
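
On the client side, the bracket framing can be handled by collecting the lines between a lone '[' and a lone ']'. The following is a minimal sketch, assuming the monitor output arrives as newline-terminated text lines; the function name monitor_blocks and the sample input are hypothetical.

```python
from typing import Iterable, Iterator, List


def monitor_blocks(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield the report lines found between a lone '[' and a lone ']'."""
    block = None  # None means we are outside a bracketed block
    for raw in lines:
        line = raw.strip()
        if line == "[":
            block = []              # opening bracket: start a new block
        elif line == "]":
            if block is not None:
                yield block         # closing bracket: hand over the block
            block = None
        elif block is not None and line:
            block.append(line)      # progress line inside the block


# Made-up input for illustration only; real output will differ.
sample_stream = ["[", ".", "% 0.5", "]", "ignored", "[", "]"]
blocks = list(monitor_blocks(sample_stream))
print(blocks)
```

Lines outside a bracket pair are skipped, so other traffic on the same stream does not end up in a block.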

The lines reporting on the recognition progress have the following format:

.
no audio input, waiting
+ <word> <t0> <duration> <score> <effort> <lm_prob> <lm_transition>
There is an extra word <word> in the single best output string of which the recognizer is certain. The word starts at <t0> (integer, milliseconds) and has a duration of <duration> (integer, milliseconds). The corresponding acoustic score is <score> (float), and the average search effort spent when decoding this word was <effort> (float in the range ]0,1.epsilon]). The <lm_transition> fields explain the language model transition taken when adding this word.
| <word> <t0> <duration> <score> <effort> <lm_prob> <lm_transition>
The remaining words on the current best path. The format is identical to the '+' output.
% <effort>
The average search effort over the newly added word ('+') and hypothesized words ('|').
@ <word> <t0> <duration> <score> <lm_state> <lm_depth> <next_lm_state> <next_lm_depth> <lm_prob> <lm_transition>
Details on the whereabouts of the current best token. If the word is not known, <word> will equal the string "<PARTIAL>".

Note that the last output per update is a '@' line.
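
The line formats above could be parsed on the client side roughly as follows. This is a sketch under stated assumptions: fields are separated by whitespace, words contain no spaces, the <t0>, <duration> and <score> fields of a '@' line have the same types as on a '+' line, and the <lm_...> fields are kept as text because their types are not documented here. The function names and all sample values are made up for illustration.

```python
from typing import Dict, List, Optional


def parse_progress_line(line: str) -> Optional[Dict[str, object]]:
    """Turn one progress line into a dict; return None for unknown lines."""
    fields = line.split()
    if not fields:
        return None
    tag, rest = fields[0], fields[1:]
    if tag == ".":
        return {"type": "waiting"}            # no audio input, waiting
    if tag == "%" and len(rest) >= 1:
        return {"type": "effort", "effort": float(rest[0])}
    if tag in ("+", "|") and len(rest) >= 6:
        return {
            "type": "certain" if tag == "+" else "pending",
            "word": rest[0],
            "t0_ms": int(rest[1]),
            "duration_ms": int(rest[2]),
            "score": float(rest[3]),
            "effort": float(rest[4]),
            "lm_prob": rest[5],               # kept as text (assumption)
            "lm_transition": rest[6:],        # may span several fields
        }
    if tag == "@" and len(rest) >= 9:
        return {
            "type": "best_token",
            "word": rest[0],                  # "<PARTIAL>" if not known
            "t0_ms": int(rest[1]),
            "duration_ms": int(rest[2]),
            "score": float(rest[3]),
            "lm_state": rest[4],
            "lm_depth": rest[5],
            "next_lm_state": rest[6],
            "next_lm_depth": rest[7],
            "lm_prob": rest[8],
            "lm_transition": rest[9:],
        }
    return None


def group_updates(lines: List[str]) -> List[List[Dict[str, object]]]:
    """Group parsed lines into updates; a '@' line closes each update."""
    updates, current = [], []
    for line in lines:
        record = parse_progress_line(line)
        if record is None:
            continue
        current.append(record)
        if record["type"] == "best_token":
            updates.append(current)
            current = []
    if current:
        updates.append(current)               # trailing, unfinished update
    return updates


# Invented sample lines; real scores and LM fields will look different.
sample_lines = [
    "+ hello 120 340 -1523.5 0.42 -2.1 w1",
    "| world 460 300 -1490.0 0.40 -1.7 w2",
    "% 0.41",
    "@ <PARTIAL> 760 80 -310.2 s12 2 s13 3 -0.5 w3",
]
updates = group_updates(sample_lines)
print(len(updates), updates[0][0]["word"], updates[0][-1]["type"])
```

Grouping on the '@' line relies on the note above that it is the last output of every update.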

On top of the standard [sections]->(key,value) tuples (see Higher level API configuration file and table format), the following extra tuples are used by spr_space_cake:

The exercise info has to consist of the following:

Known commands that can be sent to spr_space_cake via the socket:


[Commands]
select <name> ...
Select a given sentence from an exercise and start the recognition.
synchronize <id> ...
Send back a synchronization marker (exact copy of the input, with leading/trailing spaces removed).
audio.flush <iframe>
Discard all audio up to (but not including) the specified frame number.
audio.discard <on/off>
Discard audio between commands or not.
audio.ping
Report how many audio frames have been processed (or discarded) so far.
audio.open <host:port>
Open the audio socket by connecting to the given port on the given host computer.
audio.close
Close the audio socket.
audio.format
Query the desired audio format. The following line is sent back:
audio.format <sample_freq> <bits_per_sample> <nchan> <frame_size_in_bytes> <frame_shift_in_seconds>
exercise.open <end_marker> ...
Open an exercise. Specifications follow in subsequent lines. The end of the specifications is indicated by a line containing only the <end_marker>.
exercise.close
Close the exercise.
data.put <nbytes> <fname> ...
Move data from client to server. Paths are relative w.r.t. a configurable start path. The data stream must follow after the new-line closing this command and must be <nbytes> long.
data.get <fname> ...
Move data from server to client. Paths are relative w.r.t. a configurable start path. First the size of the data is sent (ASCII, single line); the data follows directly thereafter.
quit
Quit.
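
The message framing of some of the commands above can be sketched from a client's point of view as follows. This is a non-authoritative sketch: it assumes commands are ASCII text lines terminated by a single new-line character, the default end marker END_OF_EXERCISE and all helper names are hypothetical, and the optional trailing arguments shown as "..." in the command list are left out.

```python
from typing import List, NamedTuple


class AudioFormat(NamedTuple):
    sample_freq: float
    bits_per_sample: int
    nchan: int
    frame_size_bytes: int
    frame_shift_s: float


def parse_audio_format(line: str) -> AudioFormat:
    """Parse the reply line sent back for the audio.format command."""
    fields = line.split()
    if len(fields) != 6 or fields[0] != "audio.format":
        raise ValueError("not an audio.format reply: %r" % line)
    return AudioFormat(float(fields[1]), int(fields[2]), int(fields[3]),
                       int(fields[4]), float(fields[5]))


def exercise_open(spec_lines: List[str],
                  end_marker: str = "END_OF_EXERCISE") -> bytes:
    """Frame an exercise: command line, specification lines, end marker."""
    if any(line.strip() == end_marker for line in spec_lines):
        raise ValueError("end marker occurs inside the specification")
    lines = ["exercise.open " + end_marker, *spec_lines, end_marker]
    return ("\n".join(lines) + "\n").encode("ascii")


def data_put(fname: str, payload: bytes) -> bytes:
    """Frame data.put: the payload follows the new-line closing the command."""
    header = "data.put %d %s\n" % (len(payload), fname)
    return header.encode("ascii") + payload


def parse_data_get_reply(buf: bytes) -> bytes:
    """Split a data.get reply into its size line and return the payload."""
    size_line, sep, rest = buf.partition(b"\n")
    if not sep:
        raise ValueError("size line incomplete")
    nbytes = int(size_line.decode("ascii").strip())
    if len(rest) < nbytes:
        raise ValueError("payload incomplete")
    return rest[:nbytes]


# Made-up values for illustration only.
fmt = parse_audio_format("audio.format 16000 16 1 320 0.01")
put_msg = data_put("ex1/words.txt", b"abc")
get_payload = parse_data_get_reply(b"5\nhello")
print(fmt.sample_freq, put_msg, get_payload)
```

The byte strings returned by exercise_open and data_put could be written to the command socket with socket.sendall; a real client would also have to keep reading until the full data.get payload has arrived.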

Date
April 2005
Author
Kris Demuynck
Revision History:
26/04/2005 - KD
Created based on the spraak_socket example program.
15/02/2011 - KD
Richer output