Speech interface using sockets.

Detailed Description

spr_spraak_socket [-V](flag: verbose) [-NM](flag: no_monitor) [-F](flag: read_from_file)
    <-c configuration file> [-s communication socket](localhost:20100) [-i input pipe]
    [-o output pipe] [-e error stream]

Parameters

-V (flag: verbose)	Be verbose, i.e. echo all output to stdout as well.
-NM (flag: no_monitor)	Do not monitor the recognition progress, only output the final result.
-F (flag: read_from_file)	Read audio input from files instead of from the audio device.
-c <configuration file>	File containing the configuration options.
-s <communication socket>	The TCP communication socket, specified as <host>:<port_nr> (client mode) or localhost:<port_nr> (server mode).
-i <input pipe>	Use a named pipe (or file) instead of a socket for the input.
-o <output pipe>	Use a named pipe (or file) instead of a socket for the output.
-e <error stream>	File/stream to write error messages and debug information to.
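
As an illustration of the options above, the following Python sketch assembles an argument list for spr_spraak_socket. It assumes that each option value is passed as a separate argument after its switch, as the synopsis suggests; the helper name build_argv and the file name demo.ini are hypothetical and not part of the program.

```python
def build_argv(config, socket_addr="localhost:20100", verbose=False,
               no_monitor=False, read_from_file=False,
               input_pipe=None, output_pipe=None, error_stream=None):
    """Assemble an argument list for spr_spraak_socket (illustrative only)."""
    # The socket is documented as <host>:<port_nr>; do a basic sanity check.
    host, _, port = socket_addr.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError("socket must look like <host>:<port_nr>")

    argv = ["spr_spraak_socket"]
    if verbose:
        argv.append("-V")
    if no_monitor:
        argv.append("-NM")
    if read_from_file:
        argv.append("-F")
    # Assumption: option values follow their switch as separate arguments.
    argv += ["-c", config, "-s", socket_addr]
    if input_pipe is not None:
        argv += ["-i", input_pipe]
    if output_pipe is not None:
        argv += ["-o", output_pipe]
    if error_stream is not None:
        argv += ["-e", error_stream]
    return argv
```

The resulting list could be handed to subprocess.Popen; whether the program accepts the options in this order has not been verified here.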

A configurable speech interface using sockets for communication with other processes.

During recognition, immediate feedback is given as follows:

The monitor output starts with an opening square bracket '[' on a line by itself and ends with a closing square bracket ']' on a line by itself. Between the brackets, the recognition progress is reported whenever the recognizer becomes certain about a new word in the single best output string or when a timer expires (the timer value can be set in the .ini file).

The lines reporting on the recognition progress have the following format:

.: no audio input, waiting
+ <word> <t0> <duration>: The recognizer is now certain of an additional word <word> in the single best output string. The word starts at <t0> (integer, milliseconds) and has a duration of <duration> (integer, milliseconds).
| <word> ...: The remaining words on the current best path.
% <effort>: The average search effort over the newly added word ('+') and hypothesized words ('|'). The effort is calculated as the average number of active tokens divided by the maximum number of tokens and is a float in the range ]0,1.epsilon].
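
A client reading the monitor output could classify each line by its leading character. The Python sketch below is a minimal illustration of the line formats listed above, assuming fields are separated by whitespace and words contain no spaces; the function name parse_monitor_line and the returned tuple layout are hypothetical.

```python
def parse_monitor_line(line):
    """Classify one line of monitor output; returns (kind, payload)."""
    line = line.rstrip("\r\n")
    if line == "[":
        return ("begin", None)        # start of the monitor block
    if line == "]":
        return ("end", None)          # end of the monitor block
    if line == ".":
        return ("waiting", None)      # no audio input, waiting
    tag, _, rest = line.partition(" ")
    if tag == "+":
        # A word the recognizer is certain of: word, start, duration (ms).
        word, t0, duration = rest.split()
        return ("word", {"word": word, "t0_ms": int(t0),
                         "duration_ms": int(duration)})
    if tag == "|":
        # Remaining (hypothesized) words on the current best path.
        return ("pending", rest.split())
    if tag == "%":
        # Average search effort as a float.
        return ("effort", float(rest))
    return ("other", line)            # anything not described above
```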

The final result is reported as follows:

= <word> ...: A single line containing all words.
? <word> <t0>:<duration> bw=<effort> ac=<am_score> lf=<lm_score_fwd> lb=<lm_score_bw>: One line per word. The word starts at <t0> (integer, milliseconds) and has a duration of <duration> (integer, milliseconds). While decoding the word, the average search effort (number of tokens divided by the maximum number of tokens) equaled <effort>. The acoustic score was <am_score>; this score is normalized when the default acoustic modelling is used, and such normalized scores are better indicators of word confidence than unnormalized scores. The LM score was <lm_score_fwd>, the normal forward LM score, i.e. P(<word>|<previous_words>). The backward LM score was <lm_score_bw>; evaluating this score is not always relevant and takes time, so an extra flag must be set in the .ini file if this value is to be calculated (see Higher level API configuration file and table format for more details).
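
The per-word result line can be split into its timing and score fields. The Python sketch below is one possible way to do this, assuming whitespace-separated fields and numeric score values. It reads the key=value fields generically because this page does not state whether lb= is omitted when the backward LM score is not calculated. The function name parse_word_line is hypothetical.

```python
def parse_word_line(line):
    """Parse a '? <word> <t0>:<duration> key=value ...' result line."""
    fields = line.split()
    if len(fields) < 3 or fields[0] != "?":
        raise ValueError("not a per-word result line: %r" % line)
    word = fields[1]
    # Timing is given as <t0>:<duration>, both integers in milliseconds.
    t0, sep, duration = fields[2].partition(":")
    if not sep:
        raise ValueError("missing <t0>:<duration> field: %r" % line)
    # Remaining fields are key=value pairs (bw, ac, lf, lb).
    scores = {}
    for item in fields[3:]:
        key, sep, value = item.partition("=")
        if sep:
            scores[key] = float(value)
    return {"word": word, "t0_ms": int(t0),
            "duration_ms": int(duration), "scores": scores}
```

For a line such as "? hello 120:340 bw=0.25 ac=-1.5 lf=-2.0 lb=-3.0" (values invented for illustration), the word's end time follows as t0_ms + duration_ms.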

Errors are reported as follows:

! <msg>

Known commands that can be sent to spr_spraak_socket via the socket (or via pipes/files):

[commands]
`sync <sync_nr>`
Set a synchronisation mark in the data stream: copy the command to the output file/pipe.
`change lex <lexicon>`
Change the lexicon.
`change lm <lang_mod>`
Change the language model.
`change both <lex> <lm>`
Change the lexicon and language model.
`file <fname>`
Process the audio in the given file (raw format).
`sleep`
Flush and ignore all audio input until a wakeup command is received.
`flush`
Flush all audio data still waiting to be processed.
`wakeup`
Start listening again after a sleep command.
`timeout <sec> [quick](0)`
Time out the recognition process after the given number of seconds; see also SPRaak_abort().
`echo [str] ...`
Echo something back.
`help [str] ...`
Give help (on something).
`quit`
Quit.
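
The commands above could be sent from a client along the following lines. This Python sketch assumes that each command is a single line of ASCII text terminated by a newline, which this page does not state explicitly. The helper names format_command, send_command and demo are hypothetical; the host and port in demo are the default from the synopsis.

```python
import socket

def format_command(name, *args):
    """Encode one command as a single text line (assumed wire format)."""
    parts = [str(name)] + [str(a) for a in args]
    for part in parts:
        if "\n" in part or "\r" in part:
            raise ValueError("command parts must not contain line breaks")
    # Assumption: space-separated fields, newline-terminated, ASCII.
    return (" ".join(parts) + "\n").encode("ascii")

def send_command(sock, name, *args):
    """Send one command over an already connected socket."""
    sock.sendall(format_command(name, *args))

def demo(host="localhost", port=20100):
    """Connect, set a sync mark, switch the language model, then quit."""
    with socket.create_connection((host, port)) as sock:
        send_command(sock, "sync", 1)
        send_command(sock, "change", "lm", "my_lm")  # 'my_lm' is a placeholder
        send_command(sock, "quit")
```

Calling demo() requires a running spr_spraak_socket instance that accepts connections on the given port.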

Author: Kris Demuynck

Date

4/2002 - KD: Created based on the (simple) interface layer built for the Mythe project.
4/2002 - KD: Added code for direct recording from the Linux audio device.
6/2002 - KD: Added code for direct recording from the Windows audio device.