Speech interface using sockets.
spr_spraak_socket [-V](flag: verbose) [-NM](flag: no_monitor) [-F](flag: read_from_file)
<-c configuration file> [-s communication socket](localhost:20100) [-i input pipe]
[-o output pipe] [-e error stream]
- Parameters
-V | flag: verbose
Be verbose, i.e. echo all output to stdout as well. |
-NM | flag: no_monitor
Do not monitor the recognition progress, only output the final result. |
-F | flag: read_from_file
Read audio input from files instead of from the audio device. |
-c | configuration file
File containing the configuration options. |
-s | communication socket
The TCP communication socket, specified as <host>:<port_nr> (client mode) or localhost:<port_nr> (server mode). |
-i | input pipe
Use a named pipe (or file) instead of a socket for the input. |
-o | output pipe
Use a named pipe (or file) instead of a socket for the output. |
-e | error stream
File/stream to write error messages and debug information to. |
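As an illustration of how the options above combine, the following Python sketch assembles an argument list for spr_spraak_socket. Only the option letters and the default socket localhost:20100 come from this page; the helper name build_argv, its keyword names and the file name recog.ini are hypothetical, and the sketch assumes the executable can be found on the PATH.

```python
def build_argv(config, sock="localhost:20100", verbose=False,
               monitor=True, from_file=False, error_stream=None):
    """Assemble a spr_spraak_socket command line (hypothetical helper)."""
    # The -s value must have the form <host>:<port_nr>.
    host, sep, port = sock.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("socket must look like <host>:<port_nr>: %r" % sock)

    argv = ["spr_spraak_socket"]
    if verbose:
        argv.append("-V")    # echo all output to stdout as well
    if not monitor:
        argv.append("-NM")   # only output the final result
    if from_file:
        argv.append("-F")    # read audio from files, not the audio device
    argv += ["-c", config, "-s", sock]
    if error_stream is not None:
        argv += ["-e", error_stream]
    return argv


# Example: 'recog.ini' is a made-up configuration file name.
# build_argv("recog.ini", monitor=False)
# The resulting list can be handed to subprocess.Popen to start the process.
```

Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.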
A configurable speech interface that uses sockets to communicate with other processes.
During recognition, immediate feedback is given as follows:
The monitor output starts with an opening square bracket '[' on a line of its own and ends with a closing square bracket ']' on a line of its own. Between the brackets, the recognition progress is reported whenever the recognizer is certain about a new word in the single best output string or when a timer expires (the timer value can be set in the .ini file).
The lines reporting on the recognition progress have the following format:
- .
- no audio input, waiting
- + <word> <t0> <duration>
- The recognizer is certain of a new word <word> in the single best output string. The word starts at <t0> (integer, milliseconds) and has a duration of <duration> (integer, milliseconds).
- | <word> ...
- The remaining words on the current best path.
- % <effort>
- The average search effort over the newly added word ('+') and hypothesized words ('|'). The effort is calculated as the average number of active tokens divided by the maximum number of tokens and is a float in the range ]0,1.epsilon].
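A client that follows the monitor output could classify each line by its leading marker. The Python sketch below is one way to do that; it assumes that fields are separated by whitespace and that a word never contains a space, neither of which is stated on this page. The function name and the returned labels are hypothetical.

```python
def parse_monitor_line(line):
    """Turn one monitor line into a (kind, payload) tuple."""
    line = line.strip()
    if line == "[":
        return ("begin", None)       # start of the monitor output
    if line == "]":
        return ("end", None)         # end of the monitor output
    if line == ".":
        return ("waiting", None)     # no audio input, waiting

    marker, _, rest = line.partition(" ")
    fields = rest.split()
    if marker == "+":
        # + <word> <t0> <duration>, times are integers in milliseconds
        word, t0, duration = fields
        return ("word", {"word": word,
                         "t0_ms": int(t0),
                         "duration_ms": int(duration)})
    if marker == "|":
        # remaining words on the current best path
        return ("pending", fields)
    if marker == "%":
        # average search effort, a float
        return ("effort", float(fields[0]))
    return ("other", line)           # anything this sketch does not know
```

Words reported with '+' are final, while the '|' words may still change, so a display would typically append the former and overwrite the latter on every update.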
The final result is reported as follows:
- = <word> ...
- A single line containing all words.
- ? <word> <t0>:<duration> bw=<effort> ac=<am_score> lf=<lm_score_fwd> lb=<lm_score_bw>
- One line per word. The word starts at <t0> (integer, milliseconds) and has a duration of <duration> (integer, milliseconds). While the word was being decoded, the average search effort (number of tokens divided by the maximum number of tokens) was <effort>. The acoustic score was <am_score>; this score is normalized when the default acoustic modelling is used, and such normalized scores are better indicators of word confidence than unnormalized scores. The LM score was <lm_score_fwd>, the normal forward LM score, i.e. P(<word>|<previous_words>). The backward LM score was <lm_score_bw>; evaluating this score is not always relevant and takes time, so an extra flag must be set in the .ini file if this value is to be calculated (see Higher level API configuration file and table format for more details).
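The two final-result line types can be parsed with a regular expression, as in the Python sketch below. Several points are assumptions rather than documented behaviour: that the scores are printed as plain decimal numbers, that fields are separated by whitespace, and that the lb= field may be missing when the backward LM score is not calculated. The function name and dictionary keys are hypothetical.

```python
import re

# ? <word> <t0>:<duration> bw=<effort> ac=<am_score> lf=<lm_score_fwd> lb=<lm_score_bw>
WORD_LINE = re.compile(
    r"^\?\s+(?P<word>\S+)\s+(?P<t0>\d+):(?P<duration>\d+)"
    r"\s+bw=(?P<bw>\S+)\s+ac=(?P<ac>\S+)\s+lf=(?P<lf>\S+)"
    r"(?:\s+lb=(?P<lb>\S+))?\s*$"   # lb= treated as optional (assumption)
)


def parse_final_line(line):
    """Parse a '=' or '?' result line; return None for any other line."""
    line = line.strip()
    if line.startswith("="):
        # = <word> ... : a single line containing all words
        return {"type": "sentence", "words": line[1:].split()}
    match = WORD_LINE.match(line)
    if match is None:
        return None
    d = match.groupdict()
    return {
        "type": "word",
        "word": d["word"],
        "t0_ms": int(d["t0"]),
        "duration_ms": int(d["duration"]),
        "effort": float(d["bw"]),
        "am_score": float(d["ac"]),
        "lm_score_fwd": float(d["lf"]),
        "lm_score_bw": float(d["lb"]) if d["lb"] is not None else None,
    }
```

If the real output prints a placeholder instead of omitting lb=, the float conversion would need to be adapted accordingly.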
Errors are reported as follows:
- ! <msg>
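On the client side, an error line can be turned into an exception so that it is not silently mixed with recognition results. This is a minimal Python sketch; the class and function names are hypothetical, and it assumes the '!' marker is the first non-blank character on the line.

```python
class RecognizerError(Exception):
    """Raised for a line that starts with the '!' error marker."""


def raise_on_error(line):
    """Return the line unchanged, or raise if it is an error report."""
    stripped = line.strip()
    if stripped.startswith("!"):
        # Everything after the marker is the message text.
        raise RecognizerError(stripped[1:].strip())
    return line
```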
Known commands that can be sent to spr_spraak_socket via the socket (or via pipes/files):
[commands] |
sync <sync_nr>
Set a synchronisation mark in the data stream: copy the command to the output file/pipe. |
change lex <lexicon>
Change the lexicon. |
change lm <lang_mod>
Change the language model. |
change both <lex> <lm>
Change the lexicon and language model. |
file <fname>
Process the audio in the given file (raw format). |
sleep
Flush and ignore all audio input until a wakeup command. |
flush
Flush all audio data still waiting to be processed. |
wakeup
Start listening again after a sleep command. |
timeout <sec> [quick](0)
Time out the recognition process after a given number of seconds; see also SPRaak_abort(). |
echo [str] ...
Echo something back. |
help [str] ...
Give help (on something). |
quit
Quit. |
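The commands above are plain text, so a client can send them with a few lines of code. The Python sketch below assumes that each command is a single newline-terminated line encoded as UTF-8, and that spr_spraak_socket is already listening on the given <host>:<port_nr>; this page does not specify the framing or encoding, so both are assumptions. All function names and the file name utt1.raw are hypothetical.

```python
import socket


def format_command(name, *args):
    """Build one command line, e.g. format_command('change', 'lm', 'x')."""
    return " ".join([name] + [str(a) for a in args]) + "\n"


def send_commands(sock, commands):
    """Send a sequence of (name, arg, ...) tuples over a connected socket."""
    for command in commands:
        sock.sendall(format_command(*command).encode("utf-8"))


def connect(spec="localhost:20100"):
    """Open a TCP connection to a <host>:<port_nr> specification."""
    host, _, port = spec.rpartition(":")
    return socket.create_connection((host, int(port)))


# Example session (requires a running spr_spraak_socket):
#   sock = connect("localhost:20100")
#   send_commands(sock, [("sync", 1), ("file", "utt1.raw"), ("quit",)])
#   sock.close()
```

Sending a sync command before each file gives the client a mark to look for in the output, which makes it easier to match results to requests.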
- Author
- Kris Demuynck
- Date
- 4/2002 - KD
- Created based on the (simple) interface layer built for the Mythe project.
- 4/2002 - KD
- Added code for direct recording from the Linux audio device
- 6/2002 - KD
- Added code for direct recording from the Windows audio device