SPRAAK
sspmod_online_bed.c File Reference

begin/end point detection of speech

Functions

void spr_bed_free (SprSspInfo *Info)
 
int spr_bed_setup (SprSspInfo *Info, const char **descript, void *aux_info)
 
int spr_bed_process (SprSspInfo *Info, const void *frame_in, void *frame_out)
 
void spr_bed_reset (SprSspInfo *Info, SprSspStatus *action)
 

Detailed Description

begin/end point detection of speech

This program removes the leading and trailing noise frames from a data stream.
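
As an illustration of the trimming idea only, the sketch below computes which frames would be kept when every frame already carries a speech/noise label. It is not the SPRAAK code: the function name trim_range, its signature, and the per-frame label array are assumptions made for this example, while nfr_lead and nfr_tail correspond to the options of the same name described below.

```c
/* Hypothetical helper: compute the half-open frame range [*first, *last)
 * that survives after removing leading and trailing noise.
 * is_speech[i] is non-zero when frame i is labelled as speech.
 * nfr_lead / nfr_tail are the extra noise frames kept before / after.
 * Returns 1 when speech was found, 0 when all frames are noise. */
static int trim_range(const int *is_speech, int n,
                      int nfr_lead, int nfr_tail,
                      int *first, int *last)
{
    int b = 0;      /* index of the first speech frame */
    int e = n;      /* one past the last speech frame */

    while (b < n && !is_speech[b])
        b++;
    if (b == n)
        return 0;   /* nothing but noise */

    while (!is_speech[e - 1])
        e--;

    /* Keep some context on both sides, clamped to the stream limits. */
    *first = (b > nfr_lead) ? b - nfr_lead : 0;
    *last  = (e + nfr_tail < n) ? e + nfr_tail : n;
    return 1;
}
```

For a 100-frame stream with speech in frames 40..59 and 20 extra frames on each side, this sketch keeps frames 20..79.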


[online_bed]
nfr_lead <number>(20)
Number of extra leading noise frames.
nfr_tail <number>(20)
Number of extra trailing noise frames.
spch_win <number>(60)
Window used to detect a speech fragment.
min_spch <number>(30)
Min. number of speech frames in the speech window.
sil_win <number>(60)
Window used to detect silence fragments.
max_spch <number>(10)
Max. number of speech frames in the silence window.
clean_sil_spch
Return clean silence speech codes instead of removing frames.
trigger_pos
Return only the trigger positions (as a number of frames relative to the previous state).
method <sent/trunc/glob>(glob)
Determines how bed works.
The <sent> mode works continuously, e.g. marking sentences in a dialog.
The <trunc> mode will detect the first speech segment only (the rest will be classified as noise).
The <glob> method removes leading and trailing noise in a sentence; intermediate noise is marked as speech too.
sil_range <min_range>(0.0) <max_range>(0.0)
Built-in SpchDet: The <min_range> parameter is needed for signals that are almost noise free; it gives a lower bound on the amplitude of the noise. The <max_range> parameter gives an upper bound on the amplitude of the noise, which adds robustness if the speech detector is started in a speech segment.
spch_raise <raise>(0.25)
Built-in SpchDet: minimum rise in the speech level (averaged over 4 consecutive frames) needed to detect the very first word.
spch_range <range>(20.0)
Built-in SpchDet: signals (amplitude) below the average speech level divided by <range> are also treated as noise.
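
One way to read the spch_win/min_spch and sil_win/max_spch options together is as a two-state detector with hysteresis. The sketch below follows that reading and is not taken from the SPRAAK sources: the BedWindows struct, the function names, the use of trailing windows, and the assumption that per-frame speech/noise labels are already available are all hypothetical.

```c
#include <assert.h>

/* Hypothetical parameter bundle named after the documented options. */
typedef struct {
    int spch_win;   /* window used to detect a speech fragment */
    int min_spch;   /* min. number of speech frames in the speech window */
    int sil_win;    /* window used to detect a silence fragment */
    int max_spch;   /* max. number of speech frames in the silence window */
} BedWindows;

/* Count the speech frames in the `win` frames ending at index t (inclusive).
 * The window is shortened at the start of the stream. */
static int count_speech(const int *is_speech, int t, int win)
{
    int start = t - win + 1;
    int cnt = 0;

    if (start < 0)
        start = 0;
    for (int i = start; i <= t; i++)
        cnt += (is_speech[i] != 0);
    return cnt;
}

/* Scan the labels and store the frame index of every state change in trig[].
 * Entries alternate: noise->speech, speech->noise, noise->speech, ...
 * Returns the number of triggers stored (at most max_trig). */
static int bed_triggers(const int *is_speech, int n, const BedWindows *p,
                        int *trig, int max_trig)
{
    int in_speech = 0;
    int ntrig = 0;

    assert(n >= 0 && p->spch_win > 0 && p->sil_win > 0);

    for (int t = 0; t < n && ntrig < max_trig; t++) {
        if (!in_speech) {
            /* Enough speech frames in the speech window: speech starts. */
            if (count_speech(is_speech, t, p->spch_win) >= p->min_spch) {
                in_speech = 1;
                trig[ntrig++] = t;
            }
        } else if (t + 1 >= p->sil_win &&
                   count_speech(is_speech, t, p->sil_win) <= p->max_spch) {
            /* A full silence window with few speech frames: speech ends. */
            in_speech = 0;
            trig[ntrig++] = t;
        }
    }
    return ntrig;
}
```

With the default values (60, 30, 60, 10) and a 200-frame stream whose frames 50..119 are speech, this sketch reports two triggers, at frames 79 and 169. Both lag the true boundaries, so in this reading the detector has to look back from a trigger to place the actual begin or end point.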

Author
Kris Demuynck
Date
12 September 1995