SPRAAK
Speech Detection.
Data Structures | |
struct | SprSDFX |
struct | SprSDKD |
Functions | |
void | spr_sd_free (SprSspInfo *Info) |
int | spr_sd_setup (SprSspInfo *Info, const char **descript, void *aux_info) |
int | spr_SD_KD_process (SprSDKD *sd, float x) |
void | spr_SD_KD_reset (SprSDKD *sd) |
int | spr_sd_process (SprSspInfo *Info, const void *frame_in, void *frame_out) |
void | spr_sd_reset (SprSspInfo *Info, SprSspStatus *action) |
Speech detection based on energy. All parameters are tuned for detection on power values, so the input energy must not be in dB (use the db2pow option if it is).
[speech_detect] | |
---|---|
method <FX/KD>(FX) | |
Speech detection algorithm to be used. | |
db2pow | |
Input is in dB; convert it to power levels first, as required by the speech detectors. | |
reset <yes/no>(yes) | |
Reset the speech detectors at the beginning of each new file. | |
multi_spkr <N> <copy/move> <buf_name> | |
Set up the sil/spch detector to work in a multi-speaker environment, i.e. noise/speech statistics are calculated for the N last (least recently used) speaker IDs. The speaker IDs are read from a named buffer. | |
sil_range <min_range>(0.0) <max_range>(0.0) | |
KD: The <min_range> parameter is needed for signals that are almost noise-free; it gives a lower bound on the amplitude of the noise. The <max_range> parameter gives an upper bound on the amplitude of the noise, which makes the detector more robust when it is started in a speech segment. | |
spch_raise <raise>(0.25) | |
KD: minimum rise in the speech level (averaged over 4 consecutive frames) needed to detect the very first word. | |
spch_range <range>(20.0) | |
KD: signals whose amplitude is below the average speech level divided by <range> are also treated as noise. |
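
The sil_range and spch_range descriptions above can be read as simple threshold rules. The sketch below illustrates that reading only; it is not the SPRAAK implementation. Both function names are hypothetical, and treating a bound of 0.0 (the default) as "not set" is an assumption that this page does not confirm.

```c
/* Hypothetical helper, not a SPRAAK function.
 * spch_range rule: an amplitude below (average speech level / range)
 * is treated as noise. Returns 1 for "noise", 0 otherwise. */
int below_speech_range(float amplitude, float avg_speech_level, float range)
{
    return amplitude < avg_speech_level / range;
}

/* Hypothetical helper, not a SPRAAK function.
 * sil_range rule: keep a noise-amplitude estimate within
 * [min_range, max_range]. ASSUMPTION: a bound of 0.0 means "disabled". */
float clamp_noise_level(float noise, float min_range, float max_range)
{
    if (min_range > 0.0f && noise < min_range)
        noise = min_range;   /* lower bound for almost noise-free signals */
    if (max_range > 0.0f && noise > max_range)
        noise = max_range;   /* upper bound when starting inside speech */
    return noise;
}
```

With the default spch_range of 20.0 and an average speech level of 40.0, for example, any amplitude below 2.0 would count as noise under this reading.
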