Speech Detection.

Functions
void	spr_sd_free (SprSspInfo *Info)

int	spr_sd_setup (SprSspInfo *Info, const char *descript, void *aux_info)

int	spr_SD_KD_process (SprSDKD *sd, float x)

void	spr_SD_KD_reset (SprSDKD *sd)

int	spr_sd_process (SprSspInfo *Info, const void *frame_in, void *frame_out)

void	spr_sd_reset (SprSspInfo *Info, SprSspStatus action)

Detailed Description

Speech Detection.

Speech detection based on energy. All parameters are tuned for detection on power values, so the input energy must be linear power and not dB (use the `db2pow` option if the input is in dB).

[speech_detect]
`method <FX/KD>(FX)`
Speech detection algorithm to be used.
`db2pow`
Input is in dB; convert it to power levels first, as required by the speech detectors.
`reset <yes/no>(yes)`
Reset the speech detectors at the beginning of each new file.
`multi_spkr <N> <copy/move> <buf_name>`
Set up the sil/spch detector to work in a multi-speaker environment, i.e. noise/speech statistics are calculated for the last N (least recently used) speaker IDs. The speaker IDs are read from a named buffer.
`sil_range <min_range>(0.0) <max_range>(0.0)`
KD: The <min_range> parameter is needed for signals that are almost noise free, and gives a lower bound on the amplitude of the noise. The <max_range> gives an upper bound on the amplitude of the noise, and gives more robustness if the speech detector is started in a speech segment.
`spch_raise <raise>(0.25)`
KD: Minimum rise in the speech level (averaged over 4 consecutive frames) needed to detect the very first word.
`spch_range <range>(20.0)`
KD: signals (amplitude) below the average speech level divided by <range> are also treated as noise.
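
The effect of `sil_range` and `spch_range` can be pictured with the rough sketch below. It is not the KD detector itself: the struct, the function names, and the choice to treat a bound of 0.0 (the documented default) as "disabled" are all assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical parameter bundle mirroring the documented options. */
typedef struct {
    float sil_min;     /* sil_range <min_range> */
    float sil_max;     /* sil_range <max_range> */
    float spch_range;  /* spch_range <range>    */
} SdParamsSketch;

/* Keep a running noise-amplitude estimate within [sil_min, sil_max].
 * Assumption: a bound of 0.0 means "no bound". */
static float clamp_noise(const SdParamsSketch *p, float noise_est)
{
    if (p->sil_min > 0.0f && noise_est < p->sil_min)
        noise_est = p->sil_min;
    if (p->sil_max > 0.0f && noise_est > p->sil_max)
        noise_est = p->sil_max;
    return noise_est;
}

/* A frame counts as noise when its amplitude lies below the
 * average speech level divided by spch_range. */
static bool below_speech_floor(const SdParamsSketch *p,
                               float amp, float avg_spch)
{
    return amp < avg_spch / p->spch_range;
}
```

With `spch_range` at its default of 20.0 and an average speech amplitude of 1.0, frames with amplitude below 0.05 would be treated as noise in this sketch.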

Author: Fei Xie (algorithm); Tom Claes (implementation of functions); Kris Demuynck (second method)

Data Structures