SPRAAK
Design and simulate filter banks.
Enumerations | |
enum | { SPR_FB_WARP_LIN1, SPR_FB_WARP_LIN2, SPR_FB_NWARPMETHOD } |
Functions | |
void | spr_filter_bank_free (SprSspInfo *Info) |
int | spr_filter_bank_setup (SprSspInfo *Info, const char **descript, void *aux_info) |
int | spr_filter_bank_process (SprSspInfo *Info, const void *frame_in, void *frame_out) |
This module designs and evaluates filter banks based on FFT and auditory models. Four types of filter banks are possible:
These bank types can be combined with four non-linear frequency scales:
- PLP: barks as in PLP
- MEL, ERB, BARK: cf. internal report 'Signal Processing for Speech Applications, Formulas and Algorithms', MI2-SPCH-91-5, p. 28
- MELDM: linear up to 1000 Hz, logarithmic spacing above (cf. Davis & Mermelstein)
- MELHTK
- LINEAR
In addition, warping can be applied to the defining (begin/middle/end) frequencies of the filterbanks before the bank weights are calculated. Currently, only linear warping is implemented.
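
The idea of warping the defining frequencies before computing the weights can be sketched as follows. This is only an illustration under stated assumptions: the `FilterEdges` struct and both functions are hypothetical, the warp is assumed to multiply each frequency by the factor (the module may use the inverse convention), and a plain triangular shape stands in for the real bank shapes.

```c
#include <assert.h>

/* Hypothetical container for the three defining frequencies of one filter. */
typedef struct {
    double begin_hz;
    double middle_hz;
    double end_hz;
} FilterEdges;

/* Linear warping: scale all three defining frequencies by the same factor.
 * Assumption: the factor multiplies the frequency axis. */
FilterEdges warp_linear(FilterEdges e, double warp)
{
    assert(warp > 0.0);
    e.begin_hz  *= warp;
    e.middle_hz *= warp;
    e.end_hz    *= warp;
    return e;
}

/* Weight of a triangular filter at frequency f_hz, computed from the
 * (possibly warped) edges: 0 outside [begin, end], 1 at the middle. */
double tri_weight(const FilterEdges *e, double f_hz)
{
    assert(e->begin_hz < e->middle_hz && e->middle_hz < e->end_hz);
    if (f_hz <= e->begin_hz || f_hz >= e->end_hz)
        return 0.0;
    if (f_hz <= e->middle_hz)
        return (f_hz - e->begin_hz) / (e->middle_hz - e->begin_hz);
    return (e->end_hz - f_hz) / (e->end_hz - e->middle_hz);
}
```

Because the weights are derived from the warped edges, the spectrum itself never has to be resampled; only the filter definitions move.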
[filter_bank] | |
---|---|
scale <MEL/BARK/ERB/PLP/MELDM/MELHTK/ERBGM/LINEAR> | |
Set scale type. Default MELDM. | |
bank <REC/TRI2/SCHROEDER/TRI/GAMMA> [norm/no_norm](no_norm) | |
Set bank type. Default TRI. | |
output <dB/power>(dB) [dB_low](-379) | |
Output in dB (default) or power. When converting to dB, ensure that the output is never lower than <dB_low>. | |
step <step>(1.0) | |
Step through the frequency scale with a step greater (<step> > 1.0) or smaller (<step> < 1.0) than the critical bandwidth. | |
width <fac>(1.0) | |
Make the filterbank wider (<fac> > 1.0) or narrower (<fac> < 1.0) than the default shape (which also depends on <step>). Note: not all filter shapes support this option. | |
clip <f0>(0.0) [f1](-1) | |
Only use part of the frequency range. Values are either in Hertz (positive values) or in fractions of the Nyquist frequency (negative values). | |
Nfbank <val> | |
Specify the desired number of filterbanks. | |
LWfixed <value> [vtl/warp](warp) | |
Use a fixed value for linear warping. With the default (warp), values for male subjects are smaller than one; choose vtl to give the value the inverse meaning. | |
LWstream <copy/move> <buf_name> [vtl/warp](warp) | |
Use frame-by-frame values from the buffer <buf_name> for linear warping. See LWfixed for the vtl/warp choice. | |
LWhfreq <flat/pwl> [freq] [max_warp] | |
Options for handling the high frequencies that are missing after warping. The default, <flat>, fills in a constant value, namely the average power in the frequencies from <freq> onwards (by default a 2% range is used, so <freq> is 7840 Hz for 16 kHz). <pwl> is piecewise linear with 2 pieces and corner frequency <freq> (by default a 10% range is used, so <freq> is 7200 Hz for 16 kHz). For a warping factor stream, the input warping values are restricted to [1/max_warp, max_warp]. Default values for max_warp: 2.0 for <flat> (in this case, some memory allocations depend on max_warp), and for <pwl> the maximal value possible given <freq> (1.11111 for its default). |