SPRAAK
mlp_main.c File Reference

main routines and types for handling MLPs

Data Structures

struct  SprMlpLayer
 a (named) layer in the MLP
 
struct  SprMlpArg
 extra arguments for the (non-linear) functions
 
struct  SprMlpFun
 the (non-linear) function on an arc
 
struct  SprMlpParam
 pointers to all params for the current arc
 
union  _Union2_MLP_MAIN_
 optional extra parameter(s) (constants)
 
union  _Union3_MLP_MAIN_
 
struct  SprMlpXFun
 
struct  SprMlpProc
 
union  _Union4_MLP_MAIN_
 forward evaluation of all relevant arcs
 
union  _Union5_MLP_MAIN_
 backward training of all relevant arcs
 
struct  SprMlpConnect
 a connection between two layers
 
struct  SprMlpNormList
 
struct  SprWgdParam
 weights | partial_derivs | time_av_moment2 | time_av_moment1
 
struct  SprCgParam
 
struct  SprMLP
 the main MLP structure
 

Macros

#define spr_dt_nn_float_data
 
#define SPR_MLP_MODIF_EVAL
 
#define SPR_MLP_MODIF_TRAIN_X
 
#define SPR_MLP_MODIF_TRAIN_GD
 
#define SPR_MLP_MODIF_TRAIN_WGD
 
#define SPR_MLP_MODIF_TRAIN_WGD2
 
#define SPR_MLP_MODIF_TRAIN_SCG
 
#define SPR_MLP_MODIF_TRAIN_WSCG
 
#define SPR_MLP_MODIF_TRAIN_CG
 
#define SPR_MLP_MODIF_TRAIN_WCG
 
#define spr_mlp_check_training(mlp, flags)
 

Typedefs

typedef double SprNNFloatCalc
 
typedef float SprNNFloatData
 
typedef SprNNFloatCalc(* _FuncPtr1_MLP_MAIN_ )(SprNNFloatCalc x, SprMlpParam *p, const struct t_mlp_xfun *op)
 
typedef SprNNFloatCalc(* SprMlpNLF )(SprNNFloatCalc x, SprMlpParam *p, const SprMlpXFun *op)
 
typedef SprNNFloatCalc(* SprMlpNorm )(SprNNFloatCalc dd, SprMLP *mlp)
 

Enumerations

enum  {
  SPR_MLP_OPT_TRAIN, SPR_MLP_OPT_PROP, SPR_MLP_TREE_TRAIN, SPR_MLP_LAYER_TREE,
  SPR_MLP_CONNECT_SUM, SPR_MLP_CONNECT_PROD, SPR_MLP_LAYER_INIT0, SPR_MLP_LAYER_INIT1,
  SPR_MLP_LAYER_VSET, SPR_MLP_LAYER_USED, SPR_MLP_LAYER_MUSED, SPR_MLP_LAYER_PROP,
  SPR_MLP_OPT_CONST, SPR_MLP_OPT_BLOCK, SPR_MLP_OPT_TC_SET, SPR_MLP_OPT_PB_SET,
  SPR_MLP_CONNECT_BIAS
}
 
enum  { SPR_MLP_LEFT_CHILD, SPR_MLP_RIGHT_CHILD, SPR_MLP_SHIFT_CHILD }
 tree-structured evaluation
 
enum  { SPR_MLP_TM_ANY, SPR_MLP_TM_WGT, SPR_MLP_TM_WGTP, SPR_MLP_TM_CG }
 

Functions

SprMLP * spr_mlp_free (SprMLP *mlp)
 
int spr_mlp_write (const char *fname, SprMLP *mlp, int write_dsc)
 
SprMLP * spr_mlp_read (const char *fname)
 
void spr_set_norm_const (SprNNFloatData *norm_const, SprMlpConnect *src)
 
int spr_mlp_modif (SprMLP *mlp, int action,...)
 
SprNNFloatData * spr_mlp_norm_output (SprMLP *mlp)
 Normalize the output (so that the sum equals 1.0).
 

Detailed Description

The main routines and types for handling MLPs.

An MLP consists of different layers with interconnections between them. Every connection (arc) between a node in the source layer and a node in the destination layer consists of a sequence of linear and non-linear functions. These sequences of functions are common to all arcs between the nodes of the two layers involved. The free parameters of the functions are, however, specific to every connection (with the exception of the simple 1-to-1 connection scheme between input and output, which may also use shared parameters). Functions may have extra arguments, which are specified between braces '()' and separated by commas. The allowed number of arguments, their type (float, integer or string) and the effect they have depend on the function. In some cases, some of the immediate function arguments replace one or more of the free (arc-specific) parameters.

The available linear and non-linear functions are:

scale
Multiply the input with a factor.
scale(a)
Multiply the input with a constant a.
bias
Add an offset to the input.
bias(c)
Add a constant c to the input.
poly1
Evaluate a*x+b
poly1(a,b)
Evaluate a*x+b, a and b being constants
poly
Evaluate a polynomial in x; the first parameter is the order of the polynomial, followed by the polynomial coefficients ordered from high to low order (x^N ... x^0).
poly(p)
Evaluate a polynomial of order p in x; the polynomial coefficients must be stored in high to low order (x^p ... x^0).
sigm(a=1)
Evaluate a sigmoid (1/(1+exp(-a*x))), with an optional scaling of the input with a.
tanh(a=1)
Evaluate a hyperbolic tangent (tanh(a*x)), with an optional scaling of the input with a.
rop1(a=1,b=1,c=0)
Evaluate f(x*a)*b+c with f() a ratio of first order polynomials designed to mimic the behaviour of tanh() – f(x)=x/(|x|+1).
rop2(a=1,b=1,c=0)
Evaluate f(x*a)*b+c with f() a ratio of second order polynomials designed to mimic the behaviour of tanh() – f(x)=(x*|x|+x)/(|x|^2+|x|+1).
rop3(a=1,b=1,c=0)
Evaluate f(x*a)*b+c with f() a ratio of third order polynomials designed to mimic the behaviour of tanh() – f(x)=(x^3+x*|x|+x)/(|x|^3+|x|^2+|x|+1).
sign(a=0,b=1,c=-1)
Function that outputs 1.0 if b*x >= a and outputs c otherwise; this function is not differentiable and thus cannot be trained!
lsigm(c=0)
Evaluate -log(1+exp(x/(1+|x|*c))).
dist2
Evaluate (a*x+b)^2, (a,b) being the trainable parameters
dist2(a)
Evaluate (a*x+b)^2, b being the trainable parameter
ndist2
Evaluate (a*x+b)^2-log(|a|+eps), (a,b) being the trainable parameters
pow(a=1)
Evaluate a*sign(x)*|x|^|p| with a a constant and p the trainable parameter, i.e. raise x to a certain power with the sign of x being preserved.
pow(a,p,c=0)
Evaluate a*sign(x)*((|x|+|c|)^p-|c|^p), with a, p and c constants
pow2(c=0,a=1)
Evaluate a*x^2+c, a and c being constants.
exp(c=0,a=1,b=1)
Evaluate exp(x*a)*b+c.
pae1(c=0,a=1,b=1)
Evaluate f(x*a)*b+c with f() an approximation of exp() using first order polynomials – xp=max(x,0), xn=max(-x,0), f(x)=xp+1/(xn+1).
pae2(c=0,a=1,b=1)
Evaluate f(x*a)*b+c with f() an approximation of exp() using second order polynomials – xp=max(x,0), xn=max(-x,0), f(x)=xp^2/2+xp+1/(xn^2/2+xn+1).
pae3(c=0,a=1,b=1)
Evaluate f(x*a)*b+c with f() an approximation of exp() using third order polynomials – xp=max(x,0), xn=max(-x,0), f(x)=xp^3/6+xp^2/2+xp+1/(xn^3/6+xn^2/2+xn+1).
abs(c=0,a=1)
Evaluate a*|x|+c, a and c being constants.
clip(a=-1,b=1)
Clip the input to the interval [a,b].
norm
Normalize a previous scaling operation so that one obtains an inner product of the input vector (layer) with a unit vector (the parameters).
merge(layer)
Combine the values x and y read from the input layer and the layer called layer respectively into one output value as either a weighted sum (x+y*a) or as a product of both inputs after raising them to a certain power (x*sign(y)*|y|^|a|), with a a trainable parameter.
merge(layer,a,c=0)
Combine the values x and y read from the input layer and the layer called layer respectively into one output value as either a weighted sum (x+y*a) or as a product of both inputs after raising them to a certain power (x*sign(y)*((|y|+|c|)^a-|c|^a)), a and c being constants.
weight(layer)
Weight (factor or power) the input (x*w or sign(x)*|x|^|w|) with the weight w read from the layer called layer.
set_weight(c=0,a=1,b=1)
Set the weights for the two children in a tree evaluation. Input values smaller than or equal to -a are mapped to a weight of 1.0 and 0.0 for the left and right sub-tree respectively. Input values larger than or equal to +b are mapped to a weight of 0.0 (left sub-tree) and 1.0 (right sub-tree). An input value of 0.0 is mapped to an equal weight of 0.5 for both the left and right sub-tree. All other values in the range [-a,b] are mapped to intermediate values using a smooth and continuous curve. The parameter c must be set to a value in the range ]-1.0,1.0] and controls the smoothness of the curve around the corner points -a and b. A value of 1.0 assures a smooth (sigmoid-like) transition. A value close to -1.0 gives rise to a very fast step-like transition from 0.5 to 1.0 around the two corner points. The left/right sub-tree is only evaluated if the corresponding weight is non-zero.
scaleR(r=1)
Scaling, training with a regularisation cost of (0.5*w^2)*r.
biasR(r=1)
Offset, training with a regularisation cost of (0.5*w^2)*r.
poly1R(r=1)
Scaling+offset, training with a regularisation cost of (0.5*w^2)*r
expR1(r=1,c=0,a=1,b=1)
Exponent with a regularisation cost on the output values y of (0.5*(y-c)^2)*r.
expR2(r=1,c=0,a=1,b=1)
Exponent with a regularisation cost on the input values x of (0.5*x^2)*r.
sigmR1(r=1,a=1)
Sigmoid with an optional scaling of the input with a, training with a regularisation cost of (0.25-dsigm(x)/dx)*r.
tanhR1(r=1,a=1)
Tanh with an optional scaling of the input with a, training with a regularisation cost of (1.0-dtanh(x)/dx)*r.
sigmR2(r=1,a=1)
Sigmoid with an optional scaling of the input with a, training with a regularisation cost of r/(1+exp(256/x^2)).
tanhR2(r=1,a=1)
Tanh with an optional scaling of the input with a, training with a regularisation cost of r/(1+exp(64/x^2)).
sigmR3(r=1,a=1)
Sigmoid with an optional scaling of the input with a, training with a regularisation cost of (0.5*x^2)*r.
tanhR3(r=1,a=1)
Tanh with an optional scaling of the input with a, training with a regularisation cost of (0.5*x^2)*r.

The available connection types are:

direct
A 1-to-1 connection. This connection type may also use shared parameters for all arcs.
norm
Divide all inputs by the 1, 2, ... inf norm of the inputs.
full
A full connection: every output node is connected with all input nodes. The results of all incoming connections are either added or multiplied.
reduced
A sparse connection between input and output layer. The individual connections are enumerated. See below for a description of the format.
tree
Identical to a full connection, only the evaluation order differs. A tree connection has a hierarchical order (binary tree) in which only one of the two descendants is evaluated, except when a point falls in a transition region, in which case both descendants are evaluated. See below for a description of the format.

The MLP description file has the following structure:

[layers]
  Input         <nr_input_nodes>
  <layer_name>  <nr_nodes>
  Output        <nr_output_nodes>
[connections]
  <from>[+] <to> <sum/prod> <type> [ndx_file] <param_file> <functions>
  ...
[options]
  <options>

A layer thus has a unique name and a size (number of nodes). A connection is described with a source and destination layer, the combination operator (sum or prod), a connection type and a sequence of functions.
The optional '+' that may follow the name of the input layer indicates that one extra bias node must be added. For the 'sum' combination, the bias is the first node and has a value of 1. For the 'prod' combination, the bias is the last node with a value of 2.
The connection type has the following format:

<type>(<alt_opt>)

The <alt_opt> is optional and modifies the default behaviour of the connection type. The following connection types are available:

direct(shared)
one-to-one connection, optionally the parameters are shared.
full(trans)
full connection (each node to each node), optionally the parameters are stored in a transposed order (faster evaluation, conformant to the parameter layout for tree evaluation).
reduced(excl)
reduced connections (each input node connects to a selected set of output nodes); the 'excl' flag should be set when each output node has only one incoming arc.
tree(<nsd>,<buf>)
tree structured layer; optionally parts of the non-selected sub-tree are also evaluated: the <nsd> option indicates that the non-selected sub-tree should be evaluated to a depth <nsd>; the <buf> option specifies a layer with weights for which any sub-tree with a non-zero weight will be evaluated.

The functions are described as follows:

<name>[<train_arg>](<extra_args>)

The train arguments <train_arg> are optional. They are specified between square brackets '[]' and consist of the letters:

T or C
Parameters that need to be (T)rained or parameters that are fixed (C)onstants.
P or B
To either (P)ropagate the error to the previous layer, or to (B)lock the error back propagation.

The extra arguments are specified between braces '()' and separated by commas. The allowed arguments per function, their type (float, integer or string) and their effect depend on the function; see the list above.
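Putting the pieces together, a hypothetical description file for a small two-layer network could look as follows. The layer names, sizes, parameter file names and function sequences are illustrative assumptions, not taken from an actual SPRAAK setup:

```
[layers]
  Input    24
  Hidden   64
  Output   10
[connections]
  Input+  Hidden  sum  full         hid.prm  scale sigm[TP]
  Hidden  Output  sum  full(trans)  out.prm  scale bias tanh[TB](1.5)
```

Here the '+' after Input adds the bias node for the 'sum' combination, 'scale sigm[TP]' trains the weights and propagates the error, and 'tanh[TB](1.5)' trains the output non-linearity with a fixed input scaling of 1.5 while blocking back-propagation.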

The index file that specifies the reduced connectivity consists of the concatenation of (-1)-terminated arrays (of type I32) listing the set of outputs for each input. For example, the following indices

[0 2 -1
1 2 -1
3 -1]

describe the connections of a layer that transforms 3 inputs into 4 outputs with the following connectivity (rows = inputs, columns = outputs):

[1 0 1 0
0 1 1 0
0 0 0 1]

The binary tree (tree connection type) has the following properties:

The tree structure is stored as a two-valued tuple (of type I32) per node. The first value contains the (right) child information:

<has_left_child>*1 + <has_right_child>*2 + <offset_to_right_child>*4

with offset_to_right_child equal to 0 if the node does not have a right child. The second value contains the parent information:

<is_left_child>*1 + <is_right_child>*2 + <ndx_of_parent_node_base0>*4

with ndx_of_parent_node_base0, is_left_child and is_right_child equal to -1, 0 and 0 respectively for the root node.

Date
Jan 1999
Author
Kris Demuynck
Revision History:
XX/01/1999 - KD
Creation
13/04/2010 - KD
added to SPRAAK
01/10/2012 - KD
clean-up, documentation, added new functions
See Also
mlp_eval.c and mlp_train.c
