module Ann_func: sig .. end
Transfer functions and error functions.
The error function is the assessment, over all patterns, of the differences
between the network's outputs y and the target vectors t.
A transfer function (or activation function; we make no distinction between
the two) is a function g applied to the weighted sum of a unit's inputs,
a_k = sum_j(w_jk * z_j), where w_jk is the weight of the connection between
units j and k, in order to produce the unit's output z_k = g(a_k).
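As an illustration (not part of the module's interface), this computation can be sketched in OCaml; `unit_output` and its argument names are hypothetical:

```ocaml
(* Sketch (not part of the module): one unit's computation.
   [g] is the transfer function, [weights] the w_jk feeding unit k,
   [inputs] the z_j coming from the previous layer. *)
let unit_output g weights inputs =
  (* a_k = sum_j (w_jk * z_j) *)
  let a_k = Array.fold_left (+.) 0.0 (Array.map2 ( *. ) weights inputs) in
  g a_k  (* z_k = g(a_k) *)
```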
In the following code, we denote by f the transfer function applied to the
units of the output layer, and by g the activation function applied to the
hidden units. This distinction is due to the specificity of the output layer,
which must be handled differently from the others.
type vector = float array
val fprint_vector : out_channel -> float array -> unit
fprint_vector ch x prints the vector x on channel ch.
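A plausible implementation sketch of fprint_vector; the space-separated "%g" output format is an assumption, not necessarily what the module actually prints:

```ocaml
(* Plausible sketch of [fprint_vector]; the exact output format
   ("%g "-separated values, trailing newline) is an assumption. *)
let fprint_vector (ch : out_channel) (x : float array) : unit =
  Array.iter (fun v -> Printf.fprintf ch "%g " v) x;
  output_char ch '\n'
```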
type deriv_out = vector -> vector
type deriv_error = vector -> vector -> vector
type out_choice = (deriv_out * deriv_error) option
out_choice = None when the deltas of the output layer are computed as:
delta_k = y_k - t_k.
This is typically the case when f = ident or f = softmax.
out_choice = Some (deriv_out, deriv_error) when the deltas of the output
layer are computed with: delta_k = f'(a_k) * e'_k(y_k, t_k), where f' is the
derivative of the output layer activation function and e'_k is the partial
derivative of the error with respect to y_k.
Typical examples for f'(a_k) are:
f'(a_k) = f(a_k) * (1 - f(a_k)) = z_k * (1 - z_k) when f is the logistic
function or the softmax function;
f'(a_k) = 1 - f(a_k)² = 1 - z_k² when f is the hyperbolic tangent tanh.
Since the outputs z are already computed, we will express f' directly as a
function of z: z * (1 - z) or (1 - z²).
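Written out in OCaml, these derivatives in terms of z read (sketch mirroring the formulas above):

```ocaml
(* The derivatives expressed directly in terms of the output z. *)
let deriv_logistic z = z *. (1. -. z)  (* f = logistic (or softmax) *)
let deriv_tanh z = 1. -. z *. z        (* f = tanh *)
```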
In the out_choice type, deriv_out is the function f' mapped on the vector a,
and deriv_error is the function that computes the partial derivatives of the
error with respect to each y_k of the vector y.
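A hypothetical helper (with the module's types repeated so the sketch is self-contained) showing how the two cases of out_choice drive the computation of the output-layer deltas:

```ocaml
(* Types repeated from the module so the sketch is self-contained. *)
type vector = float array
type deriv_out = vector -> vector
type deriv_error = vector -> vector -> vector
type out_choice = (deriv_out * deriv_error) option

(* Hypothetical helper: how the two cases of [out_choice] drive the
   output-layer deltas.  [a] holds the pre-activations, [y] the outputs
   and [t] the targets. *)
let deltas (choice : out_choice) (a : vector) (y : vector) (t : vector) =
  match choice with
  | None ->
      Array.map2 (-.) y t  (* delta_k = y_k - t_k *)
  | Some (deriv_out, deriv_error) ->
      (* delta_k = f'(a_k) * e'_k(y_k, t_k).  When f' is expressed as a
         function of z (see above), pass the outputs instead of [a]. *)
      Array.map2 ( *. ) (deriv_out a) (deriv_error y t)
```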
type nn_func = {
  e : vector -> vector -> float;
  (* Error function. *)
  f : vector -> vector;
  (* Transfer function for the output layer, mapped onto the whole layer,
     as some transfer functions (softmax typically) require it. *)
  out_choice : out_choice;
  (* Option choice for the derivatives of the output transfer function
     and of the error function. If this option is None, the deltas of the
     output layer are computed directly as delta_k = y_k - t_k. *)
  g : float -> float;
  (* Transfer function for the hidden units. *)
  g' : float -> float;
  (* Derivative of g, expressed as a function of the unit's output z. *)
}
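As an illustration, a hedged sketch of an nn_func value for a network with logistic hidden units, softmax outputs and cross-entropy error; the field types and the inline implementations are reconstructions, not the module's actual code:

```ocaml
(* Types repeated from the module so the sketch is self-contained; the
   field types of [nn_func] are reconstructed from the val signatures
   and are an assumption. *)
type vector = float array
type deriv_out = vector -> vector
type deriv_error = vector -> vector -> vector
type out_choice = (deriv_out * deriv_error) option

type nn_func = {
  e : vector -> vector -> float;
  f : vector -> vector;
  out_choice : out_choice;
  g : float -> float;
  g' : float -> float;
}

(* A softmax-output network: with f = softmax the deltas reduce to
   y_k - t_k, hence out_choice = None. *)
let softmax_net = {
  e = (fun y t ->                 (* multiclass cross-entropy -sum t ln y *)
         let s = ref 0.0 in
         Array.iteri (fun k y_k -> s := !s -. t.(k) *. log y_k) y;
         !s);
  f = (fun a ->                   (* softmax over the whole layer *)
         let exps = Array.map exp a in
         let sum = Array.fold_left (+.) 0.0 exps in
         Array.map (fun x -> x /. sum) exps);
  out_choice = None;
  g = (fun x -> 1.0 /. (1.0 +. exp (-. x)));  (* logistic *)
  g' = (fun z -> z *. (1.0 -. z));
}
```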
val logistic : float -> float
logistic(x) = 1 / (1 + exp(-x))
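A direct transcription of this formula (sketch):

```ocaml
(* logistic(x) = 1 / (1 + exp(-x)), as specified above. *)
let logistic x = 1.0 /. (1.0 +. exp (-. x))
```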
val softmax : float array -> float array
softmax(a_k) = exp(a_k) / sum_j(exp(a_j)).
Note that this function needs some information from the outputs of the
other units of the same layer (the sum over j), so the softmax is mapped
onto the whole layer.
val deriv_logistic : float -> float
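A direct transcription of the softmax formula above (sketch only; the module's actual implementation may differ, e.g. for numerical stability):

```ocaml
(* softmax(a_k) = exp(a_k) / sum_j(exp(a_j)), mapped over the layer. *)
let softmax (a : float array) : float array =
  let exps = Array.map exp a in
  let sum = Array.fold_left (+.) 0.0 exps in
  Array.map (fun e -> e /. sum) exps
```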
val deriv_softmax : float -> float
val deriv_tanh : float -> float
These derivatives are expressed directly as functions of the output z
(not of a!).
val sum_of_squares : float array -> float array -> float
val cross_entropy : float array -> float array -> float
val log_likelihood : float array -> float array -> float
val deriv_sum_of_squares : float array -> float array -> float array
val deriv_cross_entropy : float array -> float array -> float array
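Hedged sketches of two of these error functions; the exact conventions (the 1/2 factor, binary versus multiclass cross-entropy) are assumptions, chosen so that with out_choice = None the deltas reduce to y_k - t_k as described above:

```ocaml
(* Sketches only; the 1/2 factor and the binary form of the
   cross-entropy are assumed conventions, not the module's actual code. *)

(* E = 1/2 * sum_k (y_k - t_k)^2 *)
let sum_of_squares (y : float array) (t : float array) =
  let s = ref 0.0 in
  Array.iteri (fun k y_k -> let d = y_k -. t.(k) in s := !s +. d *. d) y;
  0.5 *. !s

(* E = - sum_k (t_k ln y_k + (1 - t_k) ln (1 - y_k)) *)
let cross_entropy (y : float array) (t : float array) =
  let s = ref 0.0 in
  Array.iteri
    (fun k y_k ->
       s := !s -. (t.(k) *. log y_k +. (1.0 -. t.(k)) *. log (1.0 -. y_k)))
    y;
  !s
```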