module Ann_func: sig .. end
Transfer functions and error functions.
The error function is the assessment, over all patterns, of the differences
between the network's outputs y and the target vectors t.
A transfer function (or activation function; we make no distinction between
the two) is a function g applied to the weighted sum of a unit's inputs,
a_k = sum_j(w_jk * z_j), where w_jk is the weight of the connection between
units j and k, in order to produce the unit's output z_k = g(a_k).
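As an illustration (not part of the module's interface), this computation can be sketched in OCaml; `unit_output` and its argument names are hypothetical:

```ocaml
(* Sketch (not part of the module): one unit's computation.
   [g] is the transfer function, [weights] the w_jk feeding unit k,
   [inputs] the z_j coming from the previous layer. *)
let unit_output g weights inputs =
  (* a_k = sum_j (w_jk * z_j) *)
  let a_k = Array.fold_left (+.) 0.0 (Array.map2 ( *. ) weights inputs) in
  g a_k  (* z_k = g(a_k) *)
```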
In the following code, we denote by f the transfer function applied to the
units of the output layer, and by g the activation function applied to the
hidden units. This distinction is due to the specificity of the output layer,
which must be handled differently from the others.
type vector = float array
val fprint_vector : out_channel -> float array -> unit
fprint_vector ch x prints the vector x on channel ch.
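A plausible implementation sketch of fprint_vector; the space-separated "%g" output format is an assumption, not necessarily what the module actually prints:

```ocaml
(* Plausible sketch of [fprint_vector]; the exact output format
   ("%g "-separated values, trailing newline) is an assumption. *)
let fprint_vector (ch : out_channel) (x : float array) : unit =
  Array.iter (fun v -> Printf.fprintf ch "%g " v) x;
  output_char ch '\n'
```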
type deriv_out = vector -> vector
type deriv_error = vector -> vector -> vector
type out_choice = (deriv_out * deriv_error) option
out_choice = None when the deltas of the output layer are computed as:
delta_k = y_k - t_k.
This is typically the case when f = ident or f = softmax.
out_choice = Some (deriv_out, deriv_error) when the deltas of the output
layer are computed with: delta_k = f'(a_k) * e'_k(y_k, t_k), where f' is the
derivative of the output layer activation function and e'_k is the partial
derivative of the error with respect to y_k.
Typical examples for f'(a_k) are:
f'(a_k) = f(a_k) * (1 - f(a_k)) = z_k * (1 - z_k) when f is the logistic
function or the softmax function;
f'(a_k) = 1 - f(a_k)² = 1 - z_k² when f is the hyperbolic tangent tanh.
Since the outputs z are already computed, we will express f' directly as a
function of z: z * (1 - z) or (1 - z²).
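Written out in OCaml, these derivatives in terms of z read (sketch mirroring the formulas above):

```ocaml
(* The derivatives expressed directly in terms of the output z. *)
let deriv_logistic z = z *. (1. -. z)  (* f = logistic (or softmax) *)
let deriv_tanh z = 1. -. z *. z        (* f = tanh *)
```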
In the out_choice type, deriv_out is the function f' mapped on the vector a,
and deriv_error is the function that computes the partial derivatives of the
error with respect to each y_k of the vector y.
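A hypothetical helper (with the module's types repeated so the sketch is self-contained) showing how the two cases of out_choice drive the computation of the output-layer deltas:

```ocaml
(* Types repeated from the module so the sketch is self-contained. *)
type vector = float array
type deriv_out = vector -> vector
type deriv_error = vector -> vector -> vector
type out_choice = (deriv_out * deriv_error) option

(* Hypothetical helper: how the two cases of [out_choice] drive the
   output-layer deltas.  [a] holds the pre-activations, [y] the outputs
   and [t] the targets. *)
let deltas (choice : out_choice) (a : vector) (y : vector) (t : vector) =
  match choice with
  | None ->
      Array.map2 (-.) y t  (* delta_k = y_k - t_k *)
  | Some (deriv_out, deriv_error) ->
      (* delta_k = f'(a_k) * e'_k(y_k, t_k).  When f' is expressed as a
         function of z (see above), pass the outputs instead of [a]. *)
      Array.map2 ( *. ) (deriv_out a) (deriv_error y t)
```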
type nn_func = {
  e : vector -> vector -> float;
  (* Error function. *)
  f : vector -> vector;
  (* Transfer function for the output layer, mapped onto the whole layer,
     as some transfer functions (softmax typically) require it. *)
  out_choice : out_choice;
  (* Option choice for the derivatives of the output transfer function
     and of the error function. If this option is None, the deltas of the
     output layer are computed directly as delta_k = y_k - t_k. *)
  g : float -> float;
  (* Transfer function for the hidden units. *)
  g' : float -> float;
  (* Derivative of g, expressed as a function of the unit's output z. *)
}
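As an illustration, a hedged sketch of an nn_func value for a network with logistic hidden units, softmax outputs and cross-entropy error; the field types and the inline implementations are reconstructions, not the module's actual code:

```ocaml
(* Types repeated from the module so the sketch is self-contained; the
   field types of [nn_func] are reconstructed from the val signatures
   and are an assumption. *)
type vector = float array
type deriv_out = vector -> vector
type deriv_error = vector -> vector -> vector
type out_choice = (deriv_out * deriv_error) option

type nn_func = {
  e : vector -> vector -> float;
  f : vector -> vector;
  out_choice : out_choice;
  g : float -> float;
  g' : float -> float;
}

(* A softmax-output network: with f = softmax the deltas reduce to
   y_k - t_k, hence out_choice = None. *)
let softmax_net = {
  e = (fun y t ->                 (* multiclass cross-entropy -sum t ln y *)
         let s = ref 0.0 in
         Array.iteri (fun k y_k -> s := !s -. t.(k) *. log y_k) y;
         !s);
  f = (fun a ->                   (* softmax over the whole layer *)
         let exps = Array.map exp a in
         let sum = Array.fold_left (+.) 0.0 exps in
         Array.map (fun x -> x /. sum) exps);
  out_choice = None;
  g = (fun x -> 1.0 /. (1.0 +. exp (-. x)));  (* logistic *)
  g' = (fun z -> z *. (1.0 -. z));
}
```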
val logistic : float -> float
logistic(x) = 1 / (1 + exp(-x))
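A direct transcription of this formula (sketch):

```ocaml
(* logistic(x) = 1 / (1 + exp(-x)), as specified above. *)
let logistic x = 1.0 /. (1.0 +. exp (-. x))
```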
val softmax : float array -> float array
softmax(a_k) = exp(a_k) / sum_j(exp(a_j)).
Note that this function needs some information from the outputs of the
other units of the same layer (the sum over j), so the softmax is mapped
onto the whole layer.
val deriv_logistic : float -> float
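A direct transcription of the softmax formula above (sketch only; the module's actual implementation may differ, e.g. for numerical stability):

```ocaml
(* softmax(a_k) = exp(a_k) / sum_j(exp(a_j)), mapped over the layer. *)
let softmax (a : float array) : float array =
  let exps = Array.map exp a in
  let sum = Array.fold_left (+.) 0.0 exps in
  Array.map (fun e -> e /. sum) exps
```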
val deriv_softmax : float -> float
val deriv_tanh : float -> float
These derivatives are expressed directly as functions of the output z
(not of a!).
val sum_of_squares : float array -> float array -> float
val cross_entropy : float array -> float array -> float
val log_likelihood : float array -> float array -> float
val deriv_sum_of_squares : float array -> float array -> float array
val deriv_cross_entropy : float array -> float array -> float array
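Hedged sketches of two of these error functions; the exact conventions (the 1/2 factor, binary versus multiclass cross-entropy) are assumptions, chosen so that with out_choice = None the deltas reduce to y_k - t_k as described above:

```ocaml
(* Sketches only; the 1/2 factor and the binary form of the
   cross-entropy are assumed conventions, not the module's actual code. *)

(* E = 1/2 * sum_k (y_k - t_k)^2 *)
let sum_of_squares (y : float array) (t : float array) =
  let s = ref 0.0 in
  Array.iteri (fun k y_k -> let d = y_k -. t.(k) in s := !s +. d *. d) y;
  0.5 *. !s

(* E = - sum_k (t_k ln y_k + (1 - t_k) ln (1 - y_k)) *)
let cross_entropy (y : float array) (t : float array) =
  let s = ref 0.0 in
  Array.iteri
    (fun k y_k ->
       s := !s -. (t.(k) *. log y_k +. (1.0 -. t.(k)) *. log (1.0 -. y_k)))
    y;
  !s
```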