Module Ann_func


module Ann_func: sig .. end
Transfer functions and error functions.


This module implements the usual transfer functions (logistic, tanh, softmax) and the usual error functions (sum of squares, cross-entropy), as well as their derivatives.

The error function is the assessment, over all patterns, of the differences between the network's outputs y and the target vectors t.

A transfer function (or activation function; we make no distinction between the two) is a function g applied to the weighted sum of a unit's inputs, a_k = sum_j(w_jk * z_j), where w_jk is the weight of the connection between units j and k, in order to produce the unit's output z_k = g(a_k).
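For illustration, the forward computation of a single unit could be sketched as follows (a minimal sketch; the function name and the weight layout are assumptions, not part of this module):

(* Sketch: forward computation of one unit k (illustration only).
   w.(j) is assumed to hold w_jk, the weight from unit j to unit k. *)
let unit_output g w z =
  let a_k = ref 0.0 in
  for j = 0 to Array.length z - 1 do
    a_k := !a_k +. w.(j) *. z.(j)
  done;
  g !a_k                          (* z_k = g(a_k) *)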

In the following code, we write f for the transfer function applied to the units of the output layer, and g for the activation function applied to the hidden units. This distinction reflects the specificity of the output layer, which must be handled differently from the others.

Types used to describe a neural network's functions


type vector = float array 
val fprint_vector : out_channel -> float array -> unit
fprint_vector ch x prints the vector x on channel ch.
type deriv_out = vector -> vector 
Type of the derivative of the transfer function of the output layer.
type deriv_error = vector -> vector -> vector 
Type of the derivative of the error function.
type out_choice = (deriv_out * deriv_error) option 
The derivatives of the activation function of the output layer and of the error function are used to compute the deltas of the output layer (see Ann_backprop, and [Bishop96], section 4.8). Different options are available, depending on which transfer function is used and which error is being minimized during the network's training.

In the out_choice type, deriv_out is the function f' mapped onto the vector a, and deriv_error is the function that computes the partial derivatives of the error with respect to each output y_k.
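As a sketch of how the two derivatives could be combined (the actual computation lives in Ann_backprop; the element-wise combination below, and the function name, are assumptions):

(* Sketch (assumption): deltas of the output layer from an out_choice value.
   In the None case the simplified form delta_k = y_k - t_k applies. *)
let output_deltas (choice : out_choice) a y t =
  match choice with
  | None -> Array.init (Array.length y) (fun k -> y.(k) -. t.(k))
  | Some (f', e') ->
      let fa = f' a and ey = e' y t in
      Array.init (Array.length y) (fun k -> fa.(k) *. ey.(k))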


type nn_func = {
   e : vector -> vector -> float; (*Error function.*)
   f : vector -> vector; (*Transfer function for the output layer, mapped onto the whole layer, as some transfer functions (softmax typically) require it.*)
   out_choice : out_choice; (*Option choice for the derivatives of the output transfer function and of the error function. If this option is None (usual case), the deltas of the output layer will be computed as follows: delta_k= y_k - t_k.*)
   g : int -> float -> float; (*g u will return the activation function to be applied at unit u.*)
   g' : int -> float -> float; (*g' u will return the derivative of the activation function for unit u.*)
}
Error function, transfer functions for the hidden and output units, and their derivatives.
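For illustration, an nn_func value for a network with logistic units everywhere, trained under the cross-entropy error, could be assembled as follows (a sketch; this pairing is chosen because logistic outputs combined with cross-entropy yield delta_k = y_k - t_k, which matches out_choice = None):

(* Sketch: logistic activations at every unit, cross-entropy error. *)
let logistic_net : nn_func = {
  e = cross_entropy;
  f = Array.map logistic;          (* logistic mapped over the output layer *)
  out_choice = None;               (* logistic + cross-entropy: delta_k = y_k - t_k *)
  g = (fun _u -> logistic);        (* same activation at every hidden unit *)
  g' = (fun _u -> deriv_logistic);
}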

Transfer functions and their derivatives


val logistic : float -> float
Logistic sigmoid function: logistic(x) = 1/(1 + exp(-x)).
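A minimal sketch of the body, as a direct translation of the formula:

let logistic x = 1.0 /. (1.0 +. exp (-. x))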
val softmax : float array -> float array
Softmax function (see [Bishop96], section 6.9): softmax(a_k) = exp(a_k)/sum_j(exp(a_j)). Note that this function needs the outputs of the other units of the same layer (the sum over j), so the softmax is mapped onto the whole layer.
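A minimal sketch (the shift by the maximum is an assumption, added for numerical stability; it cancels in the quotient and leaves the result unchanged):

let softmax a =
  let m = Array.fold_left max neg_infinity a in
  let e = Array.map (fun a_j -> exp (a_j -. m)) a in
  let s = Array.fold_left (+.) 0.0 e in
  Array.map (fun e_k -> e_k /. s) e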
val deriv_logistic : float -> float
Derivative of the logistic function.
val deriv_softmax : float -> float
Derivative of the softmax function.
val deriv_tanh : float -> float
Derivative of the hyperbolic tangent (note: as a function of the output z, i.e. tanh'(x) = 1 - z^2 with z = tanh(x)).
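Sketches of the closed-form derivatives (the argument conventions are assumptions: the activation x for deriv_logistic, the output z for deriv_tanh, as flagged above):

(* sigma'(x) = sigma(x) * (1 - sigma(x)) *)
let deriv_logistic x =
  let s = logistic x in
  s *. (1.0 -. s)

(* tanh'(x) = 1 - tanh(x)^2 = 1 - z^2, taking z = tanh(x) directly *)
let deriv_tanh z = 1.0 -. z *. z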

Error functions and their derivatives


val sum_of_squares : float array -> float array -> float
"Sum of squares" error function.
val cross_entropy : float array -> float array -> float
Cross-entropy error function (see [Bishop96], section 6.9).
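A minimal sketch, assuming the multiple-class form of [Bishop96], section 6.9: E = - sum_k t_k * ln(y_k).

let cross_entropy y t =
  let s = ref 0.0 in
  Array.iteri (fun k y_k -> s := !s -. t.(k) *. log y_k) y;
  !s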
val log_likelihood : float array -> float array -> float
Log-likelihood error function.
val deriv_sum_of_squares : float array -> float array -> float array
Derivative of the sum of squares error function.
val deriv_cross_entropy : float array -> float array -> float array
Derivative of the cross-entropy error function.
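Sketches matching the forms assumed above (dE/dy_k = y_k - t_k for the sum of squares, dE/dy_k = -t_k / y_k for the cross-entropy):

let deriv_sum_of_squares y t =
  Array.init (Array.length y) (fun k -> y.(k) -. t.(k))

let deriv_cross_entropy y t =
  Array.init (Array.length y) (fun k -> -. t.(k) /. y.(k))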