module Ann_config: sig .. end
Parsing the program parameters from the command line. The program may be run in several modes (Train, Test, Run, or Predict) with different options and arguments.
This module details the global variables
used as program parameters, and the functions used to parse the
command line and to set the program parameters to the chosen values.
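As an illustration of this pattern (global references updated while the command line is parsed), here is a minimal sketch built on the standard Arg module. The option names -norm, -fv and -p appear in this documentation, but the default values, the help strings, the anon_args reference and the parse function are assumptions made for the example, not the actual Ann_config implementation.

```ocaml
(* Hypothetical sketch: program parameters held in global references and
   updated by Arg. Defaults and help strings are illustrative assumptions. *)
let norm = ref false
let freq_verbose = ref 1
let fpredict = ref ""
let anon_args : string list ref = ref []

let speclist = [
  ("-norm", Arg.Set norm, " Normalize the inputs");
  ("-fv", Arg.Set_int freq_verbose, "<n> Verbosity frequency");
  ("-p", Arg.Set_string fpredict, "<file> File where predictions are saved");
]

(* Parse an explicit argument vector, so that the sketch can be tried
   without touching Sys.argv. Element 0 is the program name. *)
let parse argv =
  Arg.parse_argv ~current:(ref 0) argv speclist
    (fun s -> anon_args := s :: !anon_args)
    "usage: prog [options] args"

let () =
  parse [| "prog"; "-norm"; "-fv"; "10"; "-p"; "out.pred"; "data.txt" |];
  Printf.printf "norm=%b fv=%d p=%s anon=%s\n"
    !norm !freq_verbose !fpredict (String.concat "," !anon_args)
```

Keeping the parameters in references lets every other module read the chosen values after a single parsing pass, at the cost of global mutable state.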
anniml configuration

type mode =
| Train    (* Train the network on a patterns file. *)
| Test     (* Test the trained network on a patterns file. *)
| Run      (* Run the trained network on a new input vector passed as argument. *)
| Predict  (* Make predictions on a new input data file. *)
type opti =
| Gradient_basic of
    (* Basic steepest descent with step *)
| Gradient_momentum of
    (* Gradient descent with step *)
| Bfgs
    (* Broyden-Fletcher-Goldfarb-Shanno (BFGS) quasi-Newton local optimization method (see documentation of the MATH/BFGS module). Section 7.10 of [Bishop96] (p.287) also provides an overview of quasi-Newton methods and BFGS, although it is not sufficient for an efficient implementation of BFGS. *)
type gradient_learning =
| Batch
    (* All patterns are used to compute the gradient. *)
| On_line
    (* Only one pattern is used at a time to compute the gradient and to make a descent step. *)
| Chunk of
    (* Patterns are taken by blocks (chunks) of chosen size. *)
anniml program

val raw_layers : int array ref
val dir : string ref
val fpatterns : string ref
x, and second the target vector t.
Make sure that the network topology is consistent with your patterns file: the dimension of x must be equal to the number of input units, and the dimension of the target vector t must be equal to the number of output units.
An option lets you select the columns of the patterns file that you actually want to use.
A few examples of pattern files can be found in the examples/ directory.
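The column selection is written with expressions such as 2 6:35 (column 2 and columns 6 to 35; see columns). The following sketch shows one way such a specification could be expanded into an explicit list of column numbers. The function name expand_columns and the assumption that items are separated by spaces are hypothetical; this is not the module's own parser.

```ocaml
(* Hypothetical helper: expand a column specification such as "2 6:35"
   into the explicit list of column numbers. *)
let expand_columns (spec : string) : int list =
  let expand_item item =
    match String.split_on_char ':' item with
    | [single] -> [int_of_string single]
    | [first; last] ->
        let first = int_of_string first and last = int_of_string last in
        (* An empty list is returned when last < first. *)
        List.init (max 0 (last - first + 1)) (fun i -> first + i)
    | _ -> invalid_arg ("expand_columns: bad item " ^ item)
  in
  String.split_on_char ' ' spec
  |> List.filter (fun s -> s <> "")   (* ignore repeated spaces *)
  |> List.map expand_item
  |> List.concat

let () =
  expand_columns "2 6:8"
  |> List.map string_of_int
  |> String.concat " "
  |> print_endline
```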
val columns : int list option ref
-c option which takes a file as argument. The column file may contain a list of integers, or expressions like 2 6:35, which means that column 2 and columns 6 to 35 will be used to build the input vectors.
val finputs : string ref
When the -predict option is selected by the user, the program will make predictions on the input vectors of this file.
The -p option, with a filename as argument, selects the file where the results are saved.
If the -p option is not used, results are saved by default in a new file with the same root as finputs but with a .pred extension.
val fpredict : string ref
-predict option is used. fpredict is updated with the -p option when parsing the command line.
If the -p option is not used, fpredict will reference a filename with the same root as finputs but with a .pred extension.
val fwts : string ref
-w option.
val fnet : string ref
val max_iter : int ref
abstol for details on the stop criterion.
val abstol : float ref
e1 and e2 of the error are closer than the absolute tolerance abstol, or when the relative difference between the two values falls under the relative tolerance reltol.
The exact formulation of the last two conditions is: |e2-e1| < abstol or |e2-e1| < reltol*(|e1|+reltol).
(We add reltol to |e1| to account for the case when e1 is zero.)
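The two tolerance conditions translate directly into a small predicate. This is a minimal sketch: the function name converged and the labelled arguments are assumptions, and the tolerance values in the usage lines are arbitrary examples, not the program's defaults.

```ocaml
(* Stop test on two error values e1 and e2, following the formulation
   |e2-e1| < abstol  or  |e2-e1| < reltol*(|e1|+reltol). *)
let converged ~abstol ~reltol e1 e2 =
  let diff = abs_float (e2 -. e1) in
  diff < abstol || diff < reltol *. (abs_float e1 +. reltol)

let () =
  (* Illustrative tolerances only. *)
  Printf.printf "%b\n" (converged ~abstol:1e-6 ~reltol:1e-3 1.0 1.0000001);
  Printf.printf "%b\n" (converged ~abstol:1e-6 ~reltol:1e-3 1.0 1.1)
```

With e1 = 0 the right-hand side of the relative test becomes reltol*reltol rather than 0, so the relative condition can still be satisfied.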
val reltol : float ref
abstol for details on the stop criterion.
val eta : float ref
eta.
The optimal value of the parameter depends on the problem. The default value for eta is 0.2.
val mu : float ref
mu is the momentum parameter of the gradient descent with momentum method. It must be chosen between 0 and 1.
val opti : opti ref
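To show how a step eta and a momentum parameter mu typically interact, here is a sketch of the textbook momentum update. The exact update rule used by the program is not given in this documentation, so the formula, the function name momentum_step and the value mu = 0.5 are assumptions; only the step 0.2 is taken from the default quoted for eta.

```ocaml
(* Textbook momentum update, assumed here for illustration:
     v <- mu * v - eta * g
     w <- w + v
   where g is the gradient of the error with respect to the weights w
   and v is the previous step. Both w and v are updated in place. *)
let momentum_step ~eta ~mu (w : float array) (v : float array)
    (g : float array) =
  Array.iteri
    (fun i gi ->
      v.(i) <- mu *. v.(i) -. eta *. gi;
      w.(i) <- w.(i) +. v.(i))
    g

let () =
  (* Minimize f(w) = w^2, whose gradient is 2w, starting from w = 1. *)
  let w = [| 1.0 |] and v = [| 0.0 |] in
  for _i = 1 to 100 do
    momentum_step ~eta:0.2 ~mu:0.5 w v [| 2.0 *. w.(0) |]
  done;
  Printf.printf "w after 100 steps: %g\n" w.(0)
```

Setting mu to 0 in this sketch removes the memory of the previous step and gives plain steepest descent with step eta.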
val learning : gradient_learning ref
learning selects how many patterns are used to compute the gradient of the error before making a step in the descent direction (in the weights space). This choice is specific to the gradient descent methods. The options are: Batch when all patterns are used, On_line when only one pattern is used, or Chunk n when a block of n patterns is used to compute the gradient.
val rand : int ref
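The three learning options can all be viewed as a partition of the patterns into blocks, with one gradient computation and one descent step per block. The sketch below illustrates that view; the helper name chunks is hypothetical and the code is not taken from the program.

```ocaml
(* Hypothetical helper: split the patterns into consecutive blocks of at
   most n elements. The last block may be shorter. *)
let chunks (n : int) (patterns : 'a array) : 'a array list =
  if n <= 0 then invalid_arg "chunks: n must be positive";
  let len = Array.length patterns in
  let rec go start acc =
    if start >= len then List.rev acc
    else
      let size = min n (len - start) in
      go (start + size) (Array.sub patterns start size :: acc)
  in
  go 0 []

let () =
  let patterns = [| 1; 2; 3; 4; 5 |] in
  (* Chunk 2: blocks of two patterns, one descent step per block. *)
  Printf.printf "Chunk 2: %d blocks\n" (List.length (chunks 2 patterns));
  (* On_line corresponds to blocks of size 1, Batch to a single block. *)
  Printf.printf "On_line: %d blocks\n" (List.length (chunks 1 patterns));
  Printf.printf "Batch: %d blocks\n"
    (List.length (chunks (Array.length patterns) patterns))
```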
val norm : bool ref
-norm option. Normalization is done by subtracting the average value and dividing by the standard deviation.
However, this normalization operates only on the pattern file given as argument on the command line, so your training set and your test set may not be normalized in the same way.
It is better to pre-process your initial data set and to normalize the inputs before splitting it into a training set and a test set.
Whichever method is used, do not forget to normalize the inputs of the neural network before training it.
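The advice above amounts to computing the average and the standard deviation once and applying the same values to every subset. A minimal sketch for a single input column follows; the function names are hypothetical, and the use of the population standard deviation (division by n) is an assumption, since the documentation does not say which estimator the program uses.

```ocaml
(* Mean and standard deviation of one input column. Population formula
   (division by n): an assumption made for this sketch. *)
let mean_std (xs : float array) : float * float =
  let n = float_of_int (Array.length xs) in
  let mean = Array.fold_left (+.) 0.0 xs /. n in
  let var =
    Array.fold_left (fun acc x -> acc +. ((x -. mean) ** 2.0)) 0.0 xs /. n
  in
  (mean, sqrt var)

(* Subtract the average value and divide by the standard deviation.
   The caller must make sure that std is not zero. *)
let normalize (mean, std) xs = Array.map (fun x -> (x -. mean) /. std) xs

let () =
  let train = [| 1.0; 2.0; 3.0 |] and test = [| 2.0; 4.0 |] in
  (* Statistics are computed once and applied to both sets. *)
  let stats = mean_std train in
  let train_n = normalize stats train and test_n = normalize stats test in
  Array.iter (Printf.printf "%g ") train_n;
  Array.iter (Printf.printf "%g ") test_n;
  print_newline ()
```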
val mode : mode ref
Train, Test, Run, or Predict. Default is Train. The choice can be changed with the appropriate command-line options.
val freq_verbose : int ref
-fv option.
val print_class : bool ref
-prc option to print classification results.
val int_args : int list ref
-train option). These arguments should then be the number of units in each layer of a fully connected network.
val float_args : float list ref
-run option).
val parse_arguments : unit -> unit
parse_arguments () parses the arguments and the options of the command line.
val fprint_parameters : out_channel -> unit
fprint_parameters ch prints the global variables used as program parameters on channel ch.
val get_functions : unit -> Ann_func.nn_func
get_functions () returns the neural network's functions chosen by the user (command line options).