# Module Bfgs

`module Bfgs: `sig` .. `end``
BFGS (Broyden-Fletcher-Goldfarb-Shanno) quasi-Newton local optimization method.

The pure Newton method uses a local quadratic approximation of the objective function `f` being minimized: `f(x)=f(x0)+ t(x-x0)G + t(x-x0)H(x-x0)` where `G` and `H` are respectively the gradient vector and the hessian matrix evaluated at `x0`, and `t` is the transpose operator. When this approximation is valid, and when the function is locally convex, the minimum is found at `x` when `G + H(x-x0) = 0`. This gives us the Newton step `x= x0 - inv(H)G`, where `inv(H)` is the inverse hessian.

Quasi-Newton methods rely on an iterative approximation of the inverse hessian, usually starting with the identity matrix as an initial approximation (which means that we follow the gradient descent direction). At each step `k` of the algorithm, the function is minimized along the search direction. Let us call `lambda` the step size along the quasi-Newton direction, and `h(k)` the approximation of the INVERSE hessian at point `x(k)`.

The algorithm is the following. At each step `k`:

1) minimize the function along the quasi-Newton direction. The line search method minimizes `f(x(k)-lambda.h(k)G)` as a function of `lambda`.

2) make a step in the quasi-Newton direction. If the minimum along that direction is found at `lambda_k`, the quasi-Newton step is then `x(k+1)= x(k) - lambda_k.h(k)G`

3) compute a new approximation `h(k+1)` of the inverse hessian. Several expressions can be found for this update. We used the analytical expression: `h(k+1)= t(I-yk.t(sk)/t(yk).sk).h(k).(I-yk.t(sk)/t(yk).sk) - sk.t(sk)/t(yk).sk` where `yk= G(k+1) - G(k)` and `sk= x(k+1) - x(k)`

The search stops when the value of the objective function falls beyond and absolute tolerance, or when it cannot be improved anymore.

Now that we have presented the main features of the algorithm, let us talk about a few additional subtleties. Three line search methods have been implemented: backracking, golden section search, and Brent's parabolic interpolation.

The backtracking starts with an initial step (default size is 1) along the quasi-Newton direction.

`val verbose : `bool ref``
Verbose printing on standard output.
`val logpoints : `string ref``
Filename used to log the points explored at each step of the BFGS search.

`type linesearch_method = `
 `|` `Backtracking` `(*` ```RECOMMENDED. Line search with a backtracking method, starting with an initial step (default 1) in a descent direction, and reducing the step until the objective function is sufficiently improved (Armijo-Goldstein condition to ensure convergence).``` `*)` `|` `Golden` `(*` `Line search using golden section method` `*)` `|` `Brent`
`val linesearch_method : `linesearch_method ref``
Line search along the quasi-Newton direction. Currently, there are three possible methods: backtracking (recommended), golden section search, Brent's parabolic interpolation.

`bt_acctol`, `bt_reduc`, and `bt_epsilon` are parameters for the backtracking method. Backtracking constists in trying a candidate step along the search direction. If the objective function is sufficiently decreased, then the step is accepted. The decrease must be at least equal to `!bt_acctol` multiplied by the norm of the gradient projection along the search direction (simplified Armijo-Goldstein condition). Otherwise, if the current step is not accepted, a new step is tried by multiplying the current step by a reduction factor `!bt_reduc`. The search stops when the step size falls under `!bt_epsilon`.

`ls_abstol`, `ls_reltol`, and `ls_max_iter` are parameters for golden section search and Brent's parabolic interpolation. For these methods, the search stops when the range of the search interval falls under an absolute tolerance `!ls_abstol`, or when it cannot be significantly reduced (relative tolerance `!ls_reltol`). If none of these criteria is met, the search stops after `!ls_max_iter` iterations.

`val bt_acctol : `float ref``
Tolerance for accepting the bactracking step.
`val bt_reduc : `float ref``
Backtracking step size reduction factor.
`val bt_epsilon : `float ref``
Backtracking stops when steps size along the search direction falls under `!bt_epsilon`.
`val ls_max_iter : `int ref``
Maximum number of iterations for the line search (Golden section search or Brent's method).
`val ls_abstol : `float ref``
Absolute tolerance for the stopping criterion (Golden section search or Brent's method).
`val ls_reltol : `float ref``
Relative tolerance for the stopping criterion (Golden section search or Brent's method).
`val step : `float ref``
Initial step length for the line search
`val bfgs : `(float array -> float) ->       (float array -> float array) ->       float array ->       int ->       float ->       float -> int -> (int -> bool) -> int * float array * float * float array``
`bfgs f g xinit max_iter abstol reltol freq_verbose debug` returns a tuple `(i,x,gx,hx)` where `i` is the iteration at which the BFGS search stopped, `x` is the best local minimum found for the function `f`, `gx` is the value of the gradient `g` at this point, and `hx` is the approximation of the inverse hessian at the minimum.

The search of a minimum stops when the maximum number of iterations is reached or when two successive values `f1` and `f2` of the objective function are closer than the absolute tolerance `abstol`, or when the relative difference between the two values falls under the relative tolerance `reltol`.

The exact formulation of the two last conditions is: `|f2-f1|< abstol` or `|f2-f1|< reltol*(|f1|+reltol)`. We add `reltol` to `|f1|` to take account of the case when `f1` is zero).

`xinit` is the initial point at which the search starts, `freq_verbose` is the frequency at which some information is printed on the standard output. `debug` is a function of the iteration number `i`. Outputs for debugging are printed when `debug i` is true.