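Let

$$\mathbf{g} = \sum_j \frac{\partial l_j}{\partial \boldsymbol{\beta}}, \qquad \mathbf{H} = \sum_j \frac{\partial^2 l_j}{\partial \boldsymbol{\beta}\,\partial \boldsymbol{\beta}'}$$

be the gradient vector and the Hessian matrix, where $l_j$ is the log likelihood for the $j$th observation. With a starting value of $\boldsymbol{\beta}^{(0)}$, the pseudo-estimate $\hat{\boldsymbol{\beta}}$ of $\boldsymbol{\beta}$ is obtained iteratively until convergence is obtained:

$$\boldsymbol{\beta}^{(i+1)} = \boldsymbol{\beta}^{(i)} - \mathbf{H}^{-1}\mathbf{g}$$

where $\mathbf{H}$ and $\mathbf{g}$ are evaluated at the $i$th iteration $\boldsymbol{\beta}^{(i)}$. If the log likelihood evaluated at $\boldsymbol{\beta}^{(i+1)}$ is less than that evaluated at $\boldsymbol{\beta}^{(i)}$, then $\boldsymbol{\beta}^{(i+1)}$ is recomputed by step-halving or ridging. The iterative scheme continues until convergence is obtained, that is, until $\boldsymbol{\beta}^{(i+1)}$ is sufficiently close to $\boldsymbol{\beta}^{(i)}$. Then the maximum likelihood estimate of $\boldsymbol{\beta}$ is $\hat{\boldsymbol{\beta}} = \boldsymbol{\beta}^{(i+1)}$.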
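The iterative scheme described above can be illustrated with a minimal Python sketch. It rests on several assumptions that are not part of the text: the model is taken to be an unweighted binary logistic regression, the function names `log_likelihood` and `newton_raphson` and the parameters `tol`, `max_iter`, and `max_halvings` are hypothetical, convergence is judged by the largest absolute change in the parameter vector, and only step-halving is shown (ridging is omitted).

```python
import numpy as np

def log_likelihood(beta, X, y):
    """Bernoulli log likelihood with a logit link (an assumed model)."""
    eta = X @ beta
    # log(1 + exp(eta)) is computed stably with logaddexp
    return float(np.sum(y * eta - np.logaddexp(0.0, eta)))

def newton_raphson(X, y, beta0=None, tol=1e-8, max_iter=25, max_halvings=10):
    """Newton-Raphson with step-halving; returns (estimate, log likelihood)."""
    beta = np.zeros(X.shape[1]) if beta0 is None else np.asarray(beta0, dtype=float)
    ll = log_likelihood(beta, X, y)
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        g = X.T @ (y - p)                    # gradient of the log likelihood
        H = -(X.T * (p * (1.0 - p))) @ X     # Hessian of the log likelihood
        step = -np.linalg.solve(H, g)        # Newton step: -H^{-1} g
        new_beta = beta + step
        new_ll = log_likelihood(new_beta, X, y)
        # Step-halving: shrink the step while the log likelihood decreases
        halvings = 0
        while new_ll < ll and halvings < max_halvings:
            step = step / 2.0
            new_beta = beta + step
            new_ll = log_likelihood(new_beta, X, y)
            halvings += 1
        # Assumed stopping rule: successive iterates are sufficiently close
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta, ll = new_beta, new_ll
        if converged:
            break
    return beta, ll

# Small made-up data set (intercept plus one covariate), not from the source
X = np.column_stack([np.ones(6), np.arange(6.0)])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0])
beta_hat, ll_hat = newton_raphson(X, y)
print(beta_hat, ll_hat)
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse Hessian explicitly, which is usually the more stable choice. At convergence the gradient should be numerically zero, which gives a quick check on the result.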