In fitting a Cox model, the phenomenon of monotone likelihood is observed if the likelihood converges to a finite value while at least one parameter diverges (Heinze and Schemper, 2001).
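Let $x_l(t)$ denote the vector of explanatory variables for the $l$th individual at time $t$. Let $t_1 < t_2 < \cdots < t_k$ denote the $k$ distinct, ordered event times. Let $d_j$ denote the multiplicity of failures at $t_j$; that is, $d_j$ is the size of the set $D_j$ of individuals that fail at $t_j$. Let $R_j$ denote the risk set just before $t_j$. Let $\beta = (\beta_1, \ldots, \beta_p)'$ be the vector of regression parameters. The Breslow log partial likelihood is given by

$$
l(\beta) = \log L(\beta) = \sum_{j=1}^{k} \left\{ \beta' \sum_{l \in D_j} x_l(t_j) - d_j \log \left[ \sum_{h \in R_j} e^{\beta' x_h(t_j)} \right] \right\}
$$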
Denote
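$$
S_j^{(a)}(\beta) = \sum_{h \in R_j} e^{\beta' x_h(t_j)} \left[ x_h(t_j) \right]^{\otimes a}, \qquad a = 0, 1, 2
$$

where $v^{\otimes 0} = 1$, $v^{\otimes 1} = v$, and $v^{\otimes 2} = v v'$ for a column vector $v$.
Then the score function is given by

$$
U(\beta) \equiv \left( U(\beta_1), \ldots, U(\beta_p) \right)' = \frac{\partial l(\beta)}{\partial \beta} = \sum_{j=1}^{k} \left\{ \sum_{l \in D_j} x_l(t_j) - d_j \frac{S_j^{(1)}(\beta)}{S_j^{(0)}(\beta)} \right\}
$$

and the Fisher information matrix is given by

$$
I(\beta) = -\frac{\partial^2 l(\beta)}{\partial \beta^2} = \sum_{j=1}^{k} d_j \left\{ \frac{S_j^{(2)}(\beta)}{S_j^{(0)}(\beta)} - \left[ \frac{S_j^{(1)}(\beta)}{S_j^{(0)}(\beta)} \right] \left[ \frac{S_j^{(1)}(\beta)}{S_j^{(0)}(\beta)} \right]' \right\}
$$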
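The Breslow log partial likelihood, its score, and its information matrix can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: it assumes time-fixed covariates (each row of `X` stands in for $x_h(t_j)$ at every event time), and the function name `breslow_terms` and the small data set are hypothetical.

```python
import numpy as np

def breslow_terms(beta, time, event, X):
    """Return (log partial likelihood, score U, information I) under Breslow ties.

    Assumption: covariates are time-fixed, so X[h] is used for x_h(t_j) at all t_j.
    """
    beta = np.asarray(beta, dtype=float)
    p = X.shape[1]
    loglik, U, I = 0.0, np.zeros(p), np.zeros((p, p))
    for t in np.unique(time[event == 1]):      # distinct event times t_j
        D = (time == t) & (event == 1)         # failure set D_j
        R = time >= t                          # risk set R_j
        d = D.sum()                            # multiplicity d_j
        w = np.exp(X[R] @ beta)                # exp(beta' x_h) for h in R_j
        S0 = w.sum()                           # S_j^(0), scalar
        S1 = w @ X[R]                          # S_j^(1), p-vector
        S2 = (X[R] * w[:, None]).T @ X[R]      # S_j^(2), p x p matrix
        m = S1 / S0
        loglik += beta @ X[D].sum(axis=0) - d * np.log(S0)
        U += X[D].sum(axis=0) - d * m
        I += d * (S2 / S0 - np.outer(m, m))
    return loglik, U, I

# Hypothetical data with one tied event time (t = 2) and two censored subjects.
time = np.array([1.0, 2.0, 2.0, 3.0, 4.0, 4.0, 5.0, 6.0])
event = np.array([1, 1, 1, 0, 1, 0, 1, 1])
X = np.array([[0.5, 1.0], [1.2, 0.0], [-0.3, 1.0], [0.8, 0.0],
              [-1.0, 1.0], [0.1, 0.0], [0.4, 1.0], [-0.6, 0.0]])
beta0 = np.array([0.2, -0.1])
loglik, U, I = breslow_terms(beta0, time, event, X)
print(loglik, U, I, sep="\n")
```

A quick sanity check on such a sketch is to compare `U` with a finite-difference gradient of the log partial likelihood and `I` with the negative finite-difference Jacobian of `U`.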
Heinze (1999) and Heinze and Schemper (2001) applied the idea of Firth (1993) by maximizing the penalized partial likelihood

$$
l^*(\beta) = l(\beta) + 0.5 \log \left( |I(\beta)| \right)
$$
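The score function $U(\beta)$ is replaced by the modified score function $U^*(\beta) \equiv \left( U^*(\beta_1), \ldots, U^*(\beta_p) \right)'$, where

$$
U^*(\beta_r) = U(\beta_r) + 0.5 \, \mathrm{tr} \left\{ I(\beta)^{-1} \frac{\partial I(\beta)}{\partial \beta_r} \right\}, \qquad r = 1, \ldots, p
$$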
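The trace term is the derivative of the penalty. As a short worked step, write the penalized log partial likelihood as $l^*(\beta) = l(\beta) + 0.5 \log |I(\beta)|$ and apply Jacobi's formula for the derivative of a log-determinant, which holds whenever $I(\beta)$ is nonsingular:

$$
\frac{\partial}{\partial \beta_r} \log |I(\beta)| = \mathrm{tr} \left\{ I(\beta)^{-1} \frac{\partial I(\beta)}{\partial \beta_r} \right\}
\quad \Longrightarrow \quad
\frac{\partial l^*(\beta)}{\partial \beta_r} = U(\beta_r) + 0.5 \, \mathrm{tr} \left\{ I(\beta)^{-1} \frac{\partial I(\beta)}{\partial \beta_r} \right\}
$$

so setting the modified score to zero is the first-order condition for maximizing $l^*(\beta)$.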
The Firth estimate is obtained iteratively as
$$
\beta^{(s+1)} = \beta^{(s)} + I^{-1}(\beta^{(s)}) \, U^*(\beta^{(s)})
$$
The covariance matrix $\hat{V}$ is computed as $I^{-1}(\hat{\beta})$, where $\hat{\beta}$ is the maximum penalized partial likelihood estimate.
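The iteration can be tried on a tiny data set that exhibits monotone likelihood. The sketch below is illustrative only and makes several assumptions: covariates are time-fixed, the derivative of the information matrix is approximated by central finite differences rather than computed analytically, no step-halving is used, and the function names (`score_and_info`, `modified_score`, `firth_fit`) and the data are hypothetical.

```python
import numpy as np

def score_and_info(beta, time, event, X):
    """Breslow score U(beta) and information I(beta), time-fixed covariates."""
    p = X.shape[1]
    U, I = np.zeros(p), np.zeros((p, p))
    for t in np.unique(time[event == 1]):
        D = (time == t) & (event == 1)         # failures at t_j
        R = time >= t                          # risk set just before t_j
        w = np.exp(X[R] @ beta)
        m = (w @ X[R]) / w.sum()               # S1 / S0
        S2 = (X[R] * w[:, None]).T @ X[R] / w.sum()
        U += X[D].sum(axis=0) - D.sum() * m
        I += D.sum() * (S2 - np.outer(m, m))
    return U, I

def modified_score(beta, time, event, X, h=1e-5):
    """U*(beta_r) = U(beta_r) + 0.5 tr{I^-1 dI/dbeta_r}.

    Assumption: dI/dbeta_r is approximated by central differences with step h.
    """
    U, I = score_and_info(beta, time, event, X)
    I_inv = np.linalg.inv(I)
    U_star = U.copy()
    for r in range(len(beta)):
        step = np.zeros(len(beta))
        step[r] = h
        dI = (score_and_info(beta + step, time, event, X)[1]
              - score_and_info(beta - step, time, event, X)[1]) / (2 * h)
        U_star[r] += 0.5 * np.trace(I_inv @ dI)
    return U_star, I

def firth_fit(time, event, X, tol=1e-8, max_iter=200):
    """Iterate beta <- beta + I^-1(beta) U*(beta) from beta = 0."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        U_star, I = modified_score(beta, time, event, X)
        delta = np.linalg.solve(I, U_star)
        beta = beta + delta
        if np.max(np.abs(delta)) < tol:
            break
    cov = np.linalg.inv(score_and_info(beta, time, event, X)[1])
    return beta, cov

# Hypothetical data: every subject with x = 1 fails before every subject with
# x = 0, so the unpenalized partial likelihood keeps increasing in beta.
time = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
event = np.ones(6, dtype=int)
X = np.array([[1.0], [1.0], [1.0], [0.0], [0.0], [0.0]])
beta_hat, cov_hat = firth_fit(time, event, X)
print(beta_hat, np.sqrt(np.diag(cov_hat)))
```

On these data the ordinary score stays positive for every finite $\beta$, so an unpenalized Newton iteration would drift toward infinity, whereas the penalized iteration settles at a finite value. A production routine would typically add safeguards such as step-halving.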
Denote
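$$
x_h(t) = \left( x_{h1}(t), \ldots, x_{hp}(t) \right)'
$$

$$
Q_{jr}^{(a)}(\beta) = \sum_{h \in R_j} e^{\beta' x_h(t_j)} \, x_{hr}(t_j) \left[ x_h(t_j) \right]^{\otimes a}, \qquad a = 0, 1, 2; \; r = 1, \ldots, p
$$

Then

$$
\frac{\partial I(\beta)}{\partial \beta_r} = \sum_{j=1}^{k} d_j \left\{ \left[ \frac{Q_{jr}^{(2)}(\beta)}{S_j^{(0)}(\beta)} - \frac{Q_{jr}^{(0)}(\beta) \, S_j^{(2)}(\beta)}{\left[ S_j^{(0)}(\beta) \right]^2} \right] - \left[ \frac{Q_{jr}^{(1)}(\beta)}{S_j^{(0)}(\beta)} - \frac{Q_{jr}^{(0)}(\beta) \, S_j^{(1)}(\beta)}{\left[ S_j^{(0)}(\beta) \right]^2} \right] \left[ \frac{S_j^{(1)}(\beta)}{S_j^{(0)}(\beta)} \right]' - \left[ \frac{S_j^{(1)}(\beta)}{S_j^{(0)}(\beta)} \right] \left[ \frac{Q_{jr}^{(1)}(\beta)}{S_j^{(0)}(\beta)} - \frac{Q_{jr}^{(0)}(\beta) \, S_j^{(1)}(\beta)}{\left[ S_j^{(0)}(\beta) \right]^2} \right]' \right\}, \qquad r = 1, \ldots, p
$$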
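The derivative of the information matrix with respect to each parameter follows from the quotient and product rules applied to the risk-set sums. The sketch below computes it that way and can be compared against finite differences. It assumes time-fixed covariates. The function name `info_and_derivative`, the variable names (`S0`, `S1`, `S2` for the risk-set sums weighted by $e^{\beta' x_h}$, and `Q0`, `Q1`, `Q2` for the same sums with an extra factor $x_{hr}$), and the data are all hypothetical.

```python
import numpy as np

def info_and_derivative(beta, time, event, X):
    """Return I(beta) and dI with dI[r] = dI(beta)/dbeta_r (time-fixed covariates)."""
    p = X.shape[1]
    I = np.zeros((p, p))
    dI = np.zeros((p, p, p))
    for t in np.unique(time[event == 1]):
        d = np.sum((time == t) & (event == 1))  # number of failures at t_j
        Z = X[time >= t]                        # covariates of the risk set
        w = np.exp(Z @ beta)
        S0 = w.sum()
        S1 = w @ Z
        S2 = (Z * w[:, None]).T @ Z
        m = S1 / S0
        I += d * (S2 / S0 - np.outer(m, m))
        for r in range(p):
            wr = w * Z[:, r]                    # exp(beta' x_h) * x_hr
            Q0 = wr.sum()                       # derivative of S0 w.r.t. beta_r
            Q1 = wr @ Z                         # derivative of S1 w.r.t. beta_r
            Q2 = (Z * wr[:, None]).T @ Z        # derivative of S2 w.r.t. beta_r
            dm = Q1 / S0 - Q0 * S1 / S0**2      # derivative of S1/S0
            dI[r] += d * (Q2 / S0 - Q0 * S2 / S0**2
                          - np.outer(dm, m) - np.outer(m, dm))
    return I, dI

# Hypothetical data with one tied event time and two censored subjects.
time = np.array([1.0, 2.0, 2.0, 3.0, 4.0, 4.0, 5.0, 6.0])
event = np.array([1, 1, 1, 0, 1, 0, 1, 1])
X = np.array([[0.5, 1.0], [1.2, 0.0], [-0.3, 1.0], [0.8, 0.0],
              [-1.0, 1.0], [0.1, 0.0], [0.4, 1.0], [-0.6, 0.0]])
beta0 = np.array([0.2, -0.1])
I0, dI0 = info_and_derivative(beta0, time, event, X)

# Penalty contribution to the modified score: 0.5 * tr(I^-1 dI/dbeta_r).
trace_term = 0.5 * np.array([np.trace(np.linalg.solve(I0, dI0[r]))
                             for r in range(2)])
print(trace_term)
```

Computing the derivative from these sums avoids the extra information-matrix evaluations that a finite-difference approximation needs, at the cost of a loop over the $p$ parameters within each event time.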