In fitting a Cox model, the phenomenon of monotone likelihood is observed if the likelihood converges to a finite value while at least one parameter diverges (Heinze and Schemper, 2001).
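Let $x_l(t)$ denote the vector of explanatory variables for the $l$th individual at time $t$. Let $t_1 < t_2 < \cdots < t_k$ denote the $k$ distinct, ordered event times. Let $d_j$ denote the multiplicity of failures at $t_j$; that is, $d_j$ is the size of the set $D_j$ of individuals that fail at $t_j$. Let $R_j$ denote the risk set just before $t_j$. Let $\beta = (\beta_1, \ldots, \beta_p)'$ be the vector of regression parameters. The Breslow log partial likelihood is given by
\[
l(\beta) = \log L(\beta) = \sum_{j=1}^{k} \left\{ \beta' \sum_{l \in D_j} x_l(t_j) - d_j \log\left[ \sum_{h \in R_j} e^{\beta' x_h(t_j)} \right] \right\}
\]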
Denote
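\[
S_j^{(a)}(\beta) = \sum_{h \in R_j} e^{\beta' x_h(t_j)} \, [x_h(t_j)]^{\otimes a}, \qquad a = 0, 1, 2
\]
where, for a column vector $v$, $v^{\otimes 0} = 1$, $v^{\otimes 1} = v$, and $v^{\otimes 2} = v v'$.
Then the score function is given by
\[
U(\beta) \equiv (U(\beta_1), \ldots, U(\beta_p))' = \frac{\partial l(\beta)}{\partial \beta} = \sum_{j=1}^{k} \left\{ \sum_{l \in D_j} x_l(t_j) - d_j \frac{S_j^{(1)}(\beta)}{S_j^{(0)}(\beta)} \right\}
\]
and the Fisher information matrix is given by
\[
I(\beta) = -\frac{\partial^2 l(\beta)}{\partial \beta^2} = \sum_{j=1}^{k} d_j \left\{ \frac{S_j^{(2)}(\beta)}{S_j^{(0)}(\beta)} - \left[ \frac{S_j^{(1)}(\beta)}{S_j^{(0)}(\beta)} \right] \left[ \frac{S_j^{(1)}(\beta)}{S_j^{(0)}(\beta)} \right]' \right\}
\]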
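The quantities above can be sketched numerically. The following is a minimal illustration, assuming time-fixed covariates (so each individual's covariate vector does not depend on $t$) and NumPy arrays as inputs; the function name `breslow_terms` and its arguments are hypothetical and are not part of any procedure described here.

```python
import numpy as np

def breslow_terms(beta, X, time, event):
    """Breslow log partial likelihood, score, and information (illustrative sketch).

    Assumes time-fixed covariates: X is (n, p), time is (n,), event is (n,)
    with 1 = failure and 0 = censored.
    """
    beta = np.asarray(beta, dtype=float)
    X = np.asarray(X, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event)
    p = X.shape[1]
    loglik, score, info = 0.0, np.zeros(p), np.zeros((p, p))
    for t in np.unique(time[event == 1]):       # distinct event times t_j
        D = (time == t) & (event == 1)          # individuals failing at t_j
        Xr = X[time >= t]                       # risk set just before t_j
        d = D.sum()                             # multiplicity d_j
        w = np.exp(Xr @ beta)                   # exp(beta' x_h) for h in R_j
        S0 = w.sum()                            # S^(0)
        S1 = w @ Xr                             # S^(1), a p-vector
        S2 = (Xr * w[:, None]).T @ Xr           # S^(2), a p x p matrix
        xbar = S1 / S0
        sum_xD = X[D].sum(axis=0)
        loglik += beta @ sum_xD - d * np.log(S0)
        score += sum_xD - d * xbar
        info += d * (S2 / S0 - np.outer(xbar, xbar))
    return loglik, score, info
```

A quick way to sanity-check such a sketch is to compare the returned score with a finite-difference gradient of the returned log partial likelihood at an arbitrary $\beta$.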
Heinze (1999) and Heinze and Schemper (2001) applied the idea of Firth (1993) by maximizing the penalized partial likelihood
\[
l^*(\beta) = l(\beta) + 0.5 \log\left( |I(\beta)| \right)
\]
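The score function $U(\beta)$ is replaced by the modified score function $U^*(\beta) \equiv (U^*(\beta_1), \ldots, U^*(\beta_p))'$, where
\[
U^*(\beta_r) = U(\beta_r) + 0.5 \, \mathrm{tr}\left\{ I^{-1}(\beta) \frac{\partial I(\beta)}{\partial \beta_r} \right\}, \qquad r = 1, \ldots, p
\]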
The Firth estimate is obtained iteratively as
\[
\beta^{(s+1)} = \beta^{(s)} + I^{-1}(\beta^{(s)}) \, U^*(\beta^{(s)})
\]
The covariance matrix $\hat{V}$ is computed as $I^{-1}(\hat{\beta})$, where $\hat{\beta}$ is the maximum penalized partial likelihood estimate.
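To evaluate $\partial I(\beta) / \partial \beta_r$, $r = 1, \ldots, p$, explicitly, denote
\[
x_h(t) = (x_{h1}(t), \ldots, x_{hp}(t))'
\]
\[
Q_{jr}^{(a)}(\beta) = \sum_{h \in R_j} e^{\beta' x_h(t_j)} \, x_{hr}(t_j) \, [x_h(t_j)]^{\otimes a}, \qquad a = 0, 1, 2; \quad r = 1, \ldots, p
\]
where $v^{\otimes 0} = 1$, $v^{\otimes 1} = v$, and $v^{\otimes 2} = v v'$ for a column vector $v$, and $S_j^{(a)}(\beta) = \sum_{h \in R_j} e^{\beta' x_h(t_j)} [x_h(t_j)]^{\otimes a}$.
Then
\[
\frac{\partial I(\beta)}{\partial \beta_r} = \sum_{j=1}^{k} d_j \left\{ \left[ \frac{Q_{jr}^{(2)}(\beta)}{S_j^{(0)}(\beta)} - \frac{Q_{jr}^{(0)}(\beta)}{S_j^{(0)}(\beta)} \, \frac{S_j^{(2)}(\beta)}{S_j^{(0)}(\beta)} \right] - A_{jr}(\beta) \left[ \frac{S_j^{(1)}(\beta)}{S_j^{(0)}(\beta)} \right]' - \left[ \frac{S_j^{(1)}(\beta)}{S_j^{(0)}(\beta)} \right] A_{jr}(\beta)' \right\}, \qquad r = 1, \ldots, p
\]
with
\[
A_{jr}(\beta) = \frac{Q_{jr}^{(1)}(\beta)}{S_j^{(0)}(\beta)} - \frac{Q_{jr}^{(0)}(\beta) \, S_j^{(1)}(\beta)}{\left[ S_j^{(0)}(\beta) \right]^2}
\]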
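The iteration and the explicit derivative above can be put together in a short sketch. This is an illustration under stated assumptions, not the procedure's actual implementation: covariates are time-fixed, the iteration starts at $\beta = 0$, full steps are taken with no step-halving or other safeguards, and the names `firth_terms` and `firth_cox` are hypothetical.

```python
import numpy as np

def firth_terms(beta, X, time, event):
    """Return U(beta), I(beta), and dI with dI[r] = dI(beta)/d beta_r."""
    p = X.shape[1]
    U, I, dI = np.zeros(p), np.zeros((p, p)), np.zeros((p, p, p))
    for t in np.unique(time[event == 1]):           # distinct event times t_j
        D = (time == t) & (event == 1)              # failures at t_j
        Xr = X[time >= t]                           # risk set just before t_j
        d = D.sum()
        w = np.exp(Xr @ beta)
        S0 = w.sum()
        m = (w @ Xr) / S0                           # S^(1) / S^(0)
        V = (Xr * w[:, None]).T @ Xr / S0           # S^(2) / S^(0)
        U += X[D].sum(axis=0) - d * m
        I += d * (V - np.outer(m, m))
        for r in range(p):
            wr = w * Xr[:, r]                       # weights times x_hr
            q0 = wr.sum() / S0                      # Q^(0)_r / S^(0)
            q1 = (wr @ Xr) / S0                     # Q^(1)_r / S^(0)
            q2 = (Xr * wr[:, None]).T @ Xr / S0     # Q^(2)_r / S^(0)
            a = q1 - q0 * m                         # Q1/S0 - Q0*S1/S0^2
            dI[r] += d * (q2 - q0 * V - np.outer(a, m) - np.outer(m, a))
    return U, I, dI

def firth_cox(X, time, event, max_iter=100, tol=1e-8):
    """Iterate beta <- beta + I^{-1} U* until the step is below tol."""
    X = np.asarray(X, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event)
    p = X.shape[1]
    beta = np.zeros(p)                              # assumed starting value
    for _ in range(max_iter):
        U, I, dI = firth_terms(beta, X, time, event)
        I_inv = np.linalg.inv(I)
        # Modified score: U*_r = U_r + 0.5 * tr(I^{-1} dI/d beta_r)
        U_star = U + 0.5 * np.array([np.trace(I_inv @ dI[r]) for r in range(p)])
        step = I_inv @ U_star
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    # Covariance estimate: inverse information at the final iterate
    _, I, _ = firth_terms(beta, X, time, event)
    return beta, np.linalg.inv(I)
```

As a usage example, consider six individuals with one binary covariate, where the three individuals with covariate value 1 fail at times 1, 2, 3 and the three with value 0 are censored at times 4, 5, 6. The unpenalized partial likelihood is monotone in $\beta$ for these data, whereas `firth_cox(np.array([[1.], [1.], [1.], [0.], [0.], [0.]]), np.array([1., 2., 3., 4., 5., 6.]), np.array([1, 1, 1, 0, 0, 0]))` returns a finite estimate.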