Suppose that the observations to be analyzed consist of interval-censored outcomes $(L_i, R_i, \mathbf{Z}_i)$, $i = 1, \ldots, n$, where $n$ is the number of subjects and $(L_i, R_i]$ is the interval known to contain the event time of the $i$th subject.
$\mathbf{Z}_i$ denotes a $p$-dimensional vector of covariates for the $i$th subject. This notation allows for exact event times, right-censored data, and left-censored data as special cases:
when $L_i = R_i$, the observation is an exact time;
when $R_i = \infty$, the observation is right-censored;
when $L_i = 0$, the observation is left-censored.
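The special cases above can be sketched as a small classification routine. This is an illustration only: the function name `censoring_type` and the encoding of an observation as a pair of endpoints (`left`, `right`), with `math.inf` as the right endpoint of a right-censored observation and 0 as the left endpoint of a left-censored one, are assumptions and not part of any particular software interface.

```python
import math


def censoring_type(left, right):
    """Classify one observation by its endpoints (hypothetical helper)."""
    if left < 0 or left > right:
        raise ValueError("endpoints must satisfy 0 <= left <= right")
    if left == right:
        return "exact"      # event time observed exactly
    if math.isinf(right):
        return "right"      # event occurs some time after `left`
    if left == 0:
        return "left"       # event occurred at or before `right`
    return "interval"       # event lies strictly inside a finite interval


if __name__ == "__main__":
    for pair in [(2.0, 2.0), (3.0, math.inf), (0.0, 1.5), (1.0, 4.0)]:
        print(pair, censoring_type(*pair))
```

Under this ordering of the checks, the pair `(0, math.inf)` is reported as right-censored; such an observation carries no information about the event time.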
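Let $S(t \mid \mathbf{Z})$ denote the survival function for a subject whose covariate vector is $\mathbf{Z}$. Assuming that $t$ is continuous, denote $f(t \mid \mathbf{Z})$ as the density function for the subject. The hazard function for the subject, $\lambda(t \mid \mathbf{Z})$, is defined as the instantaneous failure rate at time $t$. Mathematically, the hazard function is the ratio of the density function to the survival function:

$$\lambda(t \mid \mathbf{Z}) = \frac{f(t \mid \mathbf{Z})}{S(t \mid \mathbf{Z})}$$

A quantity that is closely related to the survival function is the cumulative hazard function, defined as

$$\Lambda(t \mid \mathbf{Z}) = \int_0^t \lambda(s \mid \mathbf{Z}) \, ds$$

In turn, the cumulative hazard function determines the survival function:

$$S(t \mid \mathbf{Z}) = \exp\{-\Lambda(t \mid \mathbf{Z})\}$$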
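The relationships among the density, hazard, cumulative hazard, and survival functions can be checked numerically. The sketch below assumes a Weibull distribution purely for illustration (the text does not prescribe a distribution); the function names and the midpoint integration rule are likewise illustrative choices.

```python
import math


def weibull_survival(t, shape, scale):
    """Survival function of the (assumed) Weibull model."""
    return math.exp(-((t / scale) ** shape))


def weibull_density(t, shape, scale):
    """Density function of the (assumed) Weibull model."""
    return (shape / scale) * (t / scale) ** (shape - 1) * weibull_survival(t, shape, scale)


def hazard(t, shape, scale):
    """Hazard as the ratio of the density to the survival function."""
    return weibull_density(t, shape, scale) / weibull_survival(t, shape, scale)


def cumulative_hazard(t, shape, scale, steps=10000):
    """Integrate the hazard from 0 to t with the midpoint rule."""
    width = t / steps
    return width * sum(hazard((k + 0.5) * width, shape, scale) for k in range(steps))


if __name__ == "__main__":
    t, shape, scale = 3.0, 1.5, 2.0
    # The two printed values should agree closely.
    print(math.exp(-cumulative_hazard(t, shape, scale)))
    print(weibull_survival(t, shape, scale))
```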
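If some of the responses are left-, right-, or interval-censored, the log likelihood can be written as

$$\ell = \sum_{i \in \mathcal{E}} \log f(L_i \mid \mathbf{Z}_i) + \sum_{i \in \mathcal{R}} \log S(L_i \mid \mathbf{Z}_i) + \sum_{i \in \mathcal{L}} \log\{1 - S(R_i \mid \mathbf{Z}_i)\} + \sum_{i \in \mathcal{I}} \log\{S(L_i \mid \mathbf{Z}_i) - S(R_i \mid \mathbf{Z}_i)\}$$

where $(L_i, R_i]$ is the observed interval and $\mathbf{Z}_i$ the covariate vector of the $i$th subject, $f$ and $S$ are the density and survival functions, and the four sums are taken over the uncensored observations ($\mathcal{E}$, for which $L_i = R_i$), the right-censored observations ($\mathcal{R}$), the left-censored observations ($\mathcal{L}$), and the interval-censored observations ($\mathcal{I}$), respectively.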
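To make the four contributions concrete, the following sketch evaluates the log likelihood for a list of observations. It assumes an exponential model with survival function exp(-rate * t), chosen only because it keeps the arithmetic transparent; the name `log_likelihood`, the endpoint encoding (equal endpoints for an exact time, `math.inf` for right-censoring, 0 for left-censoring), and the absence of covariates are all simplifying assumptions.

```python
import math


def log_likelihood(observations, rate):
    """Log likelihood of (left, right) pairs under an assumed exponential model."""
    total = 0.0
    for left, right in observations:
        if left == right:
            # Uncensored: log density, log(rate) - rate * t.
            total += math.log(rate) - rate * left
        elif math.isinf(right):
            # Right-censored: log S(left).
            total += -rate * left
        elif left == 0:
            # Left-censored: log{1 - S(right)}.
            total += math.log(1.0 - math.exp(-rate * right))
        else:
            # Interval-censored: log{S(left) - S(right)}.
            total += math.log(math.exp(-rate * left) - math.exp(-rate * right))
    return total


if __name__ == "__main__":
    data = [(2.0, 2.0), (3.0, math.inf), (0.0, 1.0), (1.0, 4.0)]
    print(log_likelihood(data, rate=0.5))
```

Maximizing this function over `rate` (for example, with a one-dimensional numerical optimizer) would give the maximum likelihood estimate under the assumed model.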
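For the $i$th subject, the proportional hazards model (Cox, 1972) assumes that

$$\lambda(t \mid \mathbf{Z}_i) = \lambda_0(t) \exp(\boldsymbol{\beta}' \mathbf{Z}_i)$$

where $\boldsymbol{\beta}$ is a $p$-dimensional vector of coefficients for the covariate vector $\mathbf{Z}_i$, and $\lambda_0(t)$ is the baseline hazard function, which is the hazard rate for a subject whose covariates are all equal to 0.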
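Under the proportional hazards model with coefficient vector $\boldsymbol{\beta}$ and baseline hazard function $\lambda_0(t)$, the cumulative hazard function for the $i$th subject is

$$\Lambda(t \mid \mathbf{Z}_i) = \Lambda_0(t) \exp(\boldsymbol{\beta}' \mathbf{Z}_i), \qquad \Lambda_0(t) = \int_0^t \lambda_0(s) \, ds$$

The survival function for the $i$th subject is

$$S(t \mid \mathbf{Z}_i) = \exp\{-\Lambda_0(t) \exp(\boldsymbol{\beta}' \mathbf{Z}_i)\} = [S_0(t)]^{\exp(\boldsymbol{\beta}' \mathbf{Z}_i)}$$

where $S_0(t)$ denotes the baseline survival function and $S_0(t) = \exp\{-\Lambda_0(t)\}$.

The density function for the subject is obtained by differentiating the survival function and changing the sign:

$$f(t \mid \mathbf{Z}_i) = -\frac{d}{dt} S(t \mid \mathbf{Z}_i) = \lambda_0(t) \exp(\boldsymbol{\beta}' \mathbf{Z}_i) \, [S_0(t)]^{\exp(\boldsymbol{\beta}' \mathbf{Z}_i)}$$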
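The way a baseline function and a covariate vector combine under proportional hazards can be sketched in a few lines. The baseline used here (cumulative hazard 0.1 * t**2, hence hazard 0.2 * t) is an arbitrary assumption for illustration, as are all function names and the example coefficient values.

```python
import math


def baseline_hazard(t):
    """Assumed baseline hazard (illustrative choice)."""
    return 0.2 * t


def baseline_cumhaz(t):
    """Integral of the assumed baseline hazard from 0 to t."""
    return 0.1 * t ** 2


def linear_predictor(beta, z):
    """Inner product of the coefficient vector and the covariate vector."""
    return sum(b * x for b, x in zip(beta, z))


def ph_survival(t, beta, z):
    """Subject survival: baseline survival raised to the power exp(beta'z)."""
    return math.exp(-baseline_cumhaz(t) * math.exp(linear_predictor(beta, z)))


def ph_density(t, beta, z):
    """Subject density: subject hazard times subject survival."""
    risk = math.exp(linear_predictor(beta, z))
    return baseline_hazard(t) * risk * ph_survival(t, beta, z)


if __name__ == "__main__":
    beta, z = [0.5, -0.3], [1.0, 2.0]
    t = 2.0
    baseline_survival = math.exp(-baseline_cumhaz(t))
    # These two values should agree up to rounding.
    print(ph_survival(t, beta, z))
    print(baseline_survival ** math.exp(linear_predictor(beta, z)))
```

Because the covariates enter only through the multiplier exp(beta'z), the ratio of the hazards of two subjects does not depend on `t`, which is the defining property of the model.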
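Given these quantities, the log-likelihood function under the proportional hazards model can be expressed as

$$\ell(\boldsymbol{\beta}, \lambda_0) = \sum_{i \in \mathcal{E}} \log\left\{\lambda_0(L_i) \, e^{\boldsymbol{\beta}' \mathbf{Z}_i} \, [S_0(L_i)]^{\exp(\boldsymbol{\beta}' \mathbf{Z}_i)}\right\} + \sum_{i \in \mathcal{R}} \log\, [S_0(L_i)]^{\exp(\boldsymbol{\beta}' \mathbf{Z}_i)} + \sum_{i \in \mathcal{L}} \log\left\{1 - [S_0(R_i)]^{\exp(\boldsymbol{\beta}' \mathbf{Z}_i)}\right\} + \sum_{i \in \mathcal{I}} \log\left\{[S_0(L_i)]^{\exp(\boldsymbol{\beta}' \mathbf{Z}_i)} - [S_0(R_i)]^{\exp(\boldsymbol{\beta}' \mathbf{Z}_i)}\right\}$$

where $(L_i, R_i]$ is the observed interval and $\mathbf{Z}_i$ the covariate vector of the $i$th subject, $\lambda_0$ and $S_0$ are the baseline hazard and baseline survival functions, and the four sums are taken over the uncensored observations ($\mathcal{E}$, for which $L_i = R_i$), the right-censored observations ($\mathcal{R}$), the left-censored observations ($\mathcal{L}$), and the interval-censored observations ($\mathcal{I}$), respectively.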
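A minimal sketch of this full log likelihood follows. It takes the baseline hazard and baseline cumulative hazard as callables, so that any parametric form can be plugged in; the constant baseline in the usage example, the name `ph_log_likelihood`, and the encoding of each record as `(left, right, z)` with `math.inf` marking right-censoring and 0 marking left-censoring are assumptions made for illustration, not a description of how any particular software parameterizes the baseline.

```python
import math


def ph_log_likelihood(data, beta, baseline_hazard, baseline_cumhaz):
    """Full log likelihood for records (left, right, z) under proportional hazards."""
    total = 0.0
    for left, right, z in data:
        lin_pred = sum(b * x for b, x in zip(beta, z))
        risk = math.exp(lin_pred)
        if left == right:
            # Uncensored: log hazard plus log survival of the subject.
            total += math.log(baseline_hazard(left)) + lin_pred - baseline_cumhaz(left) * risk
        elif math.isinf(right):
            # Right-censored: log survival at the left endpoint.
            total += -baseline_cumhaz(left) * risk
        elif left == 0:
            # Left-censored: log{1 - S(right)}, using expm1 for accuracy.
            total += math.log(-math.expm1(-baseline_cumhaz(right) * risk))
        else:
            # Interval-censored: log{S(left) - S(right)}.
            surv_left = math.exp(-baseline_cumhaz(left) * risk)
            surv_right = math.exp(-baseline_cumhaz(right) * risk)
            total += math.log(surv_left - surv_right)
    return total


if __name__ == "__main__":
    rate = 0.5  # assumed constant baseline hazard
    data = [
        (2.0, 2.0, [1.0]),
        (3.0, math.inf, [0.0]),
        (0.0, 1.0, [1.0]),
        (1.0, 4.0, [0.0]),
    ]
    print(ph_log_likelihood(data, [0.3], lambda t: rate, lambda t: rate * t))
```

Both the coefficient vector and whatever parameters define the baseline callables enter the result, so an optimizer would have to search over both sets of parameters jointly.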
This likelihood function is often referred to as the full likelihood, as opposed to the partial likelihood (Cox, 1972), because it involves parameters for the baseline hazard function in addition to the regression coefficients $\boldsymbol{\beta}$. The full likelihood is often used for analyzing interval-censored data because it is not straightforward to construct a likelihood function that contains only the regression coefficients, as the Cox partial likelihood conveniently does for right-censored data (Finkelstein, 1986).