You can use programming statements in PROC SURVEYPHREG to create time-dependent covariates and test the proportional hazards assumption for complex survey data. Consider the data set mortality from Example 93.3. The data set contains 1,891 observations from the 1992 NHANES I Epidemiologic Followup Study (NHEFS) vital and tracing status.
Suppose you want to fit a proportional hazards model to these data and construct a test of the proportional hazards assumption for gender. The following statements request a proportional hazards regression of age on gender and x, where the time-dependent covariate x is created by using programming statements. The explanatory variable x takes the value of the time variable age for the male subgroup (gender=1) and the value 0 otherwise. The variable vitalstatus is the censoring indicator; a value of 1, 4, 5, or 6 indicates a censored observation. The WEIGHT statement specifies the sampling weight, and the CLASS statement specifies that gender is a classification variable.

    proc surveyphreg data = mortality nomcar;
       class gender;
       strata varstrata;
       cluster varpsu;
       weight sweight;
       model age*vitalstatus(1 4 5 6) = gender x;
       x = age*(gender=1);
    run;

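The role of x may be easier to see when the model is written out. The following is only a sketch, in notation that is assumed here for illustration and does not come from the example itself: t denotes the time variable age, h_0(t) the baseline hazard, and β1 and β2 the coefficients of GENDER 1 and x.

```latex
% Assumed notation: t = age, h_0 = baseline hazard,
% beta_1 = coefficient of GENDER 1, beta_2 = coefficient of x.
\[
h(t \mid \text{gender}) = h_0(t)\,
  \exp\bigl(\beta_1\, I(\text{gender}=1) + \beta_2\, t\, I(\text{gender}=1)\bigr)
\]
% Hazard ratio of gender 1 relative to gender 2 at time t:
\[
\frac{h(t \mid \text{gender}=1)}{h(t \mid \text{gender}=2)}
  = \exp(\beta_1 + \beta_2\, t)
\]
```

Under proportional hazards this ratio does not depend on t, which corresponds to β2 = 0. A test of the coefficient of x is therefore a test of the assumption for gender against this particular (log-linear in time) form of departure.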
Output 93.5.1 displays some summary information. The “Number of Observations,” “Censored Summary,” and “Weighted Censored Summary” tables are exactly the same as in the example discussed in Domain Analysis.
Output 93.5.1: Data Summary, Censored Summary, and Information about Variance Estimation
Number of Observations Read | 1891 |
---|---|
Number of Observations Used | 1891 |
Sum of Weights Read | 1.0298E8 |
Sum of Weights Used | 1.0298E8 |

Summary of the Number of Event and Censored Values

Total | Event | Censored | Percent Censored |
---|---|---|---|
1891 | 717 | 1174 | 62.08 |


Summary of the Weighted Number of Event and Censored Values

Total | Event | Censored | Percent Censored |
---|---|---|---|
1.0298E8 | 27650348 | 75328323 | 73.15 |

Variance Estimation | |
---|---|
Method | Taylor Series |
Missing Values | NOMCAR |
Output 93.5.2 displays the estimated regression coefficients and their standard errors. The variable gender has two levels, and only one level is estimable. By default, PROC SURVEYPHREG estimates the first level (GENDER 1) and assigns a zero value to the second level. The estimated regression coefficient for GENDER 1 is 1.61 with a standard error of 0.71. The estimated regression coefficient for x is -0.02 with a standard error of 0.01. The t statistic for x is -1.55 with a p-value of 0.13 on 33 degrees of freedom. This test suggests that the interaction between the time variable age and gender is not significant. Therefore, there is little evidence of an exponential trend over time in the hazard ratio for gender, that is, little evidence against the proportional hazards assumption for gender.
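The reported figures can be checked by hand from the estimates and standard errors in Output 93.5.2. The block below is a worked sketch; the symbols β̂1 (GENDER 1) and β̂2 (x) are notation assumed here, and the two ages are arbitrary illustrative values in the units of the variable age, not values taken from the example.

```latex
% t statistic for x: estimate divided by its standard error
\[
t = \frac{\hat\beta_2}{\operatorname{se}(\hat\beta_2)}
  = \frac{-0.015648}{0.010082} \approx -1.55
\]
% Hazard Ratio column: exponentiated estimates
\[
\exp(\hat\beta_1) = \exp(1.605505) \approx 4.980, \qquad
\exp(\hat\beta_2) = \exp(-0.015648) \approx 0.984
\]
% Implied hazard ratio for gender 1 vs. gender 2 at two illustrative ages
\[
\exp(1.605505 - 0.015648 \times 50) \approx 2.28, \qquad
\exp(1.605505 - 0.015648 \times 70) \approx 1.67
\]
```

Because the coefficient of x is not significant, the apparent decline in the last line is illustrative only and should not be read as an established trend.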
Output 93.5.2: Parameter Estimates
Analysis of Maximum Likelihood Estimates

Parameter | DF | Estimate | Standard Error | t Value | Pr > \|t\| | Hazard Ratio |
---|---|---|---|---|---|---|
GENDER 1 | 33 | 1.605505 | 0.709269 | 2.26 | 0.0303 | 4.980 |
GENDER 2 | 33 | 0 | . | . | . | 1.000 |
x | 33 | -0.015648 | 0.010082 | -1.55 | 0.1302 | 0.984 |