The PHREG Procedure

Time and CLASS Variables Usage

The following DATA step creates an artificial data set, Test, to be used in this section. There are four variables in Test: the variable T contains the failure times; the variable Status is the censoring indicator variable with the value 1 for an uncensored failure time and the value 0 for a censored time; the variable A is a categorical variable with values 1, 2, and 3 representing three different categories; and the variable MirrorT is an exact copy of T.

data Test;
   input T Status A @@;
   MirrorT = T;
   datalines;
 23        1      1    7        0      1
 23        1      1   10        1      1
 20        0      1   13        0      1
 24        1      1   10        1      1
 18        1      2    6        1      2
 18        0      2    6        1      2
 13        0      2   13        1      2
  9        0      2   15        1      2
  8        1      3    6        1      3
 12        0      3    4        1      3
 11        1      3    8        1      1
  6        1      3    7        1      3
  7        1      3   12        1      3
  9        1      2   15        1      2
  3        1      2   14        0      3
  6        1      1   13        1      2
;

Time Variable on the Right Side of the MODEL Statement

When the time variable is explicitly used in an explanatory effect in the MODEL statement, the effect is not time-dependent. In the following specification, T is the time variable, but T does not play the role of the time variable in the explanatory effect T*A:

proc phreg data=Test;
   class A;
   model T*Status(0)=T*A;
run;

The parameter estimates of this model are shown in Figure 67.12.

Figure 67.12: T*A Effect

The PHREG Procedure

Analysis of Maximum Likelihood Estimates

Parameter      DF   Parameter   Standard   Chi-Square   Pr > ChiSq   Hazard   Label
                    Estimate    Error                                Ratio
T*A       1    1    -0.16549    0.05042    10.7734      0.0010       .        A 1 * T
T*A       2    1    -0.11852    0.04181    8.0344       0.0046       .        A 2 * T


To verify that the effect T*A in the MODEL statement is not time-dependent, T is replaced by MirrorT, which is an exact copy of T, as in the following statements:

proc phreg data=Test;
   class A;
   model T*Status(0)=A*MirrorT;
run;

The results of fitting this model (Figure 67.13) are identical to those of the previous model (Figure 67.12), except for the parameter names and labels. The effect A*MirrorT is not time-dependent, so neither is A*T.

Figure 67.13: T*A Effect

The PHREG Procedure

Analysis of Maximum Likelihood Estimates

Parameter        DF   Parameter   Standard   Chi-Square   Pr > ChiSq   Hazard   Label
                      Estimate    Error                                Ratio
MirrorT*A   1    1    -0.16549    0.05042    10.7734      0.0010       .        A 1 * MirrorT
MirrorT*A   2    1    -0.11852    0.04181    8.0344       0.0046       .        A 2 * MirrorT
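
The comparison of Figure 67.12 with Figure 67.13 can also be made programmatically rather than by eye. The following statements are a minimal sketch of one way to do this: each fit writes its parameter estimates to a data set through ODS OUTPUT, and PROC COMPARE then checks the numeric columns. The data set names EstT and EstMirror are arbitrary, and the column names in the VAR statement are assumed to be those of the ParameterEstimates table; check them with PROC CONTENTS before relying on them.

```sas
/* Capture the estimates from the T*A model (EstT is an arbitrary name) */
ods output ParameterEstimates=EstT;
proc phreg data=Test;
   class A;
   model T*Status(0)=T*A;
run;

/* Capture the estimates from the A*MirrorT model */
ods output ParameterEstimates=EstMirror;
proc phreg data=Test;
   class A;
   model T*Status(0)=A*MirrorT;
run;

/* Compare only the numeric columns; parameter names and labels differ  */
/* by construction. The column names below are assumed, not guaranteed. */
proc compare base=EstT compare=EstMirror
             method=absolute criterion=1e-8;
   var Estimate StdErr ChiSq ProbChiSq;
run;
```

If the two fits agree, PROC COMPARE should report no unequal values for the listed variables.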


CLASS Variables and Programming Statements

In PROC PHREG, the levels of CLASS variables are determined by the CLASS statement and the input data and are not affected by user-supplied programming statements. Consider the following statements, which produce the results in Figure 67.14. Variable A is declared as a CLASS variable in the CLASS statement. By default, the reference parameterization is used with A=3 as the reference level. Two regression coefficients are estimated for the two dummy variables of A.

proc phreg data=Test;
   class A;
   model T*Status(0)=A;
run;

Figure 67.14 shows the dummy variables of A and the regression coefficient estimates.

Figure 67.14: Design Variable and Regression Coefficient Estimates

The PHREG Procedure

Class Level Information

Class   Value   Design Variables
A       1       1   0
        2       0   1
        3       0   0

Analysis of Maximum Likelihood Estimates

Parameter      DF   Parameter   Standard   Chi-Square   Pr > ChiSq   Hazard   Label
                    Estimate    Error                                Ratio
A         1    1    -1.40925    0.64802    4.7293       0.0297       0.244    A 1
A         2    1    -0.65705    0.51764    1.6112       0.2043       0.518    A 2


Now consider a programming statement that attempts to change the value of the CLASS variable A, as in the following specification:

proc phreg data=Test;
   class A;
   model T*Status(0)=A;
   if A=3 then A=2;
run;

Results of this analysis are shown in Figure 67.15 and are identical to those in Figure 67.14. The programming statement if A=3 then A=2; has no effect on the design variables for A, which have already been determined.

Figure 67.15: Design Variable and Regression Coefficient Estimates

The PHREG Procedure

Class Level Information

Class   Value   Design Variables
A       1       1   0
        2       0   1
        3       0   0

Analysis of Maximum Likelihood Estimates

Parameter      DF   Parameter   Standard   Chi-Square   Pr > ChiSq   Hazard   Label
                    Estimate    Error                                Ratio
A         1    1    -1.40925    0.64802    4.7293       0.0297       0.244    A 1
A         2    1    -0.65705    0.51764    1.6112       0.2043       0.518    A 2
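
If the intent really is to combine levels 2 and 3 of A, the recoding has to happen before PROC PHREG reads the data, because the procedure determines the CLASS levels from the input data set. The following statements are a minimal sketch of that approach; the data set name TestMerged is an arbitrary choice for illustration.

```sas
/* Recode A in a DATA step so that PROC PHREG sees only two levels. */
/* TestMerged is a hypothetical name; Test is left unchanged.       */
data TestMerged;
   set Test;
   if A=3 then A=2;
run;

/* A now has levels 1 and 2. With the default reference            */
/* parameterization, the last level (A=2) is the reference, so a    */
/* single design variable is generated.                             */
proc phreg data=TestMerged;
   class A;
   model T*Status(0)=A;
run;
```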


Additionally, a variable that has been declared in the CLASS statement is not treated as a collection of the corresponding design variables when it is used in a programming statement. Consider the following statements:

proc phreg data=Test;
   class A;
   model T*Status(0)=A X;
   X=T*A;
run;

The CLASS variable A generates two design variables as explanatory variables. The variable X created by the X=T*A programming statement is a single time-dependent covariate whose values are evaluated using the exact values of A given in the data, not the dummy-coded values that represent the levels of A. In data set Test, A assumes the values of 1, 2, and 3, and these are the exact values that are used in producing X. If A were a character variable with values 'Bird', 'Cat', and 'Dog', the programming statement X=T*A would have produced an error in the attempt to multiply a number by a character value. The results are shown in Figure 67.16.

Figure 67.16: Single Time-Dependent Variable X*A

The PHREG Procedure

Analysis of Maximum Likelihood Estimates

Parameter      DF   Parameter   Standard   Chi-Square   Pr > ChiSq   Hazard   Label
                    Estimate    Error                                Ratio
A         1    1    0.15798     1.69338    0.0087       0.9257       1.171    A 1
A         2    1    0.00898     0.87573    0.0001       0.9918       1.009    A 2
X              1    0.09268     0.09535    0.9448       0.3311       1.097


To generalize the simple test of the proportional hazards assumption to the design variables of A (as in the section Classical Method of Maximum Likelihood), you specify the following statements, which are not the same as the preceding program or the specification in the section Time Variable on the Right Side of the MODEL Statement:

proc phreg data=Test;
   class A;
   model T*Status(0)=A X1 X2;
   X1= T*(A=1);
   X2= T*(A=2);
run;

The Boolean parenthetical expressions (A=1) and (A=2) resolve to a value of 1 or 0, depending on whether the expression is true or false, respectively.

Results of this test are shown in Figure 67.17.

Figure 67.17: Simple Test of Proportional Hazards Assumption

The PHREG Procedure

Analysis of Maximum Likelihood Estimates

Parameter      DF   Parameter   Standard   Chi-Square   Pr > ChiSq   Hazard   Label
                    Estimate    Error                                Ratio
A         1    1    -0.00766    1.69435    0.0000       0.9964       0.992    A 1
A         2    1    -0.88132    1.64298    0.2877       0.5917       0.414    A 2
X1             1    -0.15522    0.20174    0.5920       0.4417       0.856
X2             1    0.01155     0.18858    0.0037       0.9512       1.012
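
Figure 67.17 reports a separate Wald test for each of X1 and X2. The following statements are a sketch of two follow-ups, offered under stated assumptions rather than as part of the original example. The DATA _NULL_ step shows how the Boolean expressions evaluate for one value of A. The PROC PHREG step adds a TEST statement, which is assumed here to accept the program-defined covariates X1 and X2 like any other model parameter, to request a joint test that both coefficients are zero. The label PropHaz is a hypothetical name.

```sas
/* Boolean expressions evaluate to 1 (true) or 0 (false). */
/* For A=2, the log shows IsOne=0 IsTwo=1.                */
data _null_;
   A = 2;
   IsOne = (A=1);
   IsTwo = (A=2);
   put IsOne= IsTwo=;
run;

/* Same model as above, plus a joint Wald test of X1=0 and X2=0. */
/* PropHaz is an arbitrary label for the test.                   */
proc phreg data=Test;
   class A;
   model T*Status(0)=A X1 X2;
   X1 = T*(A=1);
   X2 = T*(A=2);
   PropHaz: test X1, X2;
run;
```

A joint test has the advantage of assessing the proportional hazards assumption for A as a whole instead of one design variable at a time.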


In general, when your model contains a categorical explanatory variable that is time-dependent, it might be necessary to use hardcoded dummy variables to represent the categories of the categorical variable. Alternatively, you might consider using the counting-process style of input where you break up the covariate history of an individual into a number of records with nonoverlapping start and stop times and declare the categorical time-dependent variable in the CLASS statement.
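
The counting-process alternative can be sketched as follows. The data set Followup and its variables (ID, Start, Stop, Event, and Stage) are invented for illustration and are not part of the Test data; the values are too few for meaningful inference. Each subject contributes one record per interval over which the categorical covariate Stage is constant, and Event is 1 only on a record that ends in a failure.

```sas
/* Hypothetical data: one record per (Start, Stop] interval.      */
/* Stage can change between records of the same subject.          */
data Followup;
   input ID Start Stop Event Stage $;
   datalines;
1  0  5 0 Low
1  5  9 1 High
2  0  7 1 Low
3  0  4 0 Low
3  4 12 0 High
4  0  6 0 Low
4  6 10 1 High
5  0 11 1 Low
6  0  3 0 Low
6  3  8 1 High
7  0 14 0 Low
8  0  2 0 Low
8  2 13 1 High
;

/* Stage is time-dependent through the record structure, so it    */
/* can be declared in the CLASS statement and no programming      */
/* statements are needed.                                         */
proc phreg data=Followup;
   class Stage;
   model (Start,Stop)*Event(0)=Stage;
run;
```

Because the time dependence is carried by the records rather than by programming statements, the CLASS statement generates the design variables for Stage in the usual way.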