This example uses computer-generated data to mimic a Hertzsprung-Russell plot (Struve and Zebergs, 1962, p. 259) of the temperature and luminosity of stars. The data are plotted and displayed in Output 66.4.1. There appear to be two main groups of stars and a collection of isolated stars. The long straggling group of points appearing diagonally across the figure represents the main group of stars; the more compact group in the top-right corner contains giant stars. PROC MODECLUS is run with the JOIN= option at a 0.05 significance level for several values of the smoothing parameter (R=1, 1.5, 2, and 2.5). The CK=5 option is specified in order to prevent the numerous outliers from forming separate clusters. The results from PROC MODECLUS are displayed in Output 66.4.2. The cluster memberships are then plotted by PROC SGPLOT, as displayed in Output 66.4.3 through Output 66.4.5.
Note that the graphic output from PROC SGPLOT in Output 66.4.3 is not available when _R_=2.5, because only one cluster remains after joining at a 5% significance level and the results are therefore not written to the OUT= data set. See the description of the JOIN= option for more information.
The following statements produce Output 66.4.1 through Output 66.4.5:
title 'Hertzsprung-Russell Plot of Visible Stars';
title2 'Computer-Generated Simulated Data';

data hr;
   input x y @@;
   label x='-Temperature' y='-Luminosity';
   datalines;
1.0 12.8   0.9 13.7   0.9 12.9   1.0 12.3   1.0 12.2
2.6 10.9   2.4 10.9   2.5 11.2   2.3 11.5   2.6 12.0
2.4 12.1   2.3 10.9   2.6 11.5   2.5 11.9   2.4 11.0
3.4 11.1   3.3 11.2   3.4 11.1   3.4 9.9   3.2 10.4
3.5 10.8   3.4 11.0   3.3 11.2   3.3 10.8   3.5 10.0
3.5 10.2   3.4 10.2   3.6 10.6   3.7 10.4   3.7 10.1
3.4 10.7   3.4 10.8   3.3 11.0   3.6 10.8   3.5 10.1
4.5 10.3   4.6 9.4   4.3 10.3   4.6 9.4   4.4 9.9
4.5 10.4   4.4 9.9   4.6 9.4   4.4 10.7   4.4 9.3
4.4 9.5   4.1 10.6   4.4 10.6   4.5 10.3   4.4 10.0
4.2 9.8   4.5 9.5   4.2 13.4   4.6 10.4   4.5 9.8
5.8 8.8   5.6 8.4   5.6 13.9   5.7 9.5   5.6 14.5
5.6 9.2   5.7 8.7   5.7 9.4   5.7 9.3   5.6 9.4
5.8 9.8   5.5 8.8   5.8 8.9   5.7 9.4   5.6 12.1
5.4 10.1   5.8 9.3   5.9 9.0   5.7 10.0   5.6 9.3
6.6 8.6   6.7 8.5   6.7 12.5

   ... more lines ...

26.4 14.1   26.6 14.2   27.5 13.7   27.6 14.4   27.8 14.0
27.4 14.7   25.8 13.5   25.6 13.6   26.8 14.4   26.4 19.0
26.0 13.4   27.3 14.0   27.5 14.3   27.4 14.5   26.3 13.8
26.9 13.7   26.3 13.7   27.7 14.3   27.3 14.1   28.3 14.2
17.4 15.5   13.8 15.2   12.0 11.6   14.1 12.8   17.1 10.2
16.9 15.4   18.5 12.6   14.2 16.1   23.2 6.6   11.4 12.4
20.4 11.7   20.9 8.1   18.9 13.7   16.9 9.7   15.5 9.9
18.3 14.2   19.3 13.7   17.0 12.9   10.1 11.6   17.9 13.5
14.3 1.4   13.1 -0.8   8.1 -0.9   20.0 7.0   21.0 8.5
15.6 13.2
;

proc sgplot data=hr;
   scatter y=y x=x;
run;

proc modeclus data=hr m=1 r=1 1.5 2 2.5 ck=5 join=.05 short out=out;
run;

title2 'MODECLUS Analysis';
proc sgplot data=out;
   scatter y=y x=x / group=cluster;
   by _R_;
run;
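
As a rough check on which radii actually reach the OUT= data set, a PROC FREQ step along the following lines could tabulate cluster membership for each value of _R_. This step is a sketch and is not part of the original example. It assumes that the OUT= data set is named out, as in the statements above, and that it contains the variables _R_ and CLUSTER, which the PROC SGPLOT step already relies on. The TITLE2 text is made up for illustration.

```sas
/* Sketch, not part of the original example:                        */
/* count the observations in each cluster for every radius that     */
/* was written to the OUT= data set.                                */
title2 'Cluster Sizes by Radius';
proc freq data=out;
   /* MISSING keeps observations whose CLUSTER value is missing */
   tables _R_*cluster / missing nopercent norow nocol;
run;
```

If the results for _R_=2.5 are indeed omitted from the OUT= data set, the table should contain no row for that radius.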