Two types of ellipses
can be computed for the input data (where observations correspond
to points in a scatter plot). One is a confidence ellipse for the
population mean (TYPE=MEAN), and the other is a prediction ellipse
for a new observation (TYPE=PREDICT). Both assume a bivariate normal
distribution.
Let $\bar{Z}$ and $S$ be the sample mean and sample covariance matrix
of a random sample of size $n$ from a bivariate normal distribution with mean
$\mu$ and covariance matrix $\Sigma$. The variable $\bar{Z} - \mu$
is distributed as a bivariate normal variate with
mean zero and covariance $(1/n)\Sigma$, and it is independent of $S$.
Using Hotelling’s $T^2$ statistic, which is defined as

$$T^2 = n (\bar{Z} - \mu)' S^{-1} (\bar{Z} - \mu)$$

a $100(1-\alpha)\%$ confidence ellipse for $\mu$ is computed from the equation

$$\frac{n}{n-1} (\bar{Z} - \mu)' S^{-1} (\bar{Z} - \mu) = \frac{2}{n-2} F_{2,n-2}(1-\alpha)$$

where $F_{2,n-2}(1-\alpha)$ is the $(1-\alpha)$ critical value of an
$F$ distribution with degrees of freedom 2 and $n-2$.
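
The confidence-ellipse equation for the mean can be turned into a simple membership test. The following is a minimal Python sketch, not the procedure's own implementation. The function names (`f_crit_2`, `mean_and_cov`, `quad_form`, `in_mean_ellipse`) and the sample data are hypothetical. It assumes the covariance matrix uses the divisor n - 1, and it relies on the closed form of the F critical value that holds only when the numerator degrees of freedom equal 2.

```python
def f_crit_2(m, alpha):
    # (1 - alpha) critical value of F with 2 and m degrees of freedom.
    # Closed form valid only for numerator df = 2:
    # P(F > x) = (1 + 2x/m)^(-m/2), solved for x at tail probability alpha.
    return (m / 2.0) * (alpha ** (-2.0 / m) - 1.0)

def mean_and_cov(points):
    # Sample mean and covariance (divisor n - 1) of a list of (x, y) pairs.
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def quad_form(dx, dy, cov):
    # d' S^{-1} d for the symmetric 2x2 matrix S = [[sxx, sxy], [sxy, syy]].
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

def in_mean_ellipse(points, mu, alpha=0.05):
    # True if the candidate mean mu lies inside the 100(1 - alpha)% ellipse.
    n = len(points)
    (mx, my), cov = mean_and_cov(points)
    lhs = n / (n - 1) * quad_form(mx - mu[0], my - mu[1], cov)
    rhs = 2.0 / (n - 2) * f_crit_2(n - 2, alpha)
    return lhs <= rhs

# Hypothetical data, for illustration only.
sample = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0), (4.0, 4.0), (5.0, 6.0)]
print(in_mean_ellipse(sample, (3.0, 4.0)))    # True: the sample mean is the center
print(in_mean_ellipse(sample, (10.0, -5.0)))  # False: far from the center
```

For other numerator degrees of freedom the closed form does not apply, and a library quantile function would be needed instead.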
A prediction ellipse
is a region for predicting a new observation in the population. It
also approximates a region containing a specified percentage of the
population.
Denote a new observation
as the bivariate random variable $Z_{\mathrm{new}}$. The variable

$$Z_{\mathrm{new}} - \bar{Z} = (Z_{\mathrm{new}} - \mu) - (\bar{Z} - \mu)$$

is distributed as a
bivariate normal variate with mean zero (the zero vector) and covariance
$(1 + 1/n)\Sigma$, and it is independent of $S$. A $100(1-\alpha)\%$
prediction ellipse is then given by the equation

$$\frac{n}{n-1} (\bar{Z} - Z_{\mathrm{new}})' S^{-1} (\bar{Z} - Z_{\mathrm{new}}) = \frac{2(n+1)}{n-2} F_{2,n-2}(1-\alpha)$$
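
As a sketch of how the prediction ellipse might be evaluated, the Python fragment below computes the bound on the squared distance $(z - \bar{Z})' S^{-1} (z - \bar{Z})$ and tests whether a point falls inside. The function names and the summary statistics in the usage lines are hypothetical, and the closed-form F critical value is valid only for numerator degrees of freedom 2.

```python
def f_crit_2(m, alpha):
    # (1 - alpha) critical value of F with 2 and m degrees of freedom
    # (closed form that holds only when the numerator df is 2).
    return (m / 2.0) * (alpha ** (-2.0 / m) - 1.0)

def prediction_threshold(n, alpha=0.05):
    # Bound c such that the ellipse is (z - zbar)' S^{-1} (z - zbar) = c.
    # This moves the factor n/(n-1) to the right-hand side of the equation.
    return (n - 1) / n * 2.0 * (n + 1) / (n - 2) * f_crit_2(n - 2, alpha)

def in_prediction_ellipse(z, zbar, cov, n, alpha=0.05):
    # cov = (sxx, sxy, syy) holds the entries of the sample covariance matrix.
    sxx, sxy, syy = cov
    dx, dy = z[0] - zbar[0], z[1] - zbar[1]
    det = sxx * syy - sxy ** 2
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return d2 <= prediction_threshold(n, alpha)

# Hypothetical summary statistics: n = 30, zero mean, identity covariance.
print(in_prediction_ellipse((1.0, 1.0), (0.0, 0.0), (1.0, 0.0, 1.0), 30))
print(in_prediction_ellipse((3.0, 3.0), (0.0, 0.0), (1.0, 0.0, 1.0), 30))
```

For large n the bound approaches the 95% chi-square critical value with 2 degrees of freedom (about 5.99 at alpha = 0.05), which is consistent with the ellipse approximating a region that contains a specified percentage of the population.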
The family of ellipses
generated by different critical values of the $F$
distribution has a common center (the sample mean)
and common major and minor axis directions.
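
The common axis directions follow from the eigenvectors of the sample covariance matrix, which do not depend on the critical value. The Python sketch below illustrates this under the assumption that an ellipse is written as $d' S^{-1} d = c$ for some bound $c$. The function name `ellipse_axes` and the covariance entries are hypothetical.

```python
import math

def ellipse_axes(cov, c):
    # cov = (sxx, sxy, syy); the ellipse is d' S^{-1} d = c.
    # Returns the major-axis angle (radians) and the two semi-axis lengths.
    sxx, sxy, syy = cov
    half_trace = (sxx + syy) / 2.0
    radius = math.hypot((sxx - syy) / 2.0, sxy)
    lam_major = half_trace + radius   # larger eigenvalue of S
    lam_minor = half_trace - radius   # smaller eigenvalue of S
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return theta, math.sqrt(c * lam_major), math.sqrt(c * lam_minor)

cov = (1.0, 0.5, 1.0)  # hypothetical covariance entries
for c in (1.0, 4.0):
    theta, a, b = ellipse_axes(cov, c)
    # theta is the same for every c; only the lengths a and b change.
    print(round(theta, 6), round(a, 6), round(b, 6))
```

Changing the bound c rescales both semi-axes by the same factor, the square root of c, so the ellipses are concentric and similar in shape.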
The shape of an ellipse
depends on the aspect ratio of the plot. The ellipse indicates the
correlation between the two variables if the variables are standardized
(by dividing the variables by their respective standard deviations).
In this situation, the ratio between the major and minor axis lengths
is

$$\sqrt{\frac{1 + |r|}{1 - |r|}}$$

where $r$ is the sample correlation coefficient. In particular, if
$r = 0$, the ratio is 1, which corresponds to a circular
confidence contour and indicates that the variables are uncorrelated.
A larger value of the ratio indicates a larger positive or negative
correlation between the variables.
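
The axis ratio for standardized variables can be checked numerically. In the Python sketch below, the function name `axis_ratio` is hypothetical. It assumes that |r| is strictly less than 1 and uses the fact that the correlation matrix [[1, r], [r, 1]] has eigenvalues 1 + |r| and 1 - |r|, whose square roots are proportional to the axis lengths.

```python
import math

def axis_ratio(r):
    # Ratio of major to minor axis length for standardized variables.
    if abs(r) >= 1.0:
        raise ValueError("the ratio is undefined when |r| >= 1")
    return math.sqrt((1.0 + abs(r)) / (1.0 - abs(r)))

for r in (0.0, 0.6, -0.6, 0.9):
    print(r, axis_ratio(r))
```

The ratio depends only on the magnitude of r, so it shows the strength of the correlation but not its sign. The sign appears in the orientation of the major axis.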