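There are three basic types of dependence measures: linear correlation, rank correlation, and tail dependence. For two random variables $X_1$ and $X_2$, linear correlation is given by
$$\rho(X_1, X_2) \equiv \mathrm{corr}(X_1, X_2) = \frac{\mathrm{cov}(X_1, X_2)}{\sqrt{\mathrm{var}(X_1)\,\mathrm{var}(X_2)}}$$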
The linear correlation coefficient carries very limited information about the joint properties of the variables. It is well known that uncorrelated variables need not be independent, whereas independent variables are always uncorrelated. In addition, there exist distinct bivariate distributions that have the same marginal distributions and the same correlation coefficient. These results suggest that the linear correlation must be interpreted with caution.
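As an illustration of why zero correlation does not imply independence, the following minimal sketch simulates a standard normal variable and its square. The helper name `pearson`, the seed, and the sample size are assumptions chosen for the example, not part of the original text.

```python
import random
import statistics


def pearson(xs, ys):
    """Sample linear correlation: cov(X, Y) / sqrt(var(X) * var(Y))."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # arbitrary fixed seed, for reproducibility only
x = [rng.gauss(0.0, 1.0) for _ in range(20000)]
y = [v * v for v in x]  # Y is a deterministic function of X

# X and Y = X^2 are uncorrelated in the population, so this is close to 0.
r_xy = pearson(x, y)

# |X| and Y are strongly correlated, which exposes the dependence.
r_abs = pearson([abs(v) for v in x], y)

print(f"corr(X, X^2)   = {r_xy:.3f}")
print(f"corr(|X|, X^2) = {r_abs:.3f}")
```

Although $Y$ is completely determined by $X$, the linear correlation between them is near zero, so the coefficient alone would wrongly suggest no relationship.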
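Another statistical measure of dependence is called rank correlation, which is nonparametric. Kendall’s tau, for example, is the covariance between the sign statistics $\mathrm{sign}(X_1 - \tilde{X}_1)$ and $\mathrm{sign}(X_2 - \tilde{X}_2)$, where $(\tilde{X}_1, \tilde{X}_2)$ is an independent copy of $(X_1, X_2)$:
$$\rho_\tau(X_1, X_2) \equiv E\left[\mathrm{sign}\big((X_1 - \tilde{X}_1)(X_2 - \tilde{X}_2)\big)\right]$$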
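The sign function (sometimes written as sgn) is defined by
$$\mathrm{sign}(x) = \begin{cases} -1 & \text{if } x < 0 \\ 0 & \text{if } x = 0 \\ 1 & \text{if } x > 0 \end{cases}$$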
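Spearman’s rho is the linear correlation between the transformed random variables, where $F_1$ and $F_2$ denote the distribution functions of $X_1$ and $X_2$:
$$\rho_S(X_1, X_2) \equiv \rho\big(F_1(X_1), F_2(X_2)\big)$$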
The variables are transformed by their distribution functions so that the transformed variables are uniformly distributed
on $[0, 1]$. The rank correlations depend only on the copula of the random variables and are indifferent to the marginal distributions.
Like linear correlation, the rank correlations have their limitations. In particular, there are different copulas that result
in the same rank correlation.
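The insensitivity of the rank correlations to the marginal distributions can be checked numerically. The sketch below computes sample versions of Kendall’s tau and Spearman’s rho before and after strictly increasing transformations of each variable. The function names, the simulated data, and the choice of transformations are assumptions made for illustration, and the rank helper assumes there are no ties.

```python
import math
import random


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def pearson(xs, ys):
    """Sample linear correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def kendall_tau(xs, ys):
    """Average of sign(x_i - x_j) * sign(y_i - y_j) over all pairs i < j."""
    n = len(xs)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            total += sign(xs[i] - xs[j]) * sign(ys[i] - ys[j])
    return total / (n * (n - 1) / 2)


def ranks(vs):
    """Ranks 1..n of the sample (assumes no ties)."""
    order = sorted(range(len(vs)), key=lambda k: vs[k])
    out = [0] * len(vs)
    for r, k in enumerate(order, start=1):
        out[k] = r
    return out


def spearman_rho(xs, ys):
    """Linear correlation of the ranks, a sample analogue of Spearman's rho."""
    return pearson(ranks(xs), ranks(ys))


rng = random.Random(42)  # arbitrary seed
x = [rng.gauss(0.0, 1.0) for _ in range(300)]
y = [a + 0.5 * rng.gauss(0.0, 1.0) for a in x]

# Strictly increasing transformations change the margins but not the ranks.
x_t = [math.exp(a) for a in x]
y_t = [b ** 3 for b in y]

tau_raw, tau_t = kendall_tau(x, y), kendall_tau(x_t, y_t)
rho_raw, rho_t = spearman_rho(x, y), spearman_rho(x_t, y_t)
r_raw, r_t = pearson(x, y), pearson(x_t, y_t)

print("Kendall tau :", tau_raw, tau_t)
print("Spearman rho:", rho_raw, rho_t)
print("Pearson r   :", r_raw, r_t)
```

The two rank correlations are unchanged by the transformations, whereas the linear correlation is not, which is the sense in which the rank correlations depend only on the copula.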
A third measure, tail dependence, focuses on only part of the joint behavior of the variables: it measures the dependence when both variables take extreme values. Formally, it can be defined as a limit of conditional probabilities of quantile exceedances. There are two types of tail dependence:
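The upper tail dependence, denoted $\lambda_u$, is
$$\lambda_u(X_1, X_2) \equiv \lim_{q \to 1^-} P\big(X_2 > F_2^{-1}(q) \mid X_1 > F_1^{-1}(q)\big)$$
when the limit exists, in which case $\lambda_u \in [0, 1]$. Here $F_j^{-1}$ is the quantile function (that is, the inverse of the CDF) of $X_j$.
The lower tail dependence, denoted $\lambda_l$, is defined symmetrically:
$$\lambda_l(X_1, X_2) \equiv \lim_{q \to 0^+} P\big(X_2 \le F_2^{-1}(q) \mid X_1 \le F_1^{-1}(q)\big)$$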
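The conditional exceedance probability inside the limit can be approximated from data at a fixed quantile level. The following is a rough sketch under stated assumptions: it replaces the unknown quantile functions with ranks scaled by $n + 1$, it evaluates the probability at a single finite level `q` rather than taking a limit, and the function names are hypothetical. It should be read as a diagnostic, not as an estimator of the limiting coefficient itself.

```python
import random


def pseudo_observations(vs):
    """Map a sample into (0, 1) using rank / (n + 1); assumes no ties."""
    n = len(vs)
    order = sorted(range(n), key=lambda k: vs[k])
    u = [0.0] * n
    for r, k in enumerate(order, start=1):
        u[k] = r / (n + 1)
    return u


def upper_exceedance(xs, ys, q):
    """Empirical P(Y above its q-quantile | X above its q-quantile)."""
    u, v = pseudo_observations(xs), pseudo_observations(ys)
    exceed_x = [k for k in range(len(u)) if u[k] > q]
    both = sum(1 for k in exceed_x if v[k] > q)
    return both / len(exceed_x)


def lower_exceedance(xs, ys, q):
    """Empirical P(Y at or below its q-quantile | X at or below its q-quantile)."""
    u, v = pseudo_observations(xs), pseudo_observations(ys)
    below_x = [k for k in range(len(u)) if u[k] <= q]
    both = sum(1 for k in below_x if v[k] <= q)
    return both / len(below_x)


rng = random.Random(1)  # arbitrary seed
n = 20000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y_indep = [rng.gauss(0.0, 1.0) for _ in range(n)]  # independent of x
y_mono = [2.0 * a + 1.0 for a in x]                # increasing function of x

up_indep = upper_exceedance(x, y_indep, 0.95)
up_mono = upper_exceedance(x, y_mono, 0.95)
low_indep = lower_exceedance(x, y_indep, 0.05)
low_mono = lower_exceedance(x, y_mono, 0.05)

print("independent:", up_indep, low_indep)
print("monotone   :", up_mono, low_mono)
```

For independent variables the upper conditional probability is roughly $1 - q$ and shrinks toward 0 as $q \to 1$, while for perfectly increasing dependence it stays at 1. Repeating the calculation over several values of `q` gives a sense of whether the probability appears to settle at a positive value.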
Tail dependence is hard to detect by looking at a scatter plot of realizations of two random variables. One graphical way
to detect tail dependence between two variables is by creating the chi plot of those two variables. The chi plot, as defined
in Fisher and Switzer (2001), has characteristic patterns that depend on the dependence structure between the variables. The chi plot for the random variables $X$ and $Y$ is a scatter plot of the pairs $(\lambda_i, \chi_i)$ for each data point $(x_i, y_i)$. The value $\lambda_i$ is a measure of the distance of the data point $(x_i, y_i)$ from the center of the data as measured by the median values of $X$ and $Y$, and $\chi_i$ is a correlation coefficient between dichotomized values of $X$ and $Y$. A positive $\lambda_i$ means that $x_i$ and $y_i$ are either both large with respect to their median values or both small. A negative $\lambda_i$ means that $x_i$ or $y_i$ is large with respect to its median, whereas the other value is small. Signs of tail dependence manifest as clusters of points that are significantly far from the $\chi = 0$ axis around $\lambda_i$ values of 1. If $X$ and $Y$ are uncorrelated, the $\chi_i$ values cluster around the $\chi = 0$ axis.
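The coordinates of a chi plot can be computed directly from a sample. The text above does not give the formulas, so the sketch below follows the commonly cited Fisher and Switzer construction as an assumption: for each point, $H_i$, $F_i$, and $G_i$ are the proportions of the other points that lie at or below it jointly, in $x$, and in $y$; then $\chi_i = (H_i - F_i G_i) / \sqrt{F_i (1 - F_i) G_i (1 - G_i)}$ and $\lambda_i = 4\,\mathrm{sign}\big((F_i - \tfrac{1}{2})(G_i - \tfrac{1}{2})\big) \max\{(F_i - \tfrac{1}{2})^2, (G_i - \tfrac{1}{2})^2\}$. The function name `chi_plot_points` is hypothetical.

```python
import math
import random


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def chi_plot_points(xs, ys):
    """Return a list of (lambda_i, chi_i) pairs for the sample.

    Points for which chi_i is undefined (F_i or G_i equal to 0 or 1)
    are skipped.
    """
    n = len(xs)
    points = []
    for i in range(n):
        h = f = g = 0
        for j in range(n):
            if j == i:
                continue
            below_x = xs[j] <= xs[i]
            below_y = ys[j] <= ys[i]
            f += below_x
            g += below_y
            h += below_x and below_y
        H, F, G = h / (n - 1), f / (n - 1), g / (n - 1)
        denom = F * (1.0 - F) * G * (1.0 - G)
        if denom <= 0.0:
            continue  # chi_i is undefined at the extreme points
        chi = (H - F * G) / math.sqrt(denom)
        lam = 4.0 * sign((F - 0.5) * (G - 0.5)) * max((F - 0.5) ** 2, (G - 0.5) ** 2)
        points.append((lam, chi))
    return points


rng = random.Random(7)  # arbitrary seed
x = [rng.gauss(0.0, 1.0) for _ in range(300)]
y_indep = [rng.gauss(0.0, 1.0) for _ in range(300)]
y_dep = [a + 0.3 * rng.gauss(0.0, 1.0) for a in x]

pts_indep = chi_plot_points(x, y_indep)
pts_dep = chi_plot_points(x, y_dep)

# Typical magnitude of chi_i in each case (median of absolute values).
med_indep = sorted(abs(c) for _, c in pts_indep)[len(pts_indep) // 2]
med_dep = sorted(abs(c) for _, c in pts_dep)[len(pts_dep) // 2]
print("median |chi|, independent sample:", med_indep)
print("median |chi|, dependent sample  :", med_dep)
```

Passing the returned pairs to any scatter-plot routine, with $\lambda_i$ on the horizontal axis and $\chi_i$ on the vertical axis, produces the chi plot. This double loop costs $O(n^2)$ operations, which is acceptable for a few hundred points but would need a rank-based implementation for large samples.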