Correlation refers to any of a broad class of statistical relationships involving dependence.
Recognize the fundamental meanings of correlation and dependence.
Researchers often want to know how two or more variables are related. For example, is there a relationship between the grade on the second math exam a student takes and the grade on the final exam? If there is a relationship, what is it, and how strong is it? As another example, your income may be determined by your education and your profession. The amount you pay a repair person for labor is often determined by an initial amount plus an hourly fee. These are all examples of the statistical concept of correlation. Note that the data described in these examples is bivariate (“bi” for two variables). In practice, statisticians also use multivariate data, meaning many variables. For instance, your income may be determined by your education, profession, years of experience, and ability.

Correlation and Dependence

Dependence refers to any statistical relationship between two random variables or two sets of data. Correlation refers to any of a broad class of statistical relationships involving dependence. Familiar examples of dependent phenomena include the correlation between the physical statures of parents and their offspring and the correlation between the demand for a product and its price. Correlations are useful because they can indicate a predictive relationship that can be exploited in practice. For example, an electrical utility may produce less power on a mild day based on the correlation between electricity demand and weather. In this example, there is a causal relationship, because extreme weather causes people to use more electricity for heating or cooling; however, statistical dependence is not sufficient to demonstrate the presence of such a causal relationship (i.e., correlation does not imply causation). Formally, dependence refers to any situation in which random variables do not satisfy a mathematical condition of probabilistic independence.
In loose usage, correlation can refer to any departure of two or more random variables from independence, but technically it refers to any of several more specialized types of relationship between mean values.
A scatter diagram is a type of mathematical diagram using Cartesian coordinates to display values for two variables in a set of data.
Demonstrate the role that scatter diagrams play in revealing correlation.
A scatter plot, or diagram, is a type of mathematical diagram using Cartesian coordinates to display values for two variables in a set of data. The data is displayed as a collection of points, each having the value of one variable determining the position on the horizontal axis and the value of the other variable determining the position on the vertical axis. In the case of an experiment, a scatter plot is used when a variable exists that is under the control of the experimenter. The controlled parameter (or independent variable) is customarily plotted along the horizontal axis, while the measured parameter (or dependent variable) is customarily plotted along the vertical axis. If no dependent variable exists, either type of variable can be plotted on either axis, and a scatter plot will illustrate only the degree of correlation (not causation) between two variables. This is the context in which we view scatter diagrams.

Relevance to Correlation

A scatter plot shows the direction and strength of a relationship between the variables. A clear direction appears in one of the following cases:
You can determine the strength of the relationship by looking at the scatter plot and seeing how close the points are to a line, a power function, an exponential function, or to some other type of function. When you look at a scatterplot, you want to notice the overall pattern and any deviations from the pattern. The following scatterplot examples illustrate these concepts.
Trend Lines

To study the correlation between the variables, one can draw a line of best fit (known as a “trend line”). An equation for the correlation between the variables can be determined by established best-fit procedures. For a linear correlation, the best-fit procedure is known as linear regression and is guaranteed to generate a correct solution in a finite time. No universal best-fit procedure is guaranteed to generate a correct solution for arbitrary relationships.

Other Uses of Scatter Plots

A scatter plot is also useful to show how two comparable data sets agree with each other. In this case, an identity line (i.e., a [latex]\text{y}=\text{x}[/latex] line or [latex]1:1[/latex] line) is often drawn as a reference. The more the two data sets agree, the more the points tend to concentrate in the vicinity of the identity line; if the two data sets are numerically identical, the points fall on the identity line exactly. One of the most powerful aspects of a scatter plot, however, is its ability to show nonlinear relationships between variables. Furthermore, if the data is represented by a mixed model of simple relationships, these relationships will be visually evident as superimposed patterns.
The correlation coefficient is a measure of the linear dependence between two variables [latex]\text{X}[/latex] and [latex]\text{Y}[/latex], giving a value between [latex]+1[/latex] and [latex]-1[/latex].
Compute Pearson’s product-moment correlation coefficient.
The most common coefficient of correlation is known as the Pearson product-moment correlation coefficient, or Pearson’s [latex]\text{r}[/latex]. It is a measure of the linear correlation (dependence) between two variables [latex]\text{X}[/latex] and [latex]\text{Y}[/latex], giving a value between [latex]+1[/latex] and [latex]-1[/latex]. It is widely used in the sciences as a measure of the strength of linear dependence between two variables. It was developed by Karl Pearson from a related idea introduced by Francis Galton in the 1880s. Pearson’s correlation coefficient between two variables is defined as the covariance of the two variables divided by the product of their standard deviations. The form of the definition involves a “product moment”, that is, the mean (the first moment about the origin) of the product of the mean-adjusted random variables; hence the modifier product-moment in the name. Pearson’s correlation coefficient when applied to a population is commonly represented by the Greek letter [latex]\rho[/latex] (rho) and may be referred to as the population correlation coefficient or the population Pearson correlation coefficient. Pearson’s correlation coefficient when applied to a sample is commonly represented by the letter [latex]\text{r}[/latex] and may be referred to as the sample correlation coefficient or the sample Pearson correlation coefficient. The formula for [latex]\text{r}[/latex] is as follows: [latex]\displaystyle \text{r} = \frac{\displaystyle{\frac{\sum \text{xy}}{\text{n}}} - \bar{\text{x}}\bar{\text{y}}}{\text{s}_\text{x} \text{s}_\text{y}} \left(\frac{\text{n}}{\text{n}-1}\right)[/latex] An equivalent expression gives the correlation coefficient as the mean of the products of the standard scores. 
Based on a sample of paired data [latex](\text{X}_\text{i}, \text{Y}_\text{i})[/latex], the sample Pearson correlation coefficient is given by: [latex]\displaystyle \text{r} = \frac{1}{\text{n}-1} \sum_{\text{i}=1}^\text{n} \left(\frac{\text{X}_\text{i}-\bar{\text{X}}}{\text{s}_\text{X}} \right)\left(\frac{\text{Y}_\text{i}-\bar{\text{Y}}}{\text{s}_\text{Y}} \right)[/latex]

Mathematical Properties
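As a sketch of this standard-scores formula, the following Python function (the function and variable names are our own, not from the text) computes the sample Pearson coefficient directly from paired data:

```python
import math

def pearson_r(x, y):
    """Sample Pearson r: the mean of products of standard scores,
    using sample (n - 1) standard deviations."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_x = math.sqrt(sum((v - x_bar) ** 2 for v in x) / (n - 1))
    s_y = math.sqrt(sum((v - y_bar) ** 2 for v in y) / (n - 1))
    return sum((a - x_bar) / s_x * (b - y_bar) / s_y
               for a, b in zip(x, y)) / (n - 1)

print(pearson_r([1, 2, 3], [2, 4, 6]))  # perfectly linear data -> 1.0
```

Perfectly linear data gives [latex]\text{r}=1[/latex] exactly, since every pair of standard scores is identical.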
Another key mathematical property of the Pearson correlation coefficient is that it is invariant to separate changes in location and scale in the two variables. That is, we may transform [latex]\text{X}[/latex] to [latex]\text{a}+\text{bX}[/latex] and transform [latex]\text{Y}[/latex] to [latex]\text{c}+\text{dY}[/latex], where [latex]\text{a}[/latex], [latex]\text{b}[/latex], [latex]\text{c}[/latex], and [latex]\text{d}[/latex] are constants with [latex]\text{b}, \text{d} > 0[/latex], without changing the correlation coefficient. (If [latex]\text{b}[/latex] or [latex]\text{d}[/latex] is negative, the sign of the coefficient flips.) This fact holds for both the population and sample Pearson correlation coefficients.

Example

Consider the following example data set of scores on a third exam and scores on a final exam:
To find the correlation of this data we need the summary statistics: the means, standard deviations, sample size, and the sum of the products of [latex]\text{x}[/latex] and [latex]\text{y}[/latex]. To find [latex]\sum \text{xy}[/latex], multiply the [latex]\text{x}[/latex] and [latex]\text{y}[/latex] in each ordered pair together, then sum these products. For this problem, [latex]\sum \text{xy} = 122,500[/latex]. We also need the mean of [latex]\text{x}[/latex], the mean of [latex]\text{y}[/latex], the standard deviation of [latex]\text{x}[/latex], and the standard deviation of [latex]\text{y}[/latex]: [latex]\bar{\text{x}} = 69.1818 \\ \bar{\text{y}} = 160.4545 \\ \text{s}_\text{x} = 2.85721 \\ \text{s}_\text{y} = 20.8008 \\ \text{n} = 11[/latex] Put the summary statistics into the correlation coefficient formula and solve for [latex]\text{r}[/latex], the correlation coefficient. [latex]\displaystyle \text{r}=\frac { \frac { 122,500 }{ 11 } -\left( 69.1818 \right) \left( 160.4545 \right) }{ \left( 2.85721 \right) \left( 20.8008 \right) } \left( \frac { 11 }{ 11-1 } \right) =0.6631[/latex]
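The arithmetic above can be checked directly from the quoted summary statistics. A minimal sketch in Python, plugging the text's numbers into the grouped formula:

```python
# Summary statistics from the third exam / final exam example above
n = 11
sum_xy = 122_500
x_bar, y_bar = 69.1818, 160.4545
s_x, s_y = 2.85721, 20.8008

# r = (sum(xy)/n - x_bar*y_bar) / (s_x * s_y) * (n / (n - 1))
r = (sum_xy / n - x_bar * y_bar) / (s_x * s_y) * (n / (n - 1))
print(round(r, 3))  # about 0.663
```

The small discrepancy in the fourth decimal place against the quoted value comes from using rounded summary statistics.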
The coefficient of determination provides a measure of how well observed outcomes are replicated by a model.
Interpret the properties of the coefficient of determination in regard to correlation.
The coefficient of determination (denoted [latex]\text{r}^2[/latex]) is a statistic used in the context of statistical models. Its main purpose is either the prediction of future outcomes or the testing of hypotheses on the basis of other related information. It provides a measure of how well observed outcomes are replicated by the model, as the proportion of total variation of outcomes explained by the model. Values for [latex]\text{r}^2[/latex] can be calculated for any type of predictive model, which need not have a statistical basis.

The Math

A data set will have observed values and modelled values, sometimes known as predicted values. The “variability” of the data set is measured through different sums of squares, such as:
The most general definition of the coefficient of determination is: [latex]\displaystyle \text{r}^2 = 1-\frac{\text{SS}_\text{err}}{\text{SS}_\text{tot}}[/latex] where [latex]\text{SS}_\text{err}[/latex] is the residual sum of squares and [latex]\text{SS}_\text{tot}[/latex] is the total sum of squares.

Properties and Interpretation of [latex]\text{r}^2[/latex]

In simple linear regression, the coefficient of determination is the square of the correlation coefficient. It is usually stated as a percent, rather than in decimal form. In the context of data, [latex]\text{r}^2[/latex] can be interpreted as follows:
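A minimal sketch of this general definition in Python (the function and argument names are our own):

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_err / SS_tot."""
    mean = sum(observed) / len(observed)
    # total sum of squares: variation of the data around its mean
    ss_tot = sum((y - mean) ** 2 for y in observed)
    # residual sum of squares: variation left over after the model
    ss_err = sum((y - f) ** 2 for y, f in zip(observed, predicted))
    return 1 - ss_err / ss_tot

print(r_squared([1, 2, 3, 4], [1, 2, 3, 4]))          # perfect model -> 1.0
print(r_squared([1, 2, 3, 4], [2.5, 2.5, 2.5, 2.5]))  # predicting the mean -> 0.0
```

A model that reproduces every observation exactly scores 1; a model that always predicts the mean scores 0, since then [latex]\text{SS}_\text{err}=\text{SS}_\text{tot}[/latex].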
So [latex]\text{r}^2[/latex] is a statistic that will give some information about the goodness of fit of a model. In regression, the [latex]\text{r}^2[/latex] coefficient of determination is a statistical measure of how well the regression line approximates the real data points. An [latex]\text{r}^2[/latex] of 1 indicates that the regression line perfectly fits the data. In many (but not all) instances where [latex]\text{r}^2[/latex] is used, the predictors are calculated by ordinary least-squares regression: that is, by minimizing [latex]\text{SS}_\text{err}[/latex]. In this case, [latex]\text{r}^2[/latex] increases as we increase the number of variables in the model. This illustrates a drawback to one possible use of [latex]\text{r}^2[/latex], where one might keep adding variables to increase the [latex]\text{r}^2[/latex] value. For example, if one is trying to predict the sales of a car model from the car’s gas mileage, price, and engine power, one can include such irrelevant factors as the first letter of the model’s name or the height of the lead engineer designing the car because the [latex]\text{r}^2[/latex] will never decrease as variables are added and will probably experience an increase due to chance alone. This leads to the alternative approach of looking at the adjusted [latex]\text{r}^2[/latex]. The explanation of this statistic is almost the same as [latex]\text{r}^2[/latex] but it penalizes the statistic as extra variables are included in the model. Note that [latex]\text{r}^2[/latex] does not indicate whether:
Example

Consider the third exam/final exam example introduced in the previous section. The correlation coefficient is [latex]\text{r}=0.6631[/latex]. Therefore, the coefficient of determination is [latex]\text{r}^2 = 0.6631^2 = 0.4397[/latex]. The interpretation of [latex]\text{r}^2[/latex] in the context of this example is as follows. Approximately 44% of the variation (0.4397 is approximately 0.44) in the final exam grades can be explained by the variation in the grades on the third exam. Therefore, approximately 56% of the variation ([latex]1-0.44=0.56[/latex]) in the final exam grades can NOT be explained by the variation in the grades on the third exam.
The trend line (line of best fit) is a line that can be drawn on a scatter diagram representing a trend in the data.
Illustrate the method of drawing a trend line and what it represents.
The trend line, or line of best fit, is a line that can be drawn on a scatter diagram representing a trend in the data. It tells whether a particular data set has increased or decreased over a period of time. A trend line could simply be drawn by eye through a set of data points, but more properly its position and slope are calculated using statistical techniques like linear regression. Trend lines typically are straight lines, although some variations use higher degree polynomials depending on the degree of curvature desired in the line. Trend lines are often used to argue that a particular action or event (such as training, or an advertising campaign) caused observed changes at a point in time. This is a simple technique, and does not require a control group, experimental design, or a sophisticated analysis technique. However, it suffers from a lack of scientific validity in cases where other potential changes can affect the data. The mathematical process which determines the unique line of best fit is based on what is called the method of least squares – which explains why this line is sometimes called the least squares line. This method works by:
Drawing a Trend Line

The line of best fit is drawn by:
The closeness (or otherwise) of the cloud of data points to the line suggests the concept of spread or dispersion. The graph below shows what happens when we draw the line of best fit from the first data point to the last: it does not go through the median position, as there is one data point above and three data points below the blue line. This is a common mistake to avoid.
To determine the equation for the line of best fit:
Example

Consider the data in the graph below:
To determine the equation for the line of best fit:
[latex]\displaystyle \text{gradient}=\frac { 1100-700 }{ 110-50 } =6.67[/latex] [latex]\displaystyle \hat { \text{Y} } =\text{A}+\left( \frac { 400 }{ 60 } \right) \text{X}[/latex]
[latex]\displaystyle 700=\text{A}+\left( \frac { 400 }{ 60 } \right) 50[/latex] [latex]\displaystyle700=\text{A}+\frac { 20,000 }{ 60 }[/latex] [latex]366.67 =\text{A}[/latex]
[latex]\hat { \text{Y} } =366.67+6.67\text{X}[/latex]
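The gradient-and-intercept arithmetic of this example can be sketched in a few lines of Python. The two points below are the ones used in the worked example, read off the drawn line:

```python
# Two points chosen on the drawn line of best fit (from the example above)
x1, y1 = 50, 700
x2, y2 = 110, 1100

gradient = (y2 - y1) / (x2 - x1)   # rise over run: 400 / 60, about 6.67
a = y1 - gradient * x1             # solve 700 = A + gradient * 50, about 366.67

print(round(gradient, 2), round(a, 2))  # the trend line: Y-hat = A + gradient * X
```

Substituting either chosen point back into [latex]\hat{\text{Y}} = \text{A} + \text{gradient} \cdot \text{X}[/latex] recovers its [latex]\text{y}[/latex]-value, which is a quick sanity check on the algebra.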
Other types of correlation coefficients include intraclass correlation and the concordance correlation coefficient.
Distinguish the intraclass and concordance correlation coefficients from previously discussed correlation coefficients.
The intraclass correlation (or the intraclass correlation coefficient, abbreviated ICC) is a descriptive statistic that can be used when quantitative measurements are made on units that are organized into groups. It describes how strongly units in the same group resemble each other. While it is viewed as a type of correlation, unlike most other correlation measures it operates on data structured as groups rather than data structured as paired observations. The intraclass correlation is commonly used to quantify the degree to which individuals with a fixed degree of relatedness (e.g., full siblings) resemble each other in terms of a quantitative trait. Another prominent application is the assessment of consistency or reproducibility of quantitative measurements made by different observers measuring the same quantity. The intraclass correlation can be regarded within the framework of analysis of variance (ANOVA), and more recently it has been regarded in the framework of a random effects model. Most of the estimators can be defined in terms of the random effects model: [latex]\text{Y}_{\text{ij}} = \mu + \alpha_\text{j} + \epsilon_{\text{ij}}[/latex] where [latex]\text{Y}_{\text{ij}}[/latex] is the [latex]\text{i}[/latex]th observation in the [latex]\text{j}[/latex]th group, [latex]\mu[/latex] is an unobserved overall mean, [latex]\alpha_\text{j}[/latex] is an unobserved random effect shared by all values in group [latex]\text{j}[/latex], and [latex]\epsilon_{\text{ij}}[/latex] is an unobserved noise term. For the model to be identified, the [latex]\alpha_\text{j}[/latex] and [latex]\epsilon_{\text{ij}}[/latex] are assumed to have expected value zero and to be uncorrelated with each other. Also, the [latex]\alpha_\text{j}[/latex] are assumed to be identically distributed, and the [latex]\epsilon_{\text{ij}}[/latex] are assumed to be identically distributed.
The variance of [latex]\alpha_\text{j}[/latex] is denoted [latex]\sigma_{\alpha}^2[/latex] and the variance of [latex]\epsilon_{\text{ij}}[/latex] is denoted [latex]\sigma_{\epsilon}^2[/latex]. The population ICC in this framework is: [latex]\displaystyle \frac{\sigma_{\alpha}^2}{\sigma_{\alpha}^2 + \sigma_{\epsilon}^2}[/latex]

Relationship to Pearson’s Correlation Coefficient

One key difference between the two statistics is that in the ICC, the data are centered and scaled using a pooled mean and standard deviation, whereas in the Pearson correlation, each variable is centered and scaled by its own mean and standard deviation. This pooled scaling for the ICC makes sense because all measurements are of the same quantity (albeit on units in different groups). For example, in a paired data set where each “pair” is a single measurement made for each of two units (e.g., weighing each twin in a pair of identical twins) rather than two different measurements for a single unit (e.g., measuring height and weight for each individual), the ICC is a more natural measure of association than Pearson’s correlation. An important property of the Pearson correlation is that it is invariant to application of separate linear transformations to the two variables being compared. Thus, if we are correlating [latex]\text{X}[/latex] and [latex]\text{Y}[/latex], where, say, [latex]\text{Y}=2\text{X}+1[/latex], the Pearson correlation between [latex]\text{X}[/latex] and [latex]\text{Y}[/latex] is 1: a perfect correlation. This property does not make sense for the ICC, since there is no basis for deciding which transformation is applied to each value in a group. However, if all the data in all groups are subjected to the same linear transformation, the ICC does not change.

Concordance Correlation Coefficient

The concordance correlation coefficient measures the agreement between two variables (e.g., to evaluate reproducibility or for inter-rater reliability).
The formula is written as: [latex]\rho_\text{c} = \dfrac{2\rho\sigma_\text{x}\sigma_\text{y}}{\sigma_\text{x}^2+\sigma_\text{y}^2+(\mu_\text{x} - \mu_\text{y})^2}[/latex] where [latex]{ \mu }_{ \text{x} }[/latex] and [latex]{ \mu }_{ \text{y} }[/latex] are the means for the two variables and [latex]{ { \sigma }^{ 2 } }_{ \text{x} }[/latex] and [latex]{ { \sigma }^{ 2 } }_{ \text{y} }[/latex] are the corresponding variances.

Relation to Other Measures of Correlation

Whereas Pearson’s correlation coefficient is unaffected by whether the biased or unbiased version of the variance estimate is used, the concordance correlation coefficient is not. The concordance correlation coefficient is nearly identical to some of the measures called intraclass correlations. Comparisons of the concordance correlation coefficient with an “ordinary” intraclass correlation on different data sets will find only small differences between the two correlations.
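A sketch of the concordance formula in Python, using the biased (divide-by-[latex]\text{n}[/latex]) variances and covariance; the function name is our own. Note that [latex]\rho\sigma_\text{x}\sigma_\text{y}[/latex] in the numerator is simply the covariance of the two variables:

```python
def concordance_ccc(x, y):
    """Concordance correlation: 2*cov / (var_x + var_y + (mean_x - mean_y)^2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    var_x = sum((v - mx) ** 2 for v in x) / n   # biased (1/n) variance
    var_y = sum((v - my) ** 2 for v in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n  # = rho*sigma_x*sigma_y
    return 2 * cov / (var_x + var_y + (mx - my) ** 2)

print(concordance_ccc([1, 2, 3], [1, 2, 3]))  # identical series -> 1.0
print(concordance_ccc([1, 2, 3], [2, 3, 4]))  # constant shift lowers agreement
```

The second call illustrates the contrast with Pearson’s coefficient: the shifted series still has Pearson [latex]\text{r}=1[/latex], but the concordance coefficient drops because the means differ.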
A prediction interval is an estimate of an interval in which future observations will fall with a certain probability given what has already been observed.
Formulate a prediction interval and compare it to other types of statistical intervals.
In predictive inference, a prediction interval is an estimate of an interval in which future observations will fall, with a certain probability, given what has already been observed. A prediction interval bears the same relationship to a future observation that a frequentist confidence interval or Bayesian credible interval bears to an unobservable population parameter. Prediction intervals predict the distribution of individual future points, whereas confidence intervals and credible intervals of parameters predict the distribution of estimates of the true population mean or other quantity of interest that cannot be observed. Prediction intervals are also present in forecasts; however, some experts have shown that it is difficult to estimate the prediction intervals of forecasts that have contrary series. Prediction intervals are often used in regression analysis. For example, let’s say one makes the parametric assumption that the underlying distribution is a normal distribution and has a sample set [latex]\{\text{X}_1, \dots, \text{X}_\text{n}\}[/latex]. Then, confidence intervals and credible intervals may be used to estimate the population mean [latex]\mu[/latex] and population standard deviation [latex]\sigma[/latex] of the underlying population, while prediction intervals may be used to estimate the value of the next sample variable, [latex]\text{X}_{\text{n}+1}[/latex]. Alternatively, in Bayesian terms, a prediction interval can be described as a credible interval for the variable itself, rather than for a parameter of the distribution thereof. The concept of prediction intervals need not be restricted to the inference of just a single future sample value but can be extended to more complicated cases. For example, in the context of river flooding, where analyses are often based on annual values of the largest flow within the year, there may be interest in making inferences about the largest flood likely to be experienced within the next 50 years. 
Since prediction intervals are only concerned with past and future observations, rather than unobservable population parameters, they are advocated as a better method than confidence intervals by some statisticians.

Prediction Intervals in the Normal Distribution

Given a sample from a normal distribution, whose parameters are unknown, it is possible to give prediction intervals in the frequentist sense — i.e., an interval [latex][\text{a}, \text{b}][/latex] based on statistics of the sample such that on repeated experiments, [latex]\text{X}_{\text{n}+1}[/latex] falls in the interval the desired percentage of the time. A general technique of frequentist prediction intervals is to find and compute a pivotal quantity of the observables [latex]\text{X}_1, \dots, \text{X}_\text{n}, \text{X}_{\text{n}+1}[/latex] – meaning a function of observables and parameters whose probability distribution does not depend on the parameters – that can be inverted to give a probability of the future observation [latex]\text{X}_{\text{n}+1}[/latex] falling in some interval computed in terms of the observed values so far. The usual method of constructing pivotal quantities is to take the difference of two variables that depend on location, so that location cancels out, and then take the ratio of two variables that depend on scale, so that scale cancels out. The most familiar pivotal quantity is the Student’s [latex]\text{t}[/latex]-statistic, which can be derived by this method.
A prediction interval [latex][\text{l}, \text{u}][/latex] for a future observation [latex]\text{X}[/latex] in a normal distribution [latex]\text{N}(\mu, \sigma^2)[/latex] with known mean and variance may easily be calculated from the formula: [latex]\displaystyle \begin{align} \gamma&=\text{P}(\text{l}< \text{X}< \text{u}) \\ &=\text{P}\left(\frac{\text{l}-\mu}{\sigma}< \frac{\text{X}-\mu}{\sigma}< \frac{\text{u}-\mu}{\sigma}\right)\\& =\text{P}\left(\frac{\text{l}-\mu}{\sigma}< \text{Z}< \frac{\text{u}-\mu}{\sigma}\right) \end{align}[/latex] where [latex]\displaystyle \text{Z}=\frac { \text{X}-\mu }{ \sigma }[/latex], the standard score of [latex]\text{X}[/latex], follows a standard normal distribution. The prediction interval is conventionally written as: [latex]\left[ \mu -\text{z}\sigma,\quad \mu +\text{z}\sigma \right][/latex] For example, to calculate the 95% prediction interval for a normal distribution with a mean ([latex]\mu[/latex]) of 5 and a standard deviation ([latex]\sigma[/latex]) of 1, [latex]\text{z}[/latex] is approximately 2. Therefore, the lower limit of the prediction interval is approximately [latex]5 - (1\cdot2) = 3[/latex], and the upper limit is approximately 7, thus giving a prediction interval of approximately 3 to 7.
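For the known-parameter case above, the interval can be computed with Python’s standard library. This sketch takes [latex]\text{z}[/latex] from the inverse normal CDF rather than rounding it to 2:

```python
from statistics import NormalDist

mu, sigma = 5, 1
z = NormalDist().inv_cdf(0.975)   # two-sided 95% -> z is about 1.96
lower, upper = mu - z * sigma, mu + z * sigma

print(round(lower, 2), round(upper, 2))  # roughly 3.04 to 6.96, i.e. about 3 to 7
```

Using the exact quantile 1.96 instead of the rounded value 2 slightly narrows the interval quoted in the text.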
A rank correlation is a statistic used to measure the relationship between rankings of ordinal variables or different rankings of the same variable.
Define rank correlation and illustrate how it differs from linear correlation.
A rank correlation is any of several statistics that measure the relationship between rankings of different ordinal variables or different rankings of the same variable. In this context, a “ranking” is the assignment of the labels “first”, “second”, “third”, et cetera, to different observations of a particular variable. A rank correlation coefficient measures the degree of similarity between two rankings and can be used to assess the significance of the relation between them. If, for example, one variable is the identity of a college basketball program and another variable is the identity of a college football program, one could test for a relationship between the poll rankings of the two types of program. One could then ask: do colleges with a higher-ranked basketball program tend to have a higher-ranked football program? A rank correlation coefficient can measure that relationship, and the measure of significance of the rank correlation coefficient can show whether the measured relationship is small enough to be likely to be a coincidence. If there is only one variable—for example, the identity of a college football program—but it is subject to two different poll rankings (say, one by coaches and one by sportswriters), then the similarity of the two different polls’ rankings can be measured with a rank correlation coefficient.

Rank Correlation Coefficients

Rank correlation coefficients, such as Spearman’s rank correlation coefficient and Kendall’s rank correlation coefficient, measure the extent to which, as one variable increases, the other variable tends to increase, without requiring that increase to be represented by a linear relationship.
If, as one variable increases, the other decreases, the rank correlation coefficients will be negative. It is common to regard these rank correlation coefficients as alternatives to Pearson’s coefficient, used either to reduce the amount of calculation or to make the coefficient less sensitive to non-normality in distributions. However, this view has little mathematical basis, as rank correlation coefficients measure a different type of relationship than the Pearson product-moment correlation coefficient. They are best seen as measures of a different type of association rather than as alternative measures of the population correlation coefficient. An increasing rank correlation coefficient implies increasing agreement between rankings. The coefficient is inside the interval [latex][-1, 1][/latex] and assumes the value:
Nature of Rank Correlation

To illustrate the nature of rank correlation, and its difference from linear correlation, consider the following four pairs of numbers [latex](\text{x}, \text{y})[/latex]: [latex](0, 1) \\ (10, 100) \\ (101, 500) \\ (102, 2000)[/latex] As we go from each pair to the next pair, [latex]\text{x}[/latex] increases, and so does [latex]\text{y}[/latex]. This relationship is perfect, in the sense that an increase in [latex]\text{x}[/latex] is always accompanied by an increase in [latex]\text{y}[/latex]. This means that we have a perfect rank correlation, and both Spearman’s correlation coefficient and Kendall’s correlation coefficient are 1. In this example, the Pearson product-moment correlation coefficient is 0.7544, indicating that the points are far from lying on a straight line. In the same way, if [latex]\text{y}[/latex] always decreases when [latex]\text{x}[/latex] increases, the rank correlation coefficients will be [latex]-1[/latex], while the Pearson product-moment correlation coefficient may or may not be close to [latex]-1[/latex], depending on how close the points are to a straight line. While in the extreme case of perfect rank correlation the two coefficients are both equal (being both [latex]+1[/latex] or both [latex]-1[/latex]), this is not generally the case, and so values of the two coefficients cannot meaningfully be compared. For example, for the three pairs [latex](1, 1)[/latex], [latex](2, 3)[/latex], [latex](3, 2)[/latex], Spearman’s coefficient is [latex]\frac{1}{2}[/latex], while Kendall’s coefficient is [latex]\frac{1}{3}[/latex].
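The four pairs above can be checked numerically. Spearman’s coefficient is just Pearson’s formula applied to the ranks; the sketch below assumes no ties, and the helper names are our own:

```python
def ranks(values):
    """Rank 1 for the smallest value; this simple version assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def pearson(x, y):
    """Pearson r from covariance over the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) *
           sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

x = [0, 10, 101, 102]
y = [1, 100, 500, 2000]
spearman = pearson(ranks(x), ranks(y))  # 1.0: x and y always move together
linear = pearson(x, y)                  # about 0.7544: far from a straight line
print(spearman, round(linear, 4))
```

The rank version sees only the ordering, which is identical in both variables, while the linear version is pulled down by the uneven spacing of the points.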