# Pearson and Spearman Correlations in R

## 25 Feb Pearson and Spearman Correlations in R

In a previous post we explained the concept of correlation between two variables, focusing on the Pearson and Spearman correlations. Here we work through a practical example in R.

The following script draws a scatter plot from two columns of the mtcars dataset, showing the relationship between weight (wt) and miles per gallon (mpg), with a fitted regression line in red.

```
> x <- mtcars$wt
> y <- mtcars$mpg

# scatter plot with a fitted regression line
> plot(x, y, main = "wt VS mpg",
+      xlab = "wt", ylab = "mpg",
+      pch = 18, frame = FALSE)
> abline(lm(y ~ x), col = "red")
```

### Assumptions

Pearson correlation is a parametric measure: the coefficient itself can always be computed, but its significance test assumes that x and y are approximately normally distributed. In our example, we can check this assumption for each variable using the Shapiro-Wilk normality test.

```
# Shapiro-Wilk normality test for wt
> shapiro.test(x)

	Shapiro-Wilk normality test

data:  x
W = 0.94326, p-value = 0.09265

# Shapiro-Wilk normality test for mpg
> shapiro.test(y)

	Shapiro-Wilk normality test

data:  y
W = 0.94756, p-value = 0.1229
```

In this test, the null hypothesis is that the data come from a normal distribution. When the p-value is below 0.05, we reject the null hypothesis in favor of the alternative. Here both p-values are above 0.05 (0.09265 for x and 0.1229 for y), so we fail to reject the null hypothesis: the test finds no significant departure from normality in either variable.
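The decision rule above can also be applied programmatically. The following is a minimal sketch, assuming a significance level of 0.05; the names `alpha`, `p_values` and `reject_normality` are illustrative and not part of the original script.

```r
# Re-create the two vectors so the snippet runs on its own
x <- mtcars$wt
y <- mtcars$mpg

# Assumed significance level (illustrative name)
alpha <- 0.05

# shapiro.test() returns an "htest" object; its p.value element holds the p-value
p_values <- c(wt  = shapiro.test(x)$p.value,
              mpg = shapiro.test(y)$p.value)

# TRUE would mean the null hypothesis of normality is rejected
reject_normality <- p_values < alpha

print(round(p_values, 5))
print(reject_normality)
```

Both entries of `reject_normality` should be `FALSE` for these two columns, which matches the p-values reported by the test.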

### Correlations

Calculating the Pearson and Spearman correlations with the following lines, we have:

```
# pearson
> cor(x, y, method = "pearson")
[1] -0.8676594

# spearman
> cor(x, y, method = "spearman")
[1] -0.886422
```

Both coefficients indicate a strong negative correlation between weight and mpg in the mtcars dataset: heavier cars tend to travel fewer miles per gallon. The Spearman coefficient is slightly larger in magnitude because it measures any monotonic relationship, not only a strictly linear one.
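One way to see why Spearman reflects monotonic rather than linear association is that it equals the Pearson correlation computed on the ranks of the data. The sketch below illustrates this; the variable names `spearman_builtin` and `spearman_by_ranks` are illustrative only.

```r
# Re-create the two vectors so the snippet runs on its own
x <- mtcars$wt
y <- mtcars$mpg

# Spearman correlation as computed directly by cor()
spearman_builtin <- cor(x, y, method = "spearman")

# Pearson correlation applied to the ranks (ties get average ranks by default)
spearman_by_ranks <- cor(rank(x), rank(y), method = "pearson")

# The two values should agree up to floating-point tolerance
print(all.equal(spearman_builtin, spearman_by_ranks))
```

Because ranks depend only on the ordering of the values, any monotonic transformation of x or y leaves the magnitude of the Spearman coefficient unchanged, whereas the Pearson coefficient generally changes.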
