`correlations.Rmd`

TOSTER includes several functions for testing correlations. All
of the included functions are based on work by Goertzen and Cribbie (2010)
(`z_cor_test` and `compare_cor`) and Wilcox (2011) (`boot_cor_test`)^{1}.

Simple tests of association can be accomplished with the
`z_cor_test` function. This function was styled after the
`cor.test` function, but you will notice that the results may
differ. This is because `z_cor_test` uses
Fisher’s z transformation as the basis for all significance tests (i.e.,
p-values). However, notice that the confidence intervals are the
same.
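
The `cor.test` output that follows is shown without the call that produced it. Judging from the Pearson header, the variable names, and the two-sided alternative in that output, it most likely came from a default call such as the one sketched here (the object name `res` is only for illustration):

```r
# Assumed call: default (Pearson, two-sided) test on the same two variables
res <- cor.test(mtcars$mpg, mtcars$qsec)
res
```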

```
##
## Pearson's product-moment correlation
##
## data: mtcars$mpg and mtcars$qsec
## t = 2.5252, df = 30, p-value = 0.01708
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.08195487 0.66961864
## sample estimates:
## cor
## 0.418684
```

```
z_cor_test(mtcars$mpg,
           mtcars$qsec)
```

```
##
## Pearson's product-moment correlation
##
## data: mtcars$mpg and mtcars$qsec
## z = 2.4023, N = 32, p-value = 0.01629
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.08195487 0.66961864
## sample estimates:
## cor
## 0.418684
```

Just as with `cor.test`, the Spearman and Kendall
correlation coefficients can also be estimated.

```
z_cor_test(mtcars$mpg,
           mtcars$qsec,
           method = "spear") # Don't need to spell full name
```

```
##
## Spearman's rank correlation rho
##
## data: mtcars$mpg and mtcars$qsec
## z = 2.6474, N = 32, p-value = 0.008111
## alternative hypothesis: true rho is not equal to 0
## 95 percent confidence interval:
## 0.1306771 0.7068501
## sample estimates:
## rho
## 0.4669358
```

```
z_cor_test(mtcars$mpg,
           mtcars$qsec,
           method = "kendall")
```

```
##
## Kendall's rank correlation tau
##
## data: mtcars$mpg and mtcars$qsec
## z = 2.6134, N = 32, p-value = 0.008964
## alternative hypothesis: true tau is not equal to 0
## 95 percent confidence interval:
## 0.08145572 0.51634821
## sample estimates:
## tau
## 0.3153652
```

The main advantage of `z_cor_test` is that it can perform
equivalence testing (TOST), or any hypothesis test where the null isn’t
zero.

```
z_cor_test(mtcars$mpg,
           mtcars$qsec,
           alternative = "e", # e for equivalence
           null = .4)
```

```
##
## Pearson's product-moment correlation
##
## data: mtcars$mpg and mtcars$qsec
## z = 0.12088, N = 32, p-value = 0.5481
## alternative hypothesis: equivalence
## null values:
## correlation correlation
## 0.4 -0.4
## 90 percent confidence interval:
## 0.1397334 0.6360650
## sample estimates:
## cor
## 0.418684
```

If you only have the summary statistics, you can perform the same tests with the `corsum_test` function. Imagine you are reviewing a study with an observed correlation of 0.121 and a sample size of 105 paired observations. You could then perform an equivalence test with the following code.

```
corsum_test(r = .121,
            n = 105,
            alternative = "e",
            null = .4)
```

```
##
## Pearson's product-moment correlation
##
## data: x and y
## z = -3.0506, N = 105, p-value = 0.001142
## alternative hypothesis: equivalence
## null values:
## correlation correlation
## 0.4 -0.4
## 90 percent confidence interval:
## -0.0412456 0.2770284
## sample estimates:
## cor
## 0.121
```

If the raw data are available, I would *strongly* recommend
using the bootstrapping function, which should be more robust than the
Fisher’s z based function. Further, the `boot_cor_test`
function can also estimate two other correlations: a
Winsorized correlation and the percentage bend correlation. The input
for the function is fairly similar to that of the `z_cor_test`
function.

```
set.seed(993)
boot_cor_test(mtcars$mpg,
              mtcars$qsec,
              alternative = "e",
              null = .4)
```

```
##
## Bootstrapped Pearson's product-moment correlation
##
## data: mtcars$mpg and mtcars$qsec
## N = 32, p-value = 0.6088
## alternative hypothesis: equivalence
## null values:
## correlation correlation
## 0.4 -0.4
## 90 percent confidence interval:
## 0.2445273 0.5848411
## sample estimates:
## cor
## 0.418684
```

```
boot_cor_test(mtcars$mpg,
              mtcars$qsec,
              method = "spear",
              alternative = "e",
              null = .4)
```

```
##
## Bootstrapped Spearman's rank correlation rho
##
## data: mtcars$mpg and mtcars$qsec
## N = 32, p-value = 0.6713
## alternative hypothesis: equivalence
## null values:
## rho rho
## 0.4 -0.4
## 90 percent confidence interval:
## 0.1983190 0.6656253
## sample estimates:
## rho
## 0.4669358
```

```
boot_cor_test(mtcars$mpg,
              mtcars$qsec,
              method = "ken",
              alternative = "e",
              null = .4)
```

```
##
## Bootstrapped Kendall's rank correlation tau
##
## data: mtcars$mpg and mtcars$qsec
## N = 32, p-value = 0.2276
## alternative hypothesis: equivalence
## null values:
## tau tau
## 0.4 -0.4
## 90 percent confidence interval:
## 0.1217169 0.4864510
## sample estimates:
## tau
## 0.3153652
```

Robust correlations, such as a Winsorized correlation coefficient or the percentage bend correlation, can also be tested.

```
boot_cor_test(mtcars$mpg,
              mtcars$qsec,
              method = "win",
              alternative = "e",
              null = .4,
              tr = .1) # set trim
```

```
##
## Bootstrapped Winsorized correlation wincor
##
## data: mtcars$mpg and mtcars$qsec
## N = 32, p-value = 0.6878
## alternative hypothesis: equivalence
## null values:
## wincor wincor
## 0.4 -0.4
## 90 percent confidence interval:
## 0.2163284 0.6629980
## sample estimates:
## wincor
## 0.464062
```

```
boot_cor_test(mtcars$mpg,
              mtcars$qsec,
              method = "bend",
              alternative = "e",
              null = .4,
              beta = .15) # bend argument
```

```
##
## Bootstrapped percentage bend correlation pb
##
## data: mtcars$mpg and mtcars$qsec
## N = 32, p-value = 0.6933
## alternative hypothesis: equivalence
## null values:
## pb pb
## 0.4 -0.4
## 90 percent confidence interval:
## 0.2341348 0.6455867
## sample estimates:
## pb
## 0.4484488
```

In some cases, researchers may want to compare two independent correlations. For example, the same correlation may be compared across two groups (e.g., the correlation between two variables in male and female subjects) or across two independent studies (e.g., an original study and its replication).

When only summary statistics are available, the
`compare_cor` function can be used. All the user needs is the
correlations (r1 and r2) and the degrees of freedom for each
correlation. The degrees of freedom for most cases would be the number of
pairs minus 2 (\(df = N-2\)).
*Note*: this function, similar to `z_cor_test`, is an
approximation.

```
compare_cor(r1 = .8,
            df1 = 38,
            r2 = .2,
            df2 = 98)
```

```
##
## Difference between two independent correlations (Fisher's z transform)
##
## data: Summary Statistics
## z = 4.6364, p-value = 3.545e-06
## alternative hypothesis: true difference between correlations is not equal to 0
## sample estimates:
## difference between correlations
## 0.6
```

The available methods for comparing correlations are Fisher’s z transformation (“fisher”) and Kraatz’s method (“kraatz”). Both are appropriate for general significance tests, but may have low statistical power (Counsell and Cribbie 2015). Fisher’s method tests the difference between correlations on the z-transformed scale, while Kraatz’s method directly measures the difference between the correlation coefficients. My personal recommendation is Fisher’s method.

```
compare_cor(r1 = .8,
            df1 = 38,
            r2 = .2,
            df2 = 98,
            null = .2,
            method = "f", # Fisher
            alternative = "e") # Equivalence
```

```
##
## Difference between two independent correlations (Fisher's z transform)
##
## data: Summary Statistics
## z = 0.69315, p-value = 0.9998
## alternative hypothesis: equivalence
## null values:
## difference between correlations difference between correlations
## 0.2 -0.2
## sample estimates:
## difference between correlations
## 0.6
```

When the raw data are available for both correlations, the
`boot_compare_cor` function can be used.

```
set.seed(8922)
x1 = rnorm(40)
y1 = rnorm(40)
x2 = rnorm(100)
y2 = rnorm(100)
boot_compare_cor(
  x1 = x1,
  x2 = x2,
  y1 = y1,
  y2 = y2,
  null = .2,
  alternative = "e", # Equivalence
  method = "win" # Winsorized correlation
)
```

```
##
## Bootstrapped difference in Winsorized correlation wincor
##
## data: x1 and y1 vs. x2 and y2
## n1 = 40, n2 = 100, p-value = 0.7739
## alternative hypothesis: true differnce in wincor is 0.2
## 90 percent confidence interval:
## -0.2970547 0.3978333
## sample estimates:
## wincor
## 0.06383164
```

Counsell, Alyssa, and Robert A Cribbie. 2015. “Equivalence Tests
for Comparing Correlation and Regression Coefficients.”
*British Journal of Mathematical and Statistical Psychology* 68
(2): 292–309. https://doi.org/10.1111/bmsp.12045.

Goertzen, Jason R, and Robert A Cribbie. 2010. “Detecting a Lack
of Association: An Equivalence Testing Approach.” *British
Journal of Mathematical and Statistical Psychology* 63 (3): 527–37.
https://doi.org/10.1348/000711009X475853.

Wilcox, Rand R. 2011. *Introduction to Robust Estimation and
Hypothesis Testing*. Academic Press.

1. Bootstrapped functions were based on code posted by Rand Wilcox on his website, and were modified after looking at Guillaume Rousselet’s `bootcorci` R package on GitHub (https://github.com/GRousselet).