Rank Difference Transformation for Paired Data

Applies the Kornbrot (1990) rank difference transformation to paired data. All 2n observations are jointly ranked using midranks for ties, and the ranks corresponding to each condition are returned. The transformed data can then be passed to ses_calc() or boot_ses_calc() to obtain an effect size that is invariant under order-preserving (strictly increasing) transformations of the original scale.

Usage

rank_diff(x, y, names = c("x", "y"))

Arguments

x: numeric vector of observations from condition 1.
y: numeric vector of observations from condition 2, same length as x. Pairs are defined positionally: x[i] is paired with y[i].
names: optional character vector of length 2 giving column names for the returned data frame. Default is c("x", "y").

Value

A data frame with two columns (named by names) containing the joint ranks for condition 1 and condition 2, respectively. The number of rows equals the number of complete pairs. Missing-value pairs (where either x[i] or y[i] is NA) are removed before ranking, and a message is printed if any pairs are dropped.
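
The pairwise removal of missing values described above might look like the following in base R. This is only a sketch of the documented behaviour, not the package's source; the variable names and the message wording are assumptions.

```r
# Illustrative data with one missing value in each condition
x <- c(3, NA, 5, 2)
y <- c(4, 1, NA, 6)

# A pair is complete only if both of its members are observed
keep <- complete.cases(x, y)

# Report dropped pairs (the message text here is hypothetical)
if (any(!keep)) {
  message(sum(!keep), " pair(s) with missing values removed")
}

x_ok <- x[keep]
y_ok <- y[keep]
```

Here two of the four pairs are dropped, so any subsequent ranking would use the remaining two pairs (four observations).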

Details

The standard Wilcoxon signed-rank procedure for paired data computes differences \(d_i = x_i - y_i\), then ranks the absolute values \(|d_i|\). This is meaningful only when the observations are measured on at least an interval scale, so that it makes sense to say that one difference is "larger" than another.
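
To see why the scale matters, the following base R sketch computes a paired rank-biserial correlation from the signed ranks of the differences, before and after an order-preserving transformation. The helper `signed_rank_rb()` is hypothetical (it is not part of the package), and dropping zero differences is an assumption made for this illustration.

```r
# Hypothetical helper: paired rank-biserial correlation from signed ranks.
# Assumption: zero differences are dropped before ranking.
signed_rank_rb <- function(x, y) {
  d <- x - y
  d <- d[d != 0]
  r <- rank(abs(d))  # midranks for tied |d| (the default in rank())
  (sum(r[d > 0]) - sum(r[d < 0])) / sum(r)
}

x <- c(1, 4, 9, 100)
y <- c(4, 9, 1, 81)

rb_raw  <- signed_rank_rb(x, y)              # differences: -3, -5, 8, 19
rb_sqrt <- signed_rank_rb(sqrt(x), sqrt(y))  # differences: -1, -1, 2, 1
```

Although the square root preserves the order of every observation, the estimate changes from 0.4 to 0.2, because the ranking of the absolute differences is not preserved.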

For purely ordinal data, the differences may not be rankable. The Kornbrot (1990) rank difference procedure addresses this by:

1. Pooling all 2n observations from both conditions into a single vector.
2. Ranking the pooled vector using standard midranks for ties.
3. Returning the ranks corresponding to each condition.
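
The three steps above can be sketched in a few lines of base R. The helper name `joint_ranks()` is hypothetical and the sketch omits the missing-value handling and the names argument; it is meant to illustrate the idea rather than reproduce rank_diff() itself.

```r
# Hypothetical helper illustrating the joint-ranking steps
joint_ranks <- function(x, y) {
  stopifnot(length(x) == length(y))
  n <- length(x)
  pooled <- c(x, y)                             # pool all 2n observations
  r <- rank(pooled, ties.method = "average")    # midranks for ties
  data.frame(x = r[seq_len(n)],                 # ranks for condition 1
             y = r[n + seq_len(n)])             # ranks for condition 2
}

jr <- joint_ranks(c(1, 4, 9, 100), c(4, 9, 1, 81))
```

In this small example the values 1, 4, and 9 each occur twice in the pooled vector, so they receive the midranks 1.5, 3.5, and 5.5; the returned columns are x = (1.5, 3.5, 5.5, 8) and y = (3.5, 5.5, 1.5, 7).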

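The resulting rank differences \(R(x_i) - R(y_i)\) are then suitable for paired signed-rank effect size computation. Because the transformation uses only the ordinal information in the data, the effect size is invariant under any strictly increasing (order-preserving) transformation of the original scale. An order-reversing transformation, such as converting times to rates, reverses the sign of the effect size but leaves its magnitude unchanged.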
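
A quick base R check of this behaviour, using a hypothetical helper `rank_differences()` and made-up data: a strictly increasing transformation leaves the joint-rank differences unchanged, while a strictly decreasing one negates them.

```r
x <- c(1, 4, 9, 100)
y <- c(4, 9, 1, 81)

# Hypothetical helper: joint-rank differences R(x_i) - R(y_i)
rank_differences <- function(x, y) {
  r <- rank(c(x, y))  # midranks are the default tie method
  r[seq_along(x)] - r[length(x) + seq_along(y)]
}

d_raw  <- rank_differences(x, y)
d_sqrt <- rank_differences(sqrt(x), sqrt(y))  # strictly increasing
d_inv  <- rank_differences(1 / x, 1 / y)      # strictly decreasing
```

Here `d_raw` and `d_sqrt` are both (-2, -2, 4, 1), and `d_inv` is (2, 2, -4, -1), so any statistic built from the signed ranks of these differences keeps its magnitude and changes only in sign under the reciprocal.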

Usage with ses_calc

Pass the transformed columns directly to ses_calc(..., paired = TRUE):


  rd <- rank_diff(x, y)
  ses_calc(x = rd$x, y = rd$y, paired = TRUE, ses = "rb")

Because a nonzero mu has no meaningful interpretation on the joint-rank scale, always leave mu at 0 (the default) when analysing rank-difference data.

References

Kornbrot, D. E. (1990). The rank difference test: A new and meaningful alternative to the Wilcoxon signed ranks test for ordinal data. British Journal of Mathematical and Statistical Psychology, 43, 241-264.

Examples

# Kornbrot (1990) Tables 1-2: time vs rate give standard Wilcoxon
# results that differ in magnitude, but rank difference results that
# agree up to sign
time_plac <- c(4.6, 4.3, 6.7, 5.8, 5.0, 4.2, 6.0,
               2.0, 2.6, 10.0, 3.4, 7.1, 8.6)
time_drug <- c(2.9, 2.8, 12.0, 3.8, 5.9, 6.5, 3.3,
               2.3, 2.1, 14.3, 2.4, 14.0, 4.9)

# Standard approach: different results for time vs rate
ses_calc(time_plac, time_drug, paired = TRUE, ses = "rb")
#> 
#> 	Paired Sample Rank-Biserial Correlation estimate with CI
#> 
#> data:  time_plac and time_drug
#> 
#> alternative hypothesis: none
#> 95 percent confidence interval:
#>  -0.5639551  0.4843203
#> sample estimates:
#> P(X - Y>0) - P(X - Y<0) 
#>             -0.05494505 
#> 
ses_calc(60 / time_plac, 60 / time_drug, paired = TRUE, ses = "rb")
#> 
#> 	Paired Sample Rank-Biserial Correlation estimate with CI
#> 
#> data:  60/time_plac and 60/time_drug
#> 
#> alternative hypothesis: none
#> 95 percent confidence interval:
#>  -0.85655157  0.07612983
#> sample estimates:
#> P(X - Y>0) - P(X - Y<0) 
#>              -0.5384615 
#> 

# Rank difference approach: same magnitude for time and rate
# (the sign flips because 60 / time reverses the ordering)
rd_time <- rank_diff(time_plac, time_drug)
rd_rate <- rank_diff(60 / time_plac, 60 / time_drug)
ses_calc(rd_time$x, rd_time$y, paired = TRUE, ses = "rb")
#> 
#> 	Paired Sample Rank-Biserial Correlation estimate with CI
#> 
#> data:  rd_time$x and rd_time$y
#> 
#> alternative hypothesis: none
#> 95 percent confidence interval:
#>  -0.2350143  0.7613122
#> sample estimates:
#> P(X - Y>0) - P(X - Y<0) 
#>               0.3626374 
#> 
ses_calc(rd_rate$x, rd_rate$y, paired = TRUE, ses = "rb")
#> 
#> 	Paired Sample Rank-Biserial Correlation estimate with CI
#> 
#> data:  rd_rate$x and rd_rate$y
#> 
#> alternative hypothesis: none
#> 95 percent confidence interval:
#>  -0.7613122  0.2350143
#> sample estimates:
#> P(X - Y>0) - P(X - Y<0) 
#>              -0.3626374 
#>
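
The rank difference point estimates printed above can be cross-checked without the package. The sketch below assumes that the paired rank-biserial correlation is the difference between the positive and negative signed-rank sums divided by their total; the helpers `joint()` and `rb_paired()` are hypothetical and are not the package's implementation.

```r
time_plac <- c(4.6, 4.3, 6.7, 5.8, 5.0, 4.2, 6.0,
               2.0, 2.6, 10.0, 3.4, 7.1, 8.6)
time_drug <- c(2.9, 2.8, 12.0, 3.8, 5.9, 6.5, 3.3,
               2.3, 2.1, 14.3, 2.4, 14.0, 4.9)

# Hypothetical helper: joint midranks of both conditions
joint <- function(a, b) {
  r <- rank(c(a, b))
  list(x = r[seq_along(a)], y = r[length(a) + seq_along(b)])
}

# Hypothetical helper: rank-biserial correlation from signed ranks
# (assumption: zero differences are dropped)
rb_paired <- function(a, b) {
  d <- a - b
  d <- d[d != 0]
  r <- rank(abs(d))
  (sum(r[d > 0]) - sum(r[d < 0])) / sum(r)
}

jt <- joint(time_plac, time_drug)
jr <- joint(60 / time_plac, 60 / time_drug)

rb_time <- rb_paired(jt$x, jt$y)   # 33/91
rb_rate <- rb_paired(jr$x, jr$y)   # -33/91
```

The two values, 33/91 and -33/91, correspond to the estimates 0.3626374 and -0.3626374 shown in the output above.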