Mathematical Explanation

Definition

Stochastic superiority quantifies the probability that a randomly selected observation from group A will have a higher ordinal outcome than a randomly selected observation from group B, with ties counted as half-wins. In some parts of the literature, similar forms of this statistic are called the “Common Language Effect Size”¹ or even the “probability of superiority”². This is also referred to as the “concordance probability” by Frank Harrell. For this reason, and for simplicity, I will refer to the estimate of stochastic superiority as “\(c\)”.

For two random variables \(Y_A\) and \(Y_B\) representing ordinal outcomes from groups A and B respectively, stochastic superiority is defined as:

\[c_{A,B} = P(Y_A > Y_B) + \frac{1}{2} P(Y_A = Y_B)\]

Implementation Context

Our implementation assumes we have predicted probabilities from an ordinal model (e.g., proportional odds model) that provides:

\(\pi_{A,k} = P(Y_A = k)\) for group A across ordinal categories \(k = 1, 2, \ldots, K\)
\(\pi_{B,k} = P(Y_B = k)\) for group B across ordinal categories \(k = 1, 2, \ldots, K\)

Where categories are ordered such that \(k = 1\) represents the lowest ordinal value and \(k = K\) represents the highest.

Mathematical Derivation

Probability Decomposition

Since we have discrete ordinal outcomes, we can decompose the stochastic superiority as:

\[c_{A,B} = \sum_{i=1}^{K} \sum_{j=1}^{K} \pi_{A,i} \pi_{B,j} \cdot w_{i,j}\]

where the weight function \(w_{i,j}\) is defined as:

\[w_{i,j} = \begin{cases} 1 & \text{if } i > j \text{ (A wins)} \\ 0.5 & \text{if } i = j \text{ (tie)} \\ 0 & \text{if } i < j \text{ (B wins)} \end{cases}\]

Matrix Representation

The stochastic_superiority_fast function implements this calculation using matrix operations:

Create probability matrices:
- \(\mathbf{A} = \text{matrix}(\boldsymbol{\pi}_A, n \times n)\) where each column contains \(\boldsymbol{\pi}_A\)
- \(\mathbf{B} = \text{matrix}(\boldsymbol{\pi}_B, n \times n)^T\) where each row contains \(\boldsymbol{\pi}_B\)
Joint probability matrix: \[\mathbf{P}_{i,j} = \mathbf{A}_{i,j} \times \mathbf{B}_{i,j} = \pi_{A,i} \times \pi_{B,j}\]
Weight matrix: \[\mathbf{W}_{i,j} = \begin{cases} 1 & \text{if } i > j \\ 0.5 & \text{if } i = j \\ 0 & \text{if } i < j \end{cases}\]
Final calculation: \[c_{A,B} = \sum_{i=1}^{K} \sum_{j=1}^{K} \mathbf{P}_{i,j} \times \mathbf{W}_{i,j}\]

Code Implementation Details

 stochastic_superiority_fast <- function(prob_a, prob_b) {
#' Fast vectorized calculation of stochastic superiority
#' 
#' param prob_a Numeric vector of probabilities for group A (in ordinal order)
#' param prob_b Numeric vector of probabilities for group B (in ordinal order)
#' return Numeric value between 0 and 1

# Input validation
if (length(prob_a) != length(prob_b)) {
  stop("prob_a and prob_b must have the same length")
}

n <- length(prob_a)

# Step 1: Create probability matrices
a_matrix <- matrix(rep(prob_a, n), nrow = n, byrow = FALSE) # pi_A in columns
b_matrix <- matrix(rep(prob_b, n), nrow = n, byrow = TRUE) # pi_B in rows

# Step 2: Joint probabilities
joint_probs <- a_matrix * b_matrix # Element-wise: pi_{A,i} * pi_{B,j}

# Step 3: Weight matrix using row/column indices
# Create superiority weights (1 if row > col, 0.5 if row = col, 0 if row < col)
weights <- ifelse(row(joint_probs) > col(joint_probs), 1, # i > j
                  ifelse(row(joint_probs) == col(joint_probs), 0.5, 0)) # i = j or i < j
# Step 4: Sum weighted joint probabilities
return(sum(joint_probs * weights))
}

Computational Complexity

Time complexity: \(O(K^2)\) where \(K\) is the number of ordinal categories
Space complexity: \(O(K^2)\) for storing the matrices
Vectorization: The matrix operations are vectorized, making this more efficient than nested loops for moderate values of \(K\)

Properties and Interpretation

Range and Interpretation

\(c_{A,B} \in [0, 1]\)
\(c_{A,B} = 0.5\) indicates no stochastic difference between groups
\(c_{A,B} > 0.5\) indicates group A tends to have higher ordinal outcomes
\(c_{A,B} < 0.5\) indicates group A tends to have lower ordinal outcomes
\(c_{A,B} = 1 - c_{B,A}\) (asymmetry property)

Relationship to Mann-Whitney U

For continuous variables, stochastic superiority equals the Mann-Whitney U statistic divided by \(n_A \times n_B\). Our discrete version maintains this probabilistic interpretation while accounting for ties through the half-weight approach.

Connection to Ordinal Models

When working with proportional odds models or other ordinal regression models, the predicted probabilities \(\pi_{A,k}\) and \(\pi_{B,k}\) naturally incorporate:

The ordinal structure of the outcome
Covariate effects through the linear predictor
Model uncertainty through the link function

This makes stochastic superiority a natural effect size measure for ordinal outcomes.

Assumptions

Ordinal structure: Categories have a meaningful order (\(1 < 2 < \cdots < K\))
Probability vectors: Input vectors represent valid probability distributions (non-negative, typically sum to 1)
Independent sampling: Observations from groups A and B are independently sampled
Equal category definitions: Both groups use the same ordinal scale

Applications

This measure is particularly useful for:

Comparing treatment effects in ordinal outcomes (e.g., Likert scales, severity ratings)
Evaluating model predictions across different subgroups
Providing an intuitive effect size measure for ordinal data
Clinical trials with ordinal endpoints (e.g., improvement scales)

Footnotes

McGraw, K. O., & Wong, S. P. (1992). A common language effect size statistic. Psychological Bulletin, 111(2), 361-365. https://doi.org/10.1037/0033-2909.111.2.361 ↩︎
Grissom, R. J. (1994). Probability of the superior outcome of one treatment over another. Journal of Applied Psychology, 79(2), 314-316. https://doi.org/10.1037/0021-9010.79.2.314 ↩︎