Mathematical Explanation
Definition
Stochastic superiority quantifies the probability that a randomly selected observation from group A will have a higher ordinal outcome than a randomly selected observation from group B, with ties counted as half-wins. In some parts of the literature, similar forms of this statistic are called the “Common Language Effect Size”1 or even the “probability of superiority”2. This is also referred to as the “concordance probability” by Frank Harrell. For this reason, and for simplicity, I will refer to the estimate of stochastic superiority as “\(c\)”.
For two random variables \(Y_A\) and \(Y_B\) representing ordinal outcomes from groups A and B respectively, stochastic superiority is defined as:
\[c_{A,B} = P(Y_A > Y_B) + \frac{1}{2} P(Y_A = Y_B)\]
Implementation Context
Our implementation assumes we have predicted probabilities from an ordinal model (e.g., proportional odds model) that provides:
- \(\pi_{A,k} = P(Y_A = k)\) for group A across ordinal categories \(k = 1, 2, \ldots, K\)
- \(\pi_{B,k} = P(Y_B = k)\) for group B across ordinal categories \(k = 1, 2, \ldots, K\)
Where categories are ordered such that \(k = 1\) represents the lowest ordinal value and \(k = K\) represents the highest.
Mathematical Derivation
Probability Decomposition
Since we have discrete ordinal outcomes, we can decompose the stochastic superiority as:
\[c_{A,B} = \sum_{i=1}^{K} \sum_{j=1}^{K} \pi_{A,i} \pi_{B,j} \cdot w_{i,j}\]
where the weight function \(w_{i,j}\) is defined as:
\[w_{i,j} = \begin{cases} 1 & \text{if } i > j \text{ (A wins)} \\ 0.5 & \text{if } i = j \text{ (tie)} \\ 0 & \text{if } i < j \text{ (B wins)} \end{cases}\]
Matrix Representation
The stochastic_superiority_fast
function implements this calculation using matrix operations:
Create probability matrices:
- \(\mathbf{A} = \text{matrix}(\boldsymbol{\pi}_A, n \times n)\) where each column contains \(\boldsymbol{\pi}_A\)
- \(\mathbf{B} = \text{matrix}(\boldsymbol{\pi}_B, n \times n)^T\) where each row contains \(\boldsymbol{\pi}_B\)
Joint probability matrix: \[\mathbf{P}_{i,j} = \mathbf{A}_{i,j} \times \mathbf{B}_{i,j} = \pi_{A,i} \times \pi_{B,j}\]
Weight matrix: \[\mathbf{W}_{i,j} = \begin{cases} 1 & \text{if } i > j \\ 0.5 & \text{if } i = j \\ 0 & \text{if } i < j \end{cases}\]
Final calculation: \[c_{A,B} = \sum_{i=1}^{K} \sum_{j=1}^{K} \mathbf{P}_{i,j} \times \mathbf{W}_{i,j}\]
Code Implementation Details
<- function(prob_a, prob_b) {
stochastic_superiority_fast #' Fast vectorized calculation of stochastic superiority
#'
#' param prob_a Numeric vector of probabilities for group A (in ordinal order)
#' param prob_b Numeric vector of probabilities for group B (in ordinal order)
#' return Numeric value between 0 and 1
# Input validation
if (length(prob_a) != length(prob_b)) {
stop("prob_a and prob_b must have the same length")
}
<- length(prob_a)
n
# Step 1: Create probability matrices
<- matrix(rep(prob_a, n), nrow = n, byrow = FALSE) # pi_A in columns
a_matrix <- matrix(rep(prob_b, n), nrow = n, byrow = TRUE) # pi_B in rows
b_matrix
# Step 2: Joint probabilities
<- a_matrix * b_matrix # Element-wise: pi_{A,i} * pi_{B,j}
joint_probs
# Step 3: Weight matrix using row/column indices
# Create superiority weights (1 if row > col, 0.5 if row = col, 0 if row < col)
<- ifelse(row(joint_probs) > col(joint_probs), 1, # i > j
weights ifelse(row(joint_probs) == col(joint_probs), 0.5, 0)) # i = j or i < j
# Step 4: Sum weighted joint probabilities
return(sum(joint_probs * weights))
}
Computational Complexity
- Time complexity: \(O(K^2)\) where \(K\) is the number of ordinal categories
- Space complexity: \(O(K^2)\) for storing the matrices
- Vectorization: The matrix operations are vectorized, making this more efficient than nested loops for moderate values of \(K\)
Properties and Interpretation
Range and Interpretation
- \(c_{A,B} \in [0, 1]\)
- \(c_{A,B} = 0.5\) indicates no stochastic difference between groups
- \(c_{A,B} > 0.5\) indicates group A tends to have higher ordinal outcomes
- \(c_{A,B} < 0.5\) indicates group A tends to have lower ordinal outcomes
- \(c_{A,B} = 1 - c_{B,A}\) (asymmetry property)
Relationship to Mann-Whitney U
For continuous variables, stochastic superiority equals the Mann-Whitney U statistic divided by \(n_A \times n_B\). Our discrete version maintains this probabilistic interpretation while accounting for ties through the half-weight approach.
Connection to Ordinal Models
When working with proportional odds models or other ordinal regression models, the predicted probabilities \(\pi_{A,k}\) and \(\pi_{B,k}\) naturally incorporate:
- The ordinal structure of the outcome
- Covariate effects through the linear predictor
- Model uncertainty through the link function
This makes stochastic superiority a natural effect size measure for ordinal outcomes.
Assumptions
- Ordinal structure: Categories have a meaningful order (\(1 < 2 < \cdots < K\))
- Probability vectors: Input vectors represent valid probability distributions (non-negative, typically sum to 1)
- Independent sampling: Observations from groups A and B are independently sampled
- Equal category definitions: Both groups use the same ordinal scale
Applications
This measure is particularly useful for:
- Comparing treatment effects in ordinal outcomes (e.g., Likert scales, severity ratings)
- Evaluating model predictions across different subgroups
- Providing an intuitive effect size measure for ordinal data
- Clinical trials with ordinal endpoints (e.g., improvement scales)
Footnotes
McGraw, K. O., & Wong, S. P. (1992). A common language effect size statistic. Psychological Bulletin, 111(2), 361-365. https://doi.org/10.1037/0033-2909.111.2.361↩︎
Grissom, R. J. (1994). Probability of the superior outcome of one treatment over another. Journal of Applied Psychology, 79(2), 314-316. https://doi.org/10.1037/0021-9010.79.2.314↩︎