Stochastic Superiority from Ordinal Models

statistics, Rstats
Author

Aaron R. Caldwell

Published

September 17, 2025

Introduction

When analyzing ordinal data (like Likert scales, severity ratings, or satisfaction scores), traditional effect sizes often fall short of providing intuitive interpretations. Enter stochastic superiority: a probability-based measure that answers the simple question, “What’s the probability that a randomly selected observation from group A will have a higher outcome than a randomly selected observation from group B?”

In the previous post, I discussed how stochastic superiority could be calculated from predicted probabilities. In this post, we’ll explore how to calculate stochastic superiority for ordinal data using both approximate methods based on model coefficients and exact calculations using predicted probabilities. We’ll demonstrate multiple approaches using R packages: rms and ordinal to fit the models, and emmeans and marginaleffects to produce the effect size in terms of stochastic superiority.

What is Stochastic Superiority?

Stochastic superiority, also known as the probability of superiority or concordance probability, is defined mathematically as:

\[c_{A,B} = P(Y_A > Y_B) + \frac{1}{2} P(Y_A = Y_B)\]

This measure ranges from 0 to 1, where:

  • 0.5 indicates no difference between groups
  • Values > 0.5 suggest group A tends to have higher outcomes
  • Values < 0.5 suggest group A tends to have lower outcomes
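
As a quick worked example with hypothetical category probabilities (0.2, 0.3, 0.5) for group A and (0.5, 0.3, 0.2) for group B on a three-level scale,

\[c_{A,B} = \underbrace{0.3(0.5) + 0.5(0.5 + 0.3)}_{P(Y_A > Y_B) = 0.55} + \frac{1}{2}\underbrace{\left[0.2(0.5) + 0.3(0.3) + 0.5(0.2)\right]}_{P(Y_A = Y_B) = 0.29} \approx 0.70,\]

so a randomly drawn observation from group A beats one from group B about 70% of the time.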

Setup and Data

Let’s start by loading our packages and data. We’ll use the wine dataset from the ordinal package, which contains wine ratings on an ordinal scale.

library(rms)
library(ordinal)
library(emmeans)
library(marginaleffects)
library(tidyverse)

# Load the wine dataset
data(wine)

# Quick look at the data structure
str(wine)
'data.frame':   72 obs. of  6 variables:
 $ response: num  36 48 47 67 77 60 83 90 17 22 ...
 $ rating  : Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 2 3 3 4 4 4 5 5 1 2 ...
 $ temp    : Factor w/ 2 levels "cold","warm": 1 1 1 1 2 2 2 2 1 1 ...
 $ contact : Factor w/ 2 levels "no","yes": 1 1 2 2 1 1 2 2 1 1 ...
 $ bottle  : Factor w/ 8 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 1 2 ...
 $ judge   : Factor w/ 9 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 2 2 ...
table(wine$rating, wine$contact)
   
    no yes
  1  4   1
  2 14   8
  3 13  13
  4  3   9
  5  2   5
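
Before fitting any models, we can get a rough, model-free estimate of this probability straight from the raw ratings. This is just a quick sketch that ignores serving temperature and the repeated measurements by judge:

# Model-free estimate: compare every "contact" rating to every "no contact" rating
a <- as.numeric(wine$rating[wine$contact == "yes"])
b <- as.numeric(wine$rating[wine$contact == "no"])
mean(outer(a, b, ">") + 0.5 * outer(a, b, "=="))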

Custom Functions

First, let’s define our key functions for calculating stochastic superiority.

Fast Stochastic Superiority Calculation

#' Calculate stochastic superiority of group A over group B
#' 
#' @param prob_a Numeric vector of probabilities for group A (in ordinal order)
#' @param prob_b Numeric vector of probabilities for group B (in ordinal order)  
#' @return Numeric value between 0 and 1 representing P(A > B) + 0.5 * P(A = B)

stochastic_superiority_fast <- function(prob_a, prob_b) {
  # Input validation
  if (length(prob_a) != length(prob_b)) {
    stop("prob_a and prob_b must have the same length")
  }
  
  n <- length(prob_a)
  
  # Step 1: Create probability matrices
  a_matrix <- matrix(rep(prob_a, n), nrow = n, byrow = FALSE) # pi_A in columns
  b_matrix <- matrix(rep(prob_b, n), nrow = n, byrow = TRUE)  # pi_B in rows
  
  # Step 2: Joint probabilities
  joint_probs <- a_matrix * b_matrix # Element-wise: pi_{A,i} * pi_{B,j}
  
  # Step 3: Weight matrix using row/column indices
  weights <- ifelse(row(joint_probs) > col(joint_probs), 1,     # i > j
                   ifelse(row(joint_probs) == col(joint_probs), 0.5, 0))  # i = j or i < j
  
  # Step 4: Sum weighted joint probabilities
  return(sum(joint_probs * weights))
}
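
Before moving on, a couple of quick sanity checks on hypothetical probability vectors: two identical distributions should return exactly 0.5, and the worked example from earlier should return roughly 0.70.

stochastic_superiority_fast(c(0.2, 0.3, 0.5), c(0.2, 0.3, 0.5)) # identical groups -> 0.5
stochastic_superiority_fast(c(0.2, 0.3, 0.5), c(0.5, 0.3, 0.2)) # worked example -> ~0.695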

Approximate Stochastic Superiority from Coefficients

We can also approximate stochastic superiority directly from model coefficients using formulas derived by Agresti and others:

#' Calculate Ordinal Superiority Measure (gamma)
#' 
#' This function calculates the ordinal superiority measure gamma = P(y1 > y2; x)
#' based on a coefficient β from ordinal regression models with different link functions.
#' 
#' @param beta A numeric value or vector representing the group effect coefficient
#' @param link Character string specifying the link function
#' @return Numeric value or vector of ordinal superiority measures γ

ordinal_superiority <- function(beta, link = "probit") {
  # Input validation
  if (!is.numeric(beta)) {
    stop("beta must be numeric")
  }
  
  link <- tolower(link)
  valid_links <- c("probit", "logit", "loglog")
  
  if (!link %in% valid_links) {
    stop("link must be one of: ", paste(valid_links, collapse = ", "))
  }
  
  # Calculate gamma based on link function
  gamma <- switch(link,
                  "probit" = {
                    pnorm(beta / sqrt(2))
                  },
                  "logit" = {
                    exp_term <- exp(beta / sqrt(2))
                    exp_term / (1 + exp_term)
                  },
                  "loglog" = {
                    exp_term <- exp(beta)
                    exp_term / (1 + exp_term)
                  }
  )
  
  return(gamma)
}
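
A quick check of this helper: a coefficient of zero maps to 0.5 under every link, and, for instance, a hypothetical probit coefficient of 1 maps to pnorm(1 / sqrt(2)), roughly 0.76.

ordinal_superiority(0, link = "logit")  # no group effect -> 0.5
ordinal_superiority(1, link = "probit") # pnorm(1 / sqrt(2)) ~ 0.76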

Model Fitting

Let’s fit ordinal regression models using both rms and ordinal packages with different link functions.

I use both packages because the ability to calculate stochastic superiority, and how it has to be calculated, differs between them.

# Using rms package
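# robcov() replaces the covariance matrix with a cluster-robust (Huber) estimate, clustering on judge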
orm_logit <- robcov(orm(
  rating ~ contact + temp,
  data = wine,
  family = "logistic",
  x = TRUE,
  y = TRUE
), wine$judge)

orm_probit <- robcov(orm(
  rating ~ contact + temp,
  data = wine,
  family = "probit",
  x = TRUE,
  y = TRUE
), wine$judge)

orm_loglog <- robcov(orm(
  rating ~ contact + temp,
  data = wine,
  family = "loglog",
  x = TRUE, 
  y = TRUE
), wine$judge)

# Using ordinal package
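# (judge clustering is not part of these fits; it is handled later via vcov = ~judge in marginaleffects)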
clm_logit <- clm(
  rating ~ contact + temp,
  data = wine,
  link = "logit"
)

clm_probit <- clm(
  rating ~ contact + temp,
  data = wine,
  link = "probit"
)

clm_loglog <- clm(
  rating ~ contact + temp,
  data = wine,
  link = "loglog"
)
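
Before turning to emmeans, note that the coefficient-based approximation can be applied by hand: pull the group-effect coefficient for contact from a fitted model and pass it to ordinal_superiority(). Here is a minimal sketch using the CLM logit fit (clm names the coefficient contactyes); it should match the corresponding emmeans-based estimate below.

# Coefficient-based approximation applied by hand
beta_contact <- coef(clm_logit)[["contactyes"]]
ordinal_superiority(beta_contact, link = "logit")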

Method 1: Approximate Stochastic Superiority Using emmeans

The first approach uses emmeans to extract contrasts on the linear predictor scale, then applies the appropriate transformation. Note that emmeans takes the “marginal effect at the mean” approach, so for models whose predictions are subject to non-collapsibility, the average effect and the effect at the mean will differ.

# Function to calculate stochastic superiority with confidence intervals
calc_stoch_sup_emmeans <- function(model, link_type) {
  emmeans(model, ~ contact) |>
    contrast(method = "revpairwise") |>
    data.frame() |>
    mutate(
      stoch_sup = ordinal_superiority(estimate, link = link_type),
      lower.ci = ordinal_superiority(estimate - qnorm(.975) * SE, link = link_type),
      upper.ci = ordinal_superiority(estimate + qnorm(.975) * SE, link = link_type)
    ) |>
    select(contrast, stoch_sup, lower.ci, upper.ci)
}

# Calculate for all models
results_emmeans <- bind_rows(
  calc_stoch_sup_emmeans(orm_logit, "logit") |> mutate(model = "ORM Logit"),
  calc_stoch_sup_emmeans(orm_probit, "probit") |> mutate(model = "ORM Probit"),
  calc_stoch_sup_emmeans(orm_loglog, "loglog") |> mutate(model = "ORM Log-log"),
  calc_stoch_sup_emmeans(clm_logit, "logit") |> mutate(model = "CLM Logit"),
  calc_stoch_sup_emmeans(clm_probit, "probit") |> mutate(model = "CLM Probit"),
  calc_stoch_sup_emmeans(clm_loglog, "loglog") |> mutate(model = "CLM Log-log")
)

print(results_emmeans)
  contrast stoch_sup  lower.ci  upper.ci       model
1 yes - no 0.7465538 0.6254424 0.8386096   ORM Logit
2 yes - no 0.7302560 0.6179782 0.8230366  ORM Probit
3 yes - no 0.7026008 0.5636053 0.8120858 ORM Log-log
4 yes - no 0.7465538 0.6034265 0.8507974   CLM Logit
5 yes - no 0.7302560 0.5962606 0.8373180  CLM Probit
6 yes - no 0.7121081 0.5875933 0.8111138 CLM Log-log

Method 2: Exact Calculation Using marginaleffects

The second approach uses predicted probabilities to calculate stochastic superiority:

# Helper functions for marginaleffects
marg_superiority <- function(x) {
  marg_superiority_ind(x) |> 
    summarise(
      term = "contact (yes-no)",
      estimate = mean(estimate)
    )
}

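# Note: for these ordinal models, predictions() returns one predicted probability per
# rating category for each contact level within each counterfactual row (rowidcf),
# so the two estimate vectors passed below are full probability distributions
# over the five rating categories.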
marg_superiority_ind <- function(x) {
  x |> 
    arrange(rowidcf, contact) |>
    summarise(
      term = "contact (yes-no)",
      estimate = stochastic_superiority_fast(
        estimate[contact == "yes"],
        estimate[contact == "no"]
      ),
      .by = c(rowidcf)
    )
}

# Function to calculate and format results
calc_stoch_sup_exact <- function(model, model_name) {
  predictions(model, variables = "contact", vcov = ~judge,
              hypothesis = marg_superiority) %>%
    mutate(
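      # z statistic tests against the null of no stochastic superiority (0.5), not 0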
      statistic = (estimate - 0.5) / std.error,
      p.value = 2 * pnorm(abs(statistic), lower.tail = FALSE),
      s.value = -log2(p.value),
      model = model_name
    ) %>%
    select(model, estimate, std.error, conf.low, conf.high, statistic, p.value)
}
# Calculate exact stochastic superiority for CLM models
results_exact <- bind_rows(
  calc_stoch_sup_exact(clm_logit, "CLM Logit"),
  calc_stoch_sup_exact(clm_probit, "CLM Probit"),
  calc_stoch_sup_exact(clm_loglog, "CLM Log-log")
)

print(results_exact)

 Estimate Std. Error    z Pr(>|z|) 2.5 % 97.5 %
    0.708     0.0509 4.08   <0.001 0.608  0.808
    0.704     0.0501 4.06   <0.001 0.605  0.802
    0.686     0.0514 3.61   <0.001 0.585  0.786

Method 3: Using Linear Predictors with marginaleffects

We can also use marginaleffects with linear predictors and transform them:

# Function for linear predictor approach
calc_stoch_sup_lp <- function(model, link_type, model_name) {
  avg_predictions(model, variables = "contact", type = "lp",
                  hypothesis = difference ~ pairwise,
                  transform = function(x) {
                    ordinal_superiority(beta = x, link = link_type)
                  }) %>%
    mutate(model = model_name) %>%
    select(model, estimate, conf.low, conf.high)
}

# Calculate for ORM models using linear predictors
results_lp <- bind_rows(
  calc_stoch_sup_lp(orm_logit, "logit", "ORM Logit"),
  calc_stoch_sup_lp(orm_probit, "probit", "ORM Probit"),
  calc_stoch_sup_lp(orm_loglog, "loglog", "ORM Log-log")
)

print(results_lp)

 Estimate 2.5 % 97.5 %
    0.747 0.625  0.839
    0.730 0.618  0.823
    0.703 0.564  0.812

Comparing Results

Let’s create a comprehensive comparison of all methods:

# Combine results for comparison
comparison_results <- bind_rows(
  # emmeans results
  results_emmeans |>
    select(model, stoch_sup) |>
    rename(estimate = stoch_sup) |>
    mutate(method = "emmeans"),
  
  # Exact results  
  results_exact |>
    data.frame() |>
    select(model, estimate) |>
    mutate(method = "exact"),
  
  # Linear predictor results
  results_lp |>
    data.frame() |>
    select(model, estimate) |>
    mutate(method = "linear_predictor")
) |>
  pivot_wider(names_from = method, values_from = estimate) |>
  mutate(across(where(is.numeric), \(x) round(x, 4)))

print(comparison_results)
# A tibble: 6 × 4
  model       emmeans  exact linear_predictor
  <chr>         <dbl>  <dbl>            <dbl>
1 ORM Logit     0.747 NA                0.747
2 ORM Probit    0.730 NA                0.730
3 ORM Log-log   0.703 NA                0.703
4 CLM Logit     0.747  0.708           NA    
5 CLM Probit    0.730  0.704           NA    
6 CLM Log-log   0.712  0.686           NA    

Visualization

Let’s visualize the results to better understand the differences:

# Prepare data for plotting
plot_data <- bind_rows(
  results_exact |> mutate(method = "Exact (Probabilities)"),
  results_lp |> mutate(method = "Approximate (Linear Predictor)")
) |>
  separate(model, into = c("package", "link"), sep = " ") |>
  mutate(
    link = factor(link, levels = c("Logit", "Probit", "Log-log")),
    method = factor(method)
  )

ggplot(plot_data, aes(x = link, y = estimate, color = method)) +
  geom_point(size = 3, position = position_dodge(width = 0.3)) +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high), 
                width = 0.2, position = position_dodge(width = 0.3)) +
  geom_hline(yintercept = 0.5, linetype = "dashed", alpha = 0.7) +
  scale_color_brewer(type = "qual", palette = "Set2") +
  labs(
    title = "Stochastic Superiority: Contact Effect Across Link Functions",
    subtitle = "Wine rating data: Probability that 'yes' contact > 'no' contact",
    x = "Link Function",
    y = "Stochastic Superiority",
    color = "Method",
    caption = "Dashed line represents no difference (0.5)"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 14, face = "bold"),
    legend.position = "bottom"
  )

Comparison of Stochastic Superiority Estimates Across Methods and Link Functions

Key Insights

Based on our analysis, here are the main takeaways:

1. Method Comparison

  • Exact calculations using predicted probabilities work when using marginaleffects with the ordinal package
  • Approximate methods using coefficient transformations are computationally faster but may introduce small biases
  • Both approaches generally yield very similar results for balanced designs

2. Practical Interpretation

For our wine data, the stochastic superiority of approximately 0.7 means there’s about a 70% probability that a randomly selected wine with “contact” will receive a higher rating than a randomly selected wine without contact, while controlling for serving temperature.

When to Use Each Method?

In the current software environment, your options may be limited by the model you choose to fit. My personal preference is the orm function to fit the model and the emmeans package to provide the estimates of interest. However, you might prefer the ordinal package for a variety of reasons given the features it offers. What matters is that you understand what each package is capable of and how marginaleffects or emmeans is able to handle the model. Of significant note is that mixed models become much trickier. To my knowledge, only the ordinal package supports these, through the clmm and clmm2 functions. As of writing this, the marginaleffects package has extremely limited support for either function. The emmeans package has fairly decent support for at least the clmm function. But, then again, you might not want the “marginal effect at the mean” approach that emmeans provides.
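
For illustration, here is a minimal sketch of that mixed-model route: a cumulative link mixed model with a random intercept for judge, combined with the coefficient-based approximation via emmeans and the ordinal_superiority() helper defined earlier. It is not run above, so treat it as a starting point rather than a complete workflow.

# Mixed-effects ordinal model with a random judge intercept (sketch)
clmm_logit <- clmm(rating ~ contact + temp + (1 | judge), data = wine)

emmeans(clmm_logit, ~ contact) |>
  contrast(method = "revpairwise") |>
  data.frame() |>
  mutate(stoch_sup = ordinal_superiority(estimate, link = "logit"))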

Conclusion

Stochastic superiority provides an intuitive, probability-based effect size for ordinal data that’s easy to interpret and communicate. The marginaleffects package offers powerful tools for exact calculation using predicted probabilities, while coefficient-based approximations provide quick alternatives.

The choice between methods depends on your specific needs for accuracy, computational efficiency, and the complexity of your study design. Regardless of the method chosen, stochastic superiority offers a more meaningful interpretation than traditional approaches for ordinal outcomes.

Session Information

sessionInfo()
R version 4.5.1 (2025-06-13 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 22631)

Matrix products: default
  LAPACK version 3.12.1

locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: America/Chicago
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] lubridate_1.9.4          forcats_1.0.0            stringr_1.5.1           
 [4] dplyr_1.1.4              purrr_1.1.0              readr_2.1.5             
 [7] tidyr_1.3.1              tibble_3.3.0             ggplot2_3.5.2.9002      
[10] tidyverse_2.0.0          marginaleffects_0.30.0.1 emmeans_1.11.2          
[13] ordinal_2023.12-4.1      rms_8.0-0                Hmisc_5.2-3             

loaded via a namespace (and not attached):
 [1] tidyselect_1.2.1    farver_2.1.2        S7_0.2.0           
 [4] fastmap_1.2.0       TH.data_1.1-3       digest_0.6.37      
 [7] rpart_4.1.24        estimability_1.5.1  timechange_0.3.0   
[10] lifecycle_1.0.4     cluster_2.1.8.1     survival_3.8-3     
[13] magrittr_2.0.3      compiler_4.5.1      rlang_1.1.6        
[16] tools_4.5.1         utf8_1.2.6          yaml_2.3.10        
[19] data.table_1.17.8   knitr_1.50          labeling_0.4.3     
[22] htmlwidgets_1.6.4   RColorBrewer_1.1-3  multcomp_1.4-28    
[25] polspline_1.1.25    foreign_0.8-90      withr_3.0.2        
[28] numDeriv_2016.8-1.1 nnet_7.3-20         grid_4.5.1         
[31] xtable_1.8-4        colorspace_2.1-1    scales_1.4.0       
[34] MASS_7.3-65         insight_1.4.2       cli_3.6.5          
[37] mvtnorm_1.3-3       rmarkdown_2.29      generics_0.1.4     
[40] rstudioapi_0.17.1   tzdb_0.5.0          splines_4.5.1      
[43] base64enc_0.1-3     vctrs_0.6.5         Matrix_1.7-3       
[46] sandwich_3.1-1      jsonlite_2.0.0      SparseM_1.84-2     
[49] hms_1.1.3           Formula_1.2-5       htmlTable_2.4.3    
[52] glue_1.8.0          codetools_0.2-20    stringi_1.8.7      
[55] gtable_0.3.6        pillar_1.11.0       htmltools_0.5.8.1  
[58] quantreg_6.1        R6_2.6.1            ucminf_1.2.2       
[61] evaluate_1.0.4      lattice_0.22-7      backports_1.5.0    
[64] MatrixModels_0.5-4  coda_0.19-4.1       gridExtra_2.3      
[67] nlme_3.1-168        checkmate_2.3.3     xfun_0.52          
[70] zoo_1.8-14          pkgconfig_2.0.3