Why not use linear regression for star ratings?

Zachary Levonian
3 min read · Nov 21, 2021

OLS can be misleading for ordinal variables

When we have ordered categorical (ordinal) data, such as star ratings, it is common to treat the data as continuous and model it with Ordinary Least Squares (OLS) linear regression. However, some statisticians recommend against this approach.

Here’s Alan Agresti in Analysis of Ordinal Categorical Data (2010, 2nd ed.):

We recommend against the simplistic approach of posing linear regression models for ordinal response scores and fitting them using OLS methods. Although that approach can be useful for identifying variables that clearly affect a response variable, and for simple descriptions, limitations occur. First, there is usually not a clear-cut choice for the scores. Second, a particular response outcome is likely to be consistent with a range of values for some underlying latent variable, and an ordinary regression analysis does not allow for the measurement error that results from replacing such a range by a single numerical value. Third, unlike the methods presented in this book, that approach does not yield estimated probabilities for the response categories at fixed settings of the explanatory variables. Fourth, that approach can yield predicted values above the highest category score or below the lowest. Fifth, that approach ignores the fact that the variability of the responses is naturally nonconstant for categorical data: For an ordinal response variable, there is little variability at predictor values for which observations fall mainly in the highest category (or mainly in the lowest category), but there is considerable variability at predictor values for which observations tend to be spread among the categories.

Related to the second, fourth, and fifth limitations, the ordinary regression approach does not account for “ceiling effects” and “floor effects,” which occur because of the upper and lower limits for the ordinal response variable. Such effects can cause ordinary regression modeling to give misleading results. These effects also result in substantial correlation between values of residuals and values of quantitative explanatory variables.

Emphasis added. Despite these five limitations, the other discussions I’ve looked at suggest that fitting OLS to ordinal outcomes is largely inoffensive in most modeling contexts. Further, there are real benefits to using linear regression, not the least of which is that OLS is widespread for such analyses.
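
To make Agresti’s fourth and fifth limitations concrete, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the latent-variable model, its coefficients, and the helper names `simulate_ratings`, `fit_ols`, and `variance` are hypothetical and do not come from any of the sources discussed here. It fits a one-predictor OLS line to simulated 1-5 star ratings, then checks the fitted values at the extremes of the predictor and compares the spread of ratings in the middle of the scale with the spread near the ceiling.

```python
import random


def simulate_ratings(n=601, seed=0):
    """Simulate 1-5 star ratings from a hypothetical latent-variable model."""
    rng = random.Random(seed)
    # Evenly spaced predictor values in [-3, 3].
    xs = [-3 + 6 * i / (n - 1) for i in range(n)]
    ratings = []
    for x in xs:
        # Unobserved continuous response (assumed linear in x, Gaussian noise).
        latent = 3 + 1.5 * x + rng.gauss(0, 1)
        # Observed rating: round, then clamp to the 1-5 star scale.
        ratings.append(min(5, max(1, round(latent))))
    return xs, ratings


def fit_ols(xs, ys):
    """Closed-form OLS for one predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


xs, ratings = simulate_ratings()
intercept, slope = fit_ols(xs, ratings)

# Fitted values at the smallest and largest predictor values.
pred_low = intercept + slope * min(xs)
pred_high = intercept + slope * max(xs)

# Spread of observed ratings mid-scale versus near the ceiling.
var_middle = variance([r for x, r in zip(xs, ratings) if abs(x) < 0.5])
var_top = variance([r for x, r in zip(xs, ratings) if x > 2.5])

print(f"fitted rating at x={min(xs):.0f}: {pred_low:.2f}")
print(f"fitted rating at x={max(xs):.0f}: {pred_high:.2f}")
print(f"rating variance mid-scale: {var_middle:.2f}, near ceiling: {var_top:.2f}")
```

In this simulation the straight line predicts ratings below 1 star and above 5 stars at the extremes, and the ratings vary far more mid-scale than near the ceiling, which violates the constant-variance assumption behind the usual OLS standard errors. Whether either effect matters in practice depends on how much of the real data sits near the ends of the scale.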

I’m interested in additional pointers and discussion on this topic; I’m not aware of any systematic comparison of OLS vs non-OLS modeling approaches for ordinal outcomes.
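
For readers who have not seen the non-OLS side of that comparison, one standard alternative is the cumulative logit (proportional odds) model, which Agresti’s book covers. The sketch below only evaluates such a model; it does not fit one. The coefficient `BETA`, the cutpoints in `THRESHOLDS`, and the function names are hypothetical values and names picked for illustration, not estimates from any data. It shows the property highlighted in Agresti’s third point: the model returns a probability for each star category at a given predictor value.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1 / (1 + math.exp(-z))


def category_probabilities(x, beta, thresholds):
    """P(Y = k | x) for k = 1..K under a cumulative logit model.

    The model assumes P(Y <= k | x) = logistic(threshold_k - beta * x),
    with K - 1 increasing thresholds for K ordered categories.
    """
    cumulative = [logistic(t - beta * x) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        # Each category's probability is a difference of adjacent
        # cumulative probabilities.
        probs.append(c - previous)
        previous = c
    return probs


# Hypothetical parameter values, chosen for illustration only.
BETA = 1.5
THRESHOLDS = [-3.0, -1.0, 1.0, 3.0]  # four cutpoints -> five star categories

results = {}
for x in (-3, 0, 3):
    probs = category_probabilities(x, BETA, THRESHOLDS)
    # Expected star rating implied by the category probabilities.
    expected = sum(k * p for k, p in enumerate(probs, start=1))
    results[x] = (probs, expected)
    formatted = ", ".join(f"{p:.2f}" for p in probs)
    print(f"x={x:+d}: P(1..5 stars) = [{formatted}], expected = {expected:.2f}")
```

Because the output is a probability distribution over the five categories, the implied expected rating always stays between 1 and 5, unlike an OLS fitted value.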

A useful pre-print summarizing the debate with a focus on Likert scales in psychology: “When can we treat Likert type data as interval?”

Uh oh: “analyzing ordinal data as if they were metric can systematically lead to errors.” Read more in “Analyzing ordinal data with metric models: What could possibly go wrong?”

A famous defense of OLS for Likert scales: “Likert scales, levels of measurement and the ‘laws’ of statistics”

Edit (Jan. 2022): Ordered Beta Regression may be a useful alternative to OLS for ordinal data. (Twitter thread link; preprint link)

Edit (Apr. 2022): For the HCI community, Judy Robertson writes on issues with treating Likert scale data as interval, especially when using non-regression statistical methods. She recommends the following flowchart from “How to Design and Report Experiments”:

Flowchart from p. 274 of “How to Design and Report Experiments” by Field and Hole.

Martin Schmettow responds to Robertson in the linked piece, arguing you should just use linear regression (and other regression methods).

Zachary Levonian

Currently: Machine Learning Engineer in industry. Computer Science PhD in HCI and social computing. More: levon003.github.io