What is the best way to represent ordinal data?

There are not a lot of statistical methods designed just for ordinal variables.

But that doesn't mean that you're stuck with few options. There are more than you'd think.

Some are better than others, but it depends on the situation and research questions.

Here are five options when your dependent variable is ordinal.

1. Treat ordinal variables as nominal

Ordinal variables are fundamentally categorical. One simple option is to ignore the order in the variable's categories and treat it as nominal. There are many options for analyzing categorical variables that have no order.

This can make a lot of sense for some variables, for example when there are few categories and the order isn't central to the research question.

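The biggest advantage to this approach is that you won't violate any assumptions about the ordering or spacing of the categories. The cost is that you throw away whatever information the order carries.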
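
One of those order-free options is a chi-square test of independence on a contingency table. The sketch below is only an illustration: the counts, the group layout, and the `chi_square` helper are all made up for this example. It shows that the statistic does not change when the outcome categories are shuffled, which is what ignoring the order amounts to.

```python
def chi_square(table):
    """Pearson chi-square statistic for a table of counts (rows = groups)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: two groups by an outcome coded low, medium, high.
ordered = [[20, 15, 5],
           [10, 15, 20]]

# Same counts with the outcome columns reordered as high, low, medium.
shuffled = [[5, 20, 15],
            [20, 10, 15]]

stat_ordered = chi_square(ordered)
stat_shuffled = chi_square(shuffled)
print(stat_ordered, stat_shuffled)  # the two values agree
```

Because the statistic is blind to the order, it cannot distinguish a steady low-to-high shift from any other pattern of differences between the groups.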

2. Treat ordinal variables as numeric

Because the ordering of the categories often is central to the research question, many data analysts do the opposite: ignore the fact that the ordinal variable really isn't numerical and treat the numerals that designate each category as actual numbers.

This approach requires the assumption that the distance between each pair of adjacent categories is equal, and that can be very difficult to justify.

So think long and hard about whether you're able to justify this assumption.
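
To see why the equal-spacing assumption matters, here is a small sketch with invented responses on a 1-to-5 scale. The two scorings and both groups are assumptions made up for illustration. Each scoring respects the order of the categories, yet they disagree about which group has the higher mean.

```python
from statistics import mean

# Hypothetical responses on a five-category ordinal scale.
group_a = [2, 2, 5]
group_b = [3, 3, 4]

# Two order-preserving ways of turning categories into numbers.
equal_spacing = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
wide_top_gap = {1: 1, 2: 2, 3: 3, 4: 4, 5: 10}  # top category is far from the rest

def mean_score(responses, scoring):
    """Mean of the numeric scores assigned to each response."""
    return mean(scoring[r] for r in responses)

for name, scoring in [("equal spacing", equal_spacing),
                      ("wide top gap", wide_top_gap)]:
    print(name, mean_score(group_a, scoring), mean_score(group_b, scoring))
```

Under equal spacing group B has the higher mean, while under the second scoring group A does. If conclusions can change with the spacing, the spacing needs a justification.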

3. Non-parametric tests

Some good news: there are other options.

Many non-parametric descriptive statistics are based on ranking numerical values. Ranks are themselves ordinal: they tell you about the order, but nothing about the distance between values.

Just like other ordinal variables.

So while we think of these tests as useful for numerical data that are non-normal or have outliers, they work for ordinal variables as well, especially when there are more than just a few ordered categories.

Common rank-based non-parametric tests include Kruskal-Wallis, Spearman correlation, Wilcoxon-Mann-Whitney, and Friedman.

Each test has a specific test statistic based on those ranks, depending on whether the test is comparing groups or measuring an association.
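
As a concrete case, the Spearman correlation is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below uses only the standard library. The data and the helper names (`midranks`, `pearson`, `spearman`) are invented for illustration, and in practice you would likely reach for a statistics package instead.

```python
import math

def midranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    return [
        sum(x < v for x in values) + (sum(x == v for x in values) + 1) / 2
        for v in values
    ]

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the mid-ranks."""
    return pearson(midranks(x), midranks(y))

# Hypothetical ordinal data: satisfaction (1-5) and education level (1-4).
satisfaction = [1, 2, 2, 3, 4, 4, 5, 5]
education = [1, 1, 2, 2, 3, 3, 3, 4]

# An order-preserving recoding of satisfaction with unequal spacing.
stretched = {1: 1, 2: 2, 3: 3, 4: 4, 5: 10}

rho = spearman(satisfaction, education)
rho_stretched = spearman([stretched[s] for s in satisfaction], education)
print(rho, rho_stretched)  # identical: only the order matters
```

Recoding the categories with different spacing leaves the result unchanged, because the ranks depend only on the order.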

The limitation of these tests, though, is that they're pretty basic. Sure, you can compare groups one-way ANOVA style or measure a correlation, but you can't go beyond that. You can't, for example, include interactions between two independent variables or include covariates.

You need a real model to do that.

4. Ordinal logistic & probit regression

There aren't many methods that are set up just for ordinal variables, but there are a few. One of the most commonly used is the ordinal logistic (or probit) regression model.

There are a few different ways of specifying the logit link function so that it preserves the ordering in the dependent variable. The most commonly available in software is the cumulative link function, which lets you measure the effect of predictors on the odds of falling in a higher category rather than a lower one, at each cut point between adjacent categories.
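
To make the cumulative logit idea concrete, here is a minimal sketch of one common parameterization, logit P(Y <= j) = theta_j - beta * x. The cut points and the coefficient are made-up numbers, not estimates from any data, and the function names are hypothetical. Sign conventions also differ between software packages, so check the documentation of whichever one you use.

```python
import math

def logistic(z):
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-z))

def category_probs(x, thresholds, beta):
    """P(Y = j) for each category, from logit P(Y <= j) = theta_j - beta * x."""
    cumulative = [logistic(t - beta * x) for t in thresholds] + [1.0]
    probs = [cumulative[0]]
    probs += [cumulative[j] - cumulative[j - 1] for j in range(1, len(cumulative))]
    return probs

def odds_above(x, threshold, beta):
    """Odds of falling above a cut point rather than at or below it."""
    p_above = 1.0 - logistic(threshold - beta * x)
    return p_above / (1.0 - p_above)

thresholds = [-1.0, 0.5, 2.0]  # hypothetical cut points for four categories
beta = 0.8                     # hypothetical effect of a one-unit change in x

for x in (0, 1):
    print(x, [round(p, 3) for p in category_probs(x, thresholds, beta)])

# Odds ratio for a one-unit increase in x, at each cut point.
ratios = [odds_above(1, t, beta) / odds_above(0, t, beta) for t in thresholds]
print(ratios, math.exp(beta))
```

The odds ratio comes out the same at every cut point and equals exp(beta). That single shared effect is what the proportional odds assumption refers to.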

These models are complex, have their own assumptions, and can take some practice to interpret. But they are also sometimes exactly what you need.

They are a very good tool to have in your statistical toolbox.

5. Rank transformations

Another model-based approach combines the advantages of ordinal logistic regression and the simplicity of rank-based non-parametrics.

The basic idea is a rank transformation: transform each ordinal outcome score into the rank of that score and run your regression, two-way ANOVA, or other model on those ranks.

The thing to remember, though, is that all results need to be interpreted in terms of the ranks. Just as a log transformation on a dependent variable puts all the means and coefficients on a log(DV) scale, the rank transformation puts everything on a rank scale. Your interpretations are going to be about mean ranks, not means.
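
Here is a minimal sketch of the transformation step, using invented scores for two hypothetical groups. The outcome is ranked across the whole sample, with ties sharing their average rank, and the group summaries are then computed on those ranks. The `midranks` helper is written out by hand for illustration.

```python
from statistics import mean

def midranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    return [
        sum(x < v for x in values) + (sum(x == v for x in values) + 1) / 2
        for v in values
    ]

# Hypothetical ordinal outcome (1-5) for two groups of four.
scores = [1, 2, 2, 3, 3, 4, 5, 5]
group = ["control"] * 4 + ["treated"] * 4

# Rank over the pooled sample, then summarize the ranks by group.
ranks = midranks(scores)
mean_rank = {
    g: mean(r for r, label in zip(ranks, group) if label == g)
    for g in sorted(set(group))
}
print(ranks)
print(mean_rank)
```

The difference between the two mean ranks is what the coefficient on a group indicator would estimate in a regression fit to the ranks, so it is read as a difference in mean rank rather than in mean score.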
