Welcome to this week’s post, which is the first installment of Statistical Analysis Question Monday. Not only does this post mark the first in this series, but it also marks a return to some regular posting. I’ve taken a few weeks between posts for various reasons, but I really want to get back to work. I figure that if I give myself more structure, like a post that incorporates a day of the week in its name, then I’ll be more successful with posting. Hopefully. We’ll see.
So, here we go.
Let’s start with something basic, like appropriately categorizing data as nominal, ordinal, interval, or ratio. First off, I like to use N.O.I.R. (French for black; fitting, right?) as a mnemonic to help me remember not only the nomenclature, but also the ‘order’ of descriptive power that each type of data carries with it.
The first mistake I see people make in their databases is labeling one type of data as another. This is probably less a mistake than an error of omission, and it’s easy to spot in an application like SPSS: switch to Variable View, and every variable’s measure is set to “Nominal”. That’s pretty much a dead giveaway that the person who built the file (a) didn’t know to look for the category, (b) didn’t know that the naming system they learned in bivariate stats was really important, or (c) fell victim to some other confluence of circumstances. Here’s why these things are important.
Nominal data (nom = name) carry with them no numeric value whatsoever. You see nominal data whenever you watch almost any sport where players have numbers on the backs of their jerseys. Those numbers are nominal data. Player #20 is not twice as good as player #10. Player #0 might be really, really bad, or really, really good. We can’t know until we see him or her perform, or look at their other (non-nominal) data. That’s it. Simple, right?
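If it helps to see that in action, here is a minimal Python sketch. The jersey numbers are invented for illustration and don't come from any real game. It shows that counting nominal labels is meaningful, while averaging them runs without complaint and still tells you nothing.

```python
from collections import Counter
from statistics import mean

# Hypothetical jersey numbers of the players who scored in a made-up game.
scorers = [10, 20, 10, 7, 10, 20]

# Counting how often each label shows up is fair game for nominal data.
counts = Counter(scorers)
most_common_jersey, times_scored = counts.most_common(1)[0]
print(f"Jersey #{most_common_jersey} scored {times_scored} times")

# The arithmetic works, but the answer describes nothing real:
# nobody on the field is wearing this number.
meaningless_average = mean(scorers)
print(f"'Average jersey': {meaningless_average:.2f}")
```

The software will happily compute the second number for you, which is exactly why the label you give the variable matters.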
Ordinal data have a meaningful order, but the distance between one value and the next isn’t guaranteed to be equal. You would use these data when you want to rank order (ord = order) things. The classic example is a Likert-type item. You’ve seen it before. It goes something like this: 1 = I hate these blog posts, 2 = I’m indifferent to these blog posts, 3 = these blog posts shape the world in which I live, and I cannot wait for new ones. A 3 beats a 2 and a 2 beats a 1, but nothing says the gap between hate and indifference is the same size as the gap between indifference and devotion. That’s what ordinal data are.
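Here is a small Python sketch of why order is the only thing you can lean on with ordinal data. The responses and the alternative coding (1, 2, 10) are made up for the demonstration. Recoding the scale in any way that preserves the order leaves the median pointing at the same answer, while the mean moves around.

```python
from statistics import mean, median_low

# Hypothetical responses to the three-point item above.
responses = [1, 3, 3, 2, 3, 1, 2, 3, 3]

# An equally valid coding of the same answers: the order is preserved,
# but the spacing between categories is different.
recode = {1: 1, 2: 2, 3: 10}
recoded = [recode[r] for r in responses]

# The median only relies on order, so it lands on the same category.
typical = median_low(responses)
typical_recoded = median_low(recoded)
print(typical, typical_recoded)  # both correspond to answer 3

# The mean relies on spacing, so it changes with the arbitrary coding.
print(round(mean(responses), 2), round(mean(recoded), 2))
```

That is the practical reason medians and ranks are the safe summaries for ordinal variables.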
Interval data finally take on a quality that most of us are familiar with, at least in the mathematical sense: the distance between any two neighboring values is the same everywhere on the scale. The simplest example is temperature in degrees Fahrenheit or Celsius. The gap between 10° and 20° is the same size as the gap between 20° and 30°, so adding and subtracting make sense. What interval data lack is a true zero. Zero degrees doesn’t mean “no temperature”; it’s just another point on the scale, so 20° is not twice as warm as 10°. See how easy this is?
Ratio data represent a slight step up from interval data. SPSS lumps both of these together as “Scale” data, and for most analyses you can treat them the same way. Ratio data are different in that zero really means none of whatever you’re measuring, which is what makes ratios (hence the name) meaningful. Think of the numbers on a tape measure: zero inches is no length at all, and 10 inches is half as long as 20 inches. Miles driven, hours spent driving them, and the miles per hour you get by dividing one by the other are all ratio data.
Wow. That was easy, right? Properly categorizing data is awesome. When you do it correctly, it’s a super sweet skill. Furthermore, if you haven’t yet met the person you call your soul mate, get ready. A knowledge and understanding of proper nomenclature is like its own pheromone; people won’t know why, but they’ll be totally unable to resist you and your killer naming skills.