Descriptive Statistics is concerned with summarizing and presenting data.
Data are recorded facts, such as census records, economic statistics, seismographic logs and the results of studies, clinical trials, surveys and experiments.
A dataset is an organized collection of data, typically arranged as a table with rows (records) and columns.
For example, consider a dataset describing four countries. Its variables are: Name, Population (in millions), Area (in thousands of square miles) and GDP (in billions of dollars). The variable Name identifies the entity (country) to which the other variables pertain.
The values of the variable Population, for example, are: 1,423, 1,438, 145, and 343. And the values of Area are: 3,601, 1,148, 6,323, and 3,537.
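One way to picture rows, variables, and values is a minimal Python sketch. It assumes the Population and Area values quoted above are listed in the same row order. The labels "A" to "D" are placeholders rather than the actual Name values, GDP is left out because its values are not quoted, and the helper name column is chosen only for illustration.

```python
# Each dict is one row (record); each key is one variable (column).
# "A".."D" are placeholder labels, not the dataset's real Name values.
dataset = [
    {"name": "A", "population": 1423, "area": 3601},
    {"name": "B", "population": 1438, "area": 1148},
    {"name": "C", "population": 145, "area": 6323},
    {"name": "D", "population": 343, "area": 3537},
]


def column(records, variable):
    """Return the values of one variable, in row order."""
    return [record[variable] for record in records]


populations = column(dataset, "population")  # in millions
areas = column(dataset, "area")              # in thousands of square miles

print(populations)
print(areas)
```

Reading down a key gives the values of a variable; reading across a dict gives one record.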
Categorical, Ordinal, Interval, and Ratio Variables
Categorical (Qualitative, Nominal) Variables
The values of categorical variables are labels with no meaningful order. For example:
Eye color: blue, brown, green
Blood type: A, B, AB, O
Car brand: Toyota, Ford, Honda
Categorical values may be numeric, for example zip codes, in which case x = y and x ≠ y are meaningful but x > y and x < y are not.
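Because only equality is meaningful, the natural summaries of a categorical variable are counts and the most frequent label. The sketch below illustrates this with a made-up sample of blood types and two made-up zip codes; storing zip codes as strings is an assumed convention that discourages arithmetic on them.

```python
from collections import Counter

# Made-up sample of a categorical variable.
blood_types = ["A", "O", "B", "O", "AB", "O", "A"]

# Counting relies only on x = y and x != y.
counts = Counter(blood_types)
mode, mode_count = counts.most_common(1)[0]
print(counts)
print(mode, mode_count)

# Numeric-looking labels: kept as strings so they are compared, not computed with.
zip_amy = "10001"
zip_mike = "94105"
same_zip = zip_amy == zip_mike  # meaningful comparison
print(same_zip)
```

Sorting the zip codes or averaging them would run without error, but the result would carry no meaning for a categorical variable.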
Ordinal Variables
The order of the values of ordinal variables, and only the order, is meaningful. That is, x > y and x < y are meaningful in addition to x = y and x ≠ y. But x – y and x + y are meaningless.
For example, the values of Education Level might be the numbers 1 to 6, assigned as follows:
1 = No high school diploma
2 = High school diploma or GED
3 = Some college credits earned but no degree
4 = Associate degree
5 = Bachelor’s degree
6 = Master’s, doctorate, or professional degree
Suppose Amy has educational level x and Mike educational level y. Then:
If x > y, Amy has had more formal education than Mike.
But 5 – 4 = 2 – 1 does not mean that the difference in education level between having a bachelor’s degree and having an associate degree equals the difference between having and not having a high school diploma. The codes record rank only, so differences between them carry no meaning.
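The rule "order yes, arithmetic no" can be sketched in Python with a small class that supports comparison but deliberately defines no subtraction. The class name EducationLevel and its layout are illustrative assumptions, not a standard API.

```python
from functools import total_ordering

LABELS = {
    1: "No high school diploma",
    2: "High school diploma or GED",
    3: "Some college credits earned but no degree",
    4: "Associate degree",
    5: "Bachelor's degree",
    6: "Master's, doctorate, or professional degree",
}


@total_ordering
class EducationLevel:
    """An ordinal value: comparable, but with no arithmetic defined."""

    def __init__(self, code):
        if code not in LABELS:
            raise ValueError(f"unknown education level: {code}")
        self.code = code

    def __eq__(self, other):
        if not isinstance(other, EducationLevel):
            return NotImplemented
        return self.code == other.code

    def __lt__(self, other):
        if not isinstance(other, EducationLevel):
            return NotImplemented
        return self.code < other.code

    def __hash__(self):
        return hash(self.code)

    def __repr__(self):
        return f"EducationLevel({self.code}: {LABELS[self.code]})"


amy = EducationLevel(5)
mike = EducationLevel(4)

amy_has_more = amy > mike  # order is meaningful

try:
    amy - mike  # no __sub__ is defined, so this raises TypeError
    subtraction_allowed = True
except TypeError:
    subtraction_allowed = False

print(amy_has_more, subtraction_allowed)
```

Storing the codes as plain integers would allow 5 – 4 to be computed; the wrapper simply makes the meaningless operation fail loudly.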
Interval Variables
The values of interval variables have equal intervals as well as meaningful order. That is, differences such as x – y are meaningful in addition to x > y and x < y. But x / y is not meaningful, because the zero point of the scale is arbitrary.
Consider year of birth. Suppose Amy was born in year x and Mike in year y. Then:
If x > y, Amy is younger than Mike.
If x – y = 2, Amy is two years younger than Mike.
But x / y = 2 does not mean that Amy is twice as old as Mike.
Other examples of interval variables are temperature in Fahrenheit or Celsius, IQ scores, and dates.
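A quick way to see why ratios fail on an interval scale is to express the same temperatures in both Celsius and Fahrenheit. In the sketch below the three temperatures are arbitrary illustrative values; the conversion is the standard F = C × 9/5 + 32.

```python
def celsius_to_fahrenheit(c):
    """Standard conversion between two interval scales."""
    return c * 9 / 5 + 32


# Arbitrary illustrative temperatures in Celsius.
c_lo, c_mid, c_hi = 10.0, 15.0, 20.0
f_lo, f_mid, f_hi = (celsius_to_fahrenheit(c) for c in (c_lo, c_mid, c_hi))

# Ratios of values depend on the scale, so they are not meaningful.
ratio_c = c_hi / c_lo  # 20 / 10
ratio_f = f_hi / f_lo  # 68 / 50

# Ratios of differences are the same on both scales (equal intervals).
diff_ratio_c = (c_hi - c_mid) / (c_mid - c_lo)
diff_ratio_f = (f_hi - f_mid) / (f_mid - f_lo)

print(ratio_c, ratio_f)
print(diff_ratio_c, diff_ratio_f)
```

20 °C looks like "twice" 10 °C only until the same two temperatures are written as 68 °F and 50 °F, whereas the comparison of intervals survives the change of scale.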
Ratio Variables
The values of ratio variables, like those of interval variables, have equal intervals and meaningful order. They also have meaningful ratios and a “true zero,” denoting the complete absence of the quantity being measured.
Consider a person’s age in years. Suppose Amy is x years of age and Mike is y years of age. Then:
If x > y, Amy is older than Mike.
If x – y = 2, Amy is two years older than Mike.
If x / y = 2, Amy is twice as old as Mike. (Meaningful Ratio)
If x = 1, Amy is one year old.
If x = 1/365, Amy is one day old. (Meaningful Ratio)
If x = 0, Amy has no age. (True Zero)
Other examples are: weight, height, income, distance, duration of time, and temperature in Kelvin.
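By contrast with interval scales, a ratio survives a change of units, because converting units only multiplies by a constant and leaves zero at zero. The following sketch checks this for age measured in years and in months; the ages are made-up values.

```python
MONTHS_PER_YEAR = 12

# Made-up ages in years.
amy_years, mike_years = 30, 15

# The same ages expressed in months.
amy_months = amy_years * MONTHS_PER_YEAR
mike_months = mike_years * MONTHS_PER_YEAR

ratio_in_years = amy_years / mike_years
ratio_in_months = amy_months / mike_months

# True zero: no age in years is also no age in months.
zero_in_months = 0 * MONTHS_PER_YEAR

print(ratio_in_years, ratio_in_months, zero_in_months)
```

Amy is twice as old as Mike whichever unit is used, which is what makes the statement x / y = 2 meaningful for a ratio variable.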