Descriptive Statistics

Data, Variables, and Values

  • Descriptive statistics is concerned with summarizing and presenting data.
  • Data are recorded facts, such as census records, economic statistics, seismographic logs and the results of studies, clinical trials, surveys and experiments.
  • A dataset is an organized collection of data, typically arranged as a table with rows (records) and columns.
  • For example, consider a dataset describing four countries.
  • The variables of the dataset are: Name, Population (in millions), Area (in thousands of square miles), and GDP (in billions of dollars). The variable Name identifies the entity (country) to which the other variables pertain.
  • The values of the variable Population, for example, are: 1,423, 1,438, 145, and 343. And the values of Area are: 3,601, 1,148, 6,323, and 3,537.
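As a rough illustration, the Population and Area values listed above can be held in a small column-oriented structure. The names `dataset`, `values_of`, and `record` are invented for this sketch. Only the two variables whose values appear in the text are included, and the sketch assumes both lists follow the same row order.

```python
# Column-oriented sketch of the dataset: each variable maps to its list of values.
# Only Population and Area are included because only their values are given in the text.
dataset = {
    "Population": [1423, 1438, 145, 343],  # in millions
    "Area": [3601, 1148, 6323, 3537],      # in thousands of square miles
}


def values_of(variable):
    """Return the values recorded for one variable (one column)."""
    return dataset[variable]


def record(i):
    """Return row i as a dict mapping each variable to its value in that row."""
    return {variable: column[i] for variable, column in dataset.items()}


print(values_of("Population"))
print(record(0))
```

Here `values_of` reads down a column (all values of one variable), while `record` reads across a row (all variables for one entity).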

Categorical, Ordinal, Interval, and Ratio Variables

Categorical (Qualitative, Nominal) Variables

  • The values of categorical variables are labels with no meaningful order. For example:
    • Eye color: blue, brown, green
    • Blood type: A, B, AB, O
    • Car brand: Toyota, Ford, Honda
  • Categorical values may be numeric, for example zip codes. In that case x = y and x ≠ y are meaningful, but x > y and x < y are not.
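A minimal sketch of working with a numeric-looking categorical variable, assuming a made-up sample of zip codes. Storing the codes as strings is one possible design choice that keeps arithmetic and ordering out of reach. The only operations used are equality tests and counting.

```python
from collections import Counter

# Hypothetical sample of zip codes, stored as strings because they are labels.
zip_codes = ["10001", "94105", "10001", "60601"]

# Equality and inequality are meaningful for categorical values.
same_area = zip_codes[0] == zip_codes[2]
different_area = zip_codes[0] != zip_codes[1]

# Counting how often each label occurs relies only on equality.
counts = Counter(zip_codes)
most_common_zip, frequency = counts.most_common(1)[0]

print(same_area, different_area)
print(most_common_zip, frequency)
```

The most frequent label (the mode) is a summary that makes sense for categorical data, whereas an average of zip codes would not.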

Ordinal Variables

  • The order of the values of ordinal variables, and only the order, is meaningful. That is,  x > y and x < y are meaningful in addition to x = y and x ≠ y.  But x – y and x + y are meaningless.
  • For example, the values of Education Level might be coded as the numbers 1 to 6:
    • 1 = No high school diploma
    • 2 = High school diploma or GED
    • 3 = Some college credits earned but no degree
    • 4 = Associate degree
    • 5 = Bachelor’s degree
    • 6 = Master’s, doctorate, or professional degree
  • Suppose Amy has education level x and Mike has education level y. Then:
    • If x > y, Amy has had more formal education than Mike.
  • But 5 – 4 = 2 – 1 has no meaningful interpretation. It would say that the difference in education level between having a bachelor’s degree and having an associate degree equals the difference between having and not having a high school diploma, a claim the coding does not support.
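The following sketch uses the Education Level codes 1 to 6 from the list above to show operations that depend only on order. The sample `levels` is hypothetical, and `statistics.median_low` is chosen here because it always returns one of the actual data values rather than averaging two codes.

```python
import statistics

# Education Level codes from the list above
# (1 = no high school diploma ... 6 = master's, doctorate, or professional degree).
amy, mike = 5, 2

# Order comparisons are meaningful for ordinal values.
amy_has_more_education = amy > mike

# Hypothetical sample of education levels for seven people.
levels = [1, 2, 2, 3, 5, 5, 6]

# The median depends only on order; median_low returns an actual data value.
median_level = statistics.median_low(levels)

print(amy_has_more_education)
print(median_level)
```

The mean of `levels` could be computed mechanically, but it would treat the gaps between codes as equal, which is exactly what an ordinal scale does not guarantee.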

Interval Variables

  • The values of interval variables have equal intervals as well as meaningful order. That is, differences x – y are meaningful in addition to x > y and x < y. But ratios x / y are not meaningful, because the zero point of the scale is arbitrary.
  • Consider year of birth. Suppose Amy was born in year x and Mike in year y. Then:
    • If x > y, Amy is younger than Mike.
    • If x – y = 2, Amy is two years younger than Mike.
    • But x / y is not meaningful: the ratio of two birth years says nothing about how many times as old one person is as the other.
  • Other examples of interval variables are temperature in Fahrenheit or Celsius, IQ scores, and dates.
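One way to see why ratios fail for interval variables is to express the same temperatures in Celsius and Fahrenheit. The sketch below uses hypothetical readings and the standard conversion F = C × 9/5 + 32. The helper name `c_to_f` is invented for the example.

```python
def c_to_f(celsius):
    """Convert a temperature from Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32


# Hypothetical temperatures in Celsius, and the same temperatures in Fahrenheit.
a_c, b_c, c_c = 10.0, 20.0, 30.0
a_f, b_f, c_f = c_to_f(a_c), c_to_f(b_c), c_to_f(c_c)

# Comparing differences gives the same answer on both scales.
equal_steps_c = (b_c - a_c) == (c_c - b_c)
equal_steps_f = (b_f - a_f) == (c_f - b_f)

# The ratio of two temperatures depends on the scale, so it carries no meaning.
ratio_c = b_c / a_c
ratio_f = b_f / a_f

print(equal_steps_c, equal_steps_f)
print(ratio_c, ratio_f)
```

Both scales agree that the two temperature steps are equal, but they disagree on the ratio of the second reading to the first, so "twice as hot" is not a statement about the temperatures themselves.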

Ratio Variables

  • The values of ratio variables, like those of interval variables, have equal intervals and meaningful order. They also have meaningful ratios and a “true zero,” denoting a complete absence of the quantity being measured.
  • Consider a person’s age in years.  Suppose Amy is x years of age and Mike is y years of age. Then:
    • If x > y, Amy is older than Mike.
    • If x – y = 2, Amy is two years older than Mike.
    • If x / y = 2, Amy is twice as old as Mike (meaningful ratio).
    • If x = 1, Amy is one year old.
    • If x = 1/365, Amy is one day old (meaningful ratio).
    • If x = 0, Amy has no age (true zero).
  • Other examples are: weight, height, income, distance, duration of time, and temperature in Kelvin.
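For a ratio variable, changing the unit of measurement rescales the values but leaves ratios and the zero point unchanged. The sketch below checks this for age measured in years and in days, using hypothetical ages and the 365-day year implied by the 1/365 example above.

```python
DAYS_PER_YEAR = 365  # simplification that ignores leap years

# Hypothetical ages in years.
amy_years, mike_years = 30, 15

# The same ages expressed in days.
amy_days = amy_years * DAYS_PER_YEAR
mike_days = mike_years * DAYS_PER_YEAR

# Changing the unit rescales the values but leaves the ratio unchanged.
ratio_in_years = amy_years / mike_years
ratio_in_days = amy_days / mike_days

# Zero stays zero in every unit: a true zero.
zero_in_days = 0 * DAYS_PER_YEAR

print(ratio_in_years, ratio_in_days)
print(zero_in_days)
```

Because the ratio is the same in either unit, "Amy is twice as old as Mike" is a statement about the ages themselves, not about the scale used to record them.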