Random Variables and Probability Distributions

Table of Contents

  1. Specifying Probabilities
  2. Probability Distribution
  3. Random Variables
  4. Binomial Distribution
  5. Normal Distribution
  6. Expectation of a Random Variable
  7. Standard Deviation of a Random Variable
  8. Variance of a Random Variable
Specifying Probabilities
  • Single Probability
    • The probability of rolling a seven = ⅙
  • Probability Range
    • The probability of rolling 5 through 9 = ⅔
    • The probability of rolling at least 10 = ⅙
  • Probability Distribution
    • The probabilities of rolling 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 are 1/36, 1/18, 1/12, 1/9, 5/36, 1/6, 5/36, 1/9, 1/12, 1/18, 1/36 respectively.
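The three ways of specifying probabilities above can be reproduced by enumerating the 36 equally likely outcomes of two fair dice. This is a minimal Python sketch, not the Mathematica used elsewhere in these notes; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Enumerate the 36 equally likely outcomes of two fair six-sided dice
# and accumulate the probability of each total (2 through 12).
dice = {}
for a, b in product(range(1, 7), repeat=2):
    dice[a + b] = dice.get(a + b, Fraction(0)) + Fraction(1, 36)

p_seven = dice[7]                                    # single probability
p_5_to_9 = sum(dice[t] for t in range(5, 10))        # probability range
p_at_least_10 = sum(dice[t] for t in range(10, 13))  # probability range

print(p_seven, p_5_to_9, p_at_least_10)  # 1/6 2/3 1/6
```

The dictionary `dice` is the full probability distribution; the single probability and the two ranges are read off it.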
Probability Distribution
  • A probability distribution is an assignment of probabilities to a set of values.
    • The Rolling Dice distribution assigns probabilities to the possible outcomes of a roll, the integers 2 through 12.
  • Probability distributions
    • have names like
      • Normal, Poisson, Chi-Square, Student T, Binomial, Hypergeometric, Beta, F
    • are discrete or continuous
      • A discrete distribution has gaps between its values
      • A continuous distribution has no gaps between its values, e.g. the normal distribution
      • Rolling Dice is discrete
    • may take input parameters, e.g.
      • the binomial distribution: number of trials and probability of success
      • the normal distribution: mean and standard deviation
      • Rolling Dice has no parameters
    • have properties such as
      • expectation (mean), median, standard deviation, variance, quantile, skewness, kurtosis
      • Rolling Dice
        • Expectation = 7
        • Median = 7
        • Standard deviation = 2.41523
        • 1/4 Quantile = 5
    • sum to 1
      • Rolling Dice
        • 1/36 + 1/18 + 1/12 + 1/9 + 5/36 + 1/6 + 5/36 + 1/9 + 1/12 + 1/18 + 1/36 = 1
  • Individual probabilities and ranges can be calculated from a probability distribution
    • Boxcars
      • Probability[x = 12, Distributed[x, Dice]] = 1/36
    • 7 or 11
      • Probability[x = 7 or x = 11, Distributed[x, Dice]] = 2/9
    • x ≥ 10
      • Probability[x ≥ 10, Distributed[x, Dice]] = 1/6
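  • A probability distribution is different from a frequency distribution: a frequency distribution tallies how often each value occurred in observed data, whereas a probability distribution assigns a probability to each possible value.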
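The Rolling Dice properties and event probabilities listed in this section can be checked with a short Python sketch. The helper names `probability` and `quantile` are hypothetical, chosen only to mirror the notation above; they are not Mathematica functions.

```python
from fractions import Fraction
from math import sqrt

# Rolling Dice distribution: total t has probability (6 - |t - 7|) / 36.
dice = {t: Fraction(6 - abs(t - 7), 36) for t in range(2, 13)}

def probability(event, dist):
    """Sum the probabilities of the values for which event(value) is true."""
    return sum((p for x, p in dist.items() if event(x)), Fraction(0))

def quantile(q, dist):
    """Smallest value whose cumulative probability reaches q."""
    total = Fraction(0)
    for x in sorted(dist):
        total += dist[x]
        if total >= q:
            return x

expectation = sum(x * p for x, p in dice.items())                   # 7
variance = sum((x - expectation) ** 2 * p for x, p in dice.items()) # 35/6
std_dev = sqrt(float(variance))                                     # about 2.41523
median = quantile(Fraction(1, 2), dice)                             # 7
first_quartile = quantile(Fraction(1, 4), dice)                     # 5

boxcars = probability(lambda x: x == 12, dice)                 # 1/36
seven_or_eleven = probability(lambda x: x in (7, 11), dice)    # 2/9
at_least_10 = probability(lambda x: x >= 10, dice)             # 1/6
```

Exact fractions are used so that the results match the values in the text without rounding.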
Random Variables
    • Random variables are variables with built-in probability distributions.
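    • Let random variable D be the outcome of rolling a pair of dice. Then D takes the values 2 through 12 with the Rolling Dice probabilities, e.g. P[D = 7] = 1/6 and P[D ≥ 10] = 1/6.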
    • A random variable
      • is usually designated by a capital letter
      • is defined by its probability distribution
      • has the properties of its probability distribution
        • discrete or continuous
        • expectation (mean), median, standard deviation, variance, quantile, skewness, kurtosis
    • Random variables are a neat, simple way of doing math involving probability distributions. For example:
      • Let random variable Y ~ BinomialDistribution[10, 0.2].
      • Let random variable Z ~ BinomialDistribution[10, 0.8], independent of Y.
        • Then the probability that Y and Z take given values together is the product of their individual probabilities.
      • Thus,
        • If P[Y = 4] = 0.0880804 and P[Z = 4] = 0.00550502, then
          • P[Y = 4 and Z = 4] = 0.0880804 × 0.00550502 = 0.00048488
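The arithmetic in this example can be checked with a short Python sketch. It assumes Y and Z are independent, which is what justifies multiplying the two probabilities; the function name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P[X = k] for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_y4 = binomial_pmf(4, 10, 0.2)   # about 0.0880804
p_z4 = binomial_pmf(4, 10, 0.8)   # about 0.00550502

# Probability that both variables equal 4 at once.
# Valid only under the independence assumption stated above.
p_both_4 = p_y4 * p_z4            # about 0.00048488
```

The product gives the probability that both variables equal 4 in the same experiment; it is not the distribution of a new variable formed by multiplying Y and Z.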
Binomial Distribution
    • The binomial distribution gives the probabilities of the possible outcomes of a series of binary tests, called Bernoulli Trials, where the result is either one thing or another: success or failure, heads or tails, right or wrong, 1 or 0.
    • It has two input parameters:
      • n = the number of trials.
      • p = the probability of success (heads, right, 1) on a given trial.
    • Let random variable X = the number of heads in 10 flips of an unbiased coin
      • X takes the values 0 through 10, with probabilities assigned by the Binomial Distribution[n, p], where n = 10 and p = 0.5
    • That is, X ~ Binomial Distribution[10, 0.5]
      • where the tilde means “has probability distribution such and such”
    • This bar chart displays the probabilities assigned to the values of X.
      • The values of X are on the horizontal axis
      • The probabilities of X are on the vertical axis.
    • The binomial distribution for different n and p:

    Increasing n moves the distribution to the right

    Changing p moves the distribution and alters its shape

    • The binomial distribution is discrete, meaning there are gaps between values.
      • P[X = 4] = 0.205078
      • P[X = 4.5] = 0, because 4.5 is not a possible value of X
      • P[X = 5] = 0.246094
        • where X ~ Binomial Distribution[10, 0.5]
    • The probabilities of the values of a random variable total one.
      • Probability[x ≥ 0 and x ≤ 10, BinomialDistribution[10, 0.5]] = 1.0
    • Examples of probabilities derived from the binomial distribution, using Mathematica
      • Probability[x ≥ 0, BinomialDistribution[10, 0.5]] = 1
      • Probability[x = 5, BinomialDistribution[10, 0.5]] = 0.246094
      • Probability[x ≥ 8, BinomialDistribution[10, 0.5]] = 0.0546875
      • Probability[x ≥ 4 and x ≤ 6, BinomialDistribution[10, 0.5]] = 0.65625
      • Probability[x ≥ 8 or x ≤ 2, BinomialDistribution[10, 0.5]] = 0.109375
    • Random numbers generated by the binomial distribution, using Mathematica
      • RandomVariate[BinomialDistribution[10, 0.5], 50]
        • 6,5,5,6,4,5,4,3,7,5,3,4,8,5,5,6,4,7,0,5,6,5,7,5,4,4,5,3,8,3,3,5,6,4,7,6,6,9,3,5,5,7,2,5,6,4,5,3,4,5
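For readers without Mathematica, the binomial probabilities and random numbers above can be approximated in plain Python. This is a sketch under stated assumptions: `binomial_probability` and `binomial_variate` are hypothetical helpers, and the fixed seed is only there to make runs repeatable, so the sample will not match the 50 values listed above.

```python
import random
from math import comb

def binomial_pmf(k, n, p):
    """P[X = k] for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_probability(event, n, p):
    """Sum P[X = k] over the values k in 0..n for which event(k) is true."""
    return sum(binomial_pmf(k, n, p) for k in range(n + 1) if event(k))

n, p = 10, 0.5
p_eq_5 = binomial_probability(lambda k: k == 5, n, p)             # about 0.246094
p_ge_8 = binomial_probability(lambda k: k >= 8, n, p)             # 0.0546875
p_4_to_6 = binomial_probability(lambda k: 4 <= k <= 6, n, p)      # 0.65625
p_tails = binomial_probability(lambda k: k >= 8 or k <= 2, n, p)  # 0.109375

def binomial_variate(n, p, rng):
    """Count the successes in n simulated Bernoulli trials."""
    return sum(rng.random() < p for _ in range(n))

rng = random.Random(0)  # seed chosen arbitrarily for repeatability
sample = [binomial_variate(n, p, rng) for _ in range(50)]
```

Each simulated value is the number of successes in 10 Bernoulli trials, which is the definition of a binomial random variable given at the start of this section.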
Normal Distribution
    • The most widely used continuous distribution is the Normal Distribution, a bell-shaped, symmetric distribution that approximates many natural quantities such as blood pressure, heights, and measurement errors.
    • The Normal Distribution has two input parameters:
      • μ = the mean of the distribution
      • σ = the standard deviation of the distribution
    • A normal distribution that approximates adult male heights in inches, with μ = 70 and σ = 4:
    • The standard normal distribution is defined by its parameters: μ = 0 and σ = 1:
    • The normal distribution for other μ and σ:

    Increasing σ flattens the distribution

    Changing μ moves the distribution left and right

    • The normal distribution is continuous, meaning there are no gaps between values.
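      • Thus any value within the range of the distribution is possible, and probabilities are assigned to ranges of values rather than to single values.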
    • Examples of probabilities derived from the normal distribution for μ=70 and σ = 4, using Mathematica
      • Probability[x ≥ 12 × 6, NormalDistribution[70, 4]] = 0.31 (taller than 6 feet)
      • Probability[x ≥ 12 × 6.5, NormalDistribution[70, 4]] = 0.023 (taller than 6.5 feet)
      • Probability[x ≥ 12 × 5 and x ≤ 12 × 6, NormalDistribution[70, 4]] = 0.69 (between 5 and 6 feet)
    • The probability between two numbers under a continuous distribution is defined as the area under its curve between the numbers.
      • The probability that a random number falls between 0.4 and 0.6 under the standard normal distribution = 0.0703251
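The normal probabilities in this section can be checked with Python's standard-library `statistics.NormalDist`, used here in place of Mathematica. The sketch also approximates the area under the standard normal curve between 0.4 and 0.6 with a midpoint rule; the step count is an arbitrary choice.

```python
from statistics import NormalDist

# Adult male heights in inches, as parameterized in the text.
heights = NormalDist(mu=70, sigma=4)
p_over_6ft = 1 - heights.cdf(12 * 6)                      # about 0.31
p_over_6_5ft = 1 - heights.cdf(12 * 6.5)                  # about 0.023
p_5_to_6ft = heights.cdf(12 * 6) - heights.cdf(12 * 5)    # about 0.69

# Standard normal distribution: mu = 0, sigma = 1.
std = NormalDist()
p_04_06 = std.cdf(0.6) - std.cdf(0.4)                     # about 0.0703251

# The same probability as an area under the density curve (midpoint rule).
steps = 1_000
dx = (0.6 - 0.4) / steps
area = sum(std.pdf(0.4 + (i + 0.5) * dx) * dx for i in range(steps))
```

The value `area` agrees with `p_04_06` to many decimal places, which illustrates that a probability under a continuous distribution is an area under its curve.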
Expectation of a Random Variable
    • The expectation of a random variable is its probability-weighted average, i.e. the sum (or integral) of its probability-weighted values.
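    • Expectation of the discrete random variable X, where X ~ Binomial Distribution[10, 0.5]
      • Equals the sum of each value times its probability:
        • $E(X) = \sum_{k=0}^{10} k \, P[X = k] = \sum_{k=0}^{10} k \binom{10}{k} (0.5)^{10} = 5$
    • Expectation of the continuous random variable X, where X ~ NormalDistribution[5, 2]
      • Equals the integral of each value times its probability density f(x):
        • $E(X) = \int_{-\infty}^{\infty} x \, f(x) \, dx = \mu = 5$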
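Both expectations can be computed numerically as a check. In this Python sketch the discrete case is an exact finite sum, while the continuous case approximates the integral with a midpoint rule; the integration limits (μ ± 10σ) and the step count are assumptions made for illustration.

```python
from math import comb
from statistics import NormalDist

# Discrete case: X ~ Binomial(10, 0.5); sum of value times probability.
n, p = 10, 0.5
discrete_expectation = sum(
    k * comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)
)

# Continuous case: X ~ Normal(5, 2); integral of x * f(x).
dist = NormalDist(mu=5, sigma=2)
lo = dist.mean - 10 * dist.stdev   # truncation point, an arbitrary choice
hi = dist.mean + 10 * dist.stdev
steps = 20_000
dx = (hi - lo) / steps
continuous_expectation = sum(
    (lo + (i + 0.5) * dx) * dist.pdf(lo + (i + 0.5) * dx) * dx
    for i in range(steps)
)
```

Both results come out at 5 (up to floating-point error in the continuous case), matching n × p = 10 × 0.5 for the binomial and μ for the normal.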
Standard Deviation of a Random Variable
    • The standard deviation of a random variable is a probability-weighted, mathematically well-behaved “average” distance of its values from its expectation.
      • (The more intuitive, but mathematically recalcitrant, average is the mean absolute deviation.)
    • The standard deviation is defined as the square root of the variance.
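To make the contrast concrete, the following Python sketch computes both the standard deviation and the mean absolute deviation for X ~ Binomial(10, 0.5), the example used elsewhere in these notes. The variable names are illustrative.

```python
from math import comb, sqrt

n, p = 10, 0.5
pmf = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

mean = sum(k * q for k, q in pmf.items())

# Standard deviation: square root of the probability-weighted
# average squared distance from the mean.
variance = sum((k - mean) ** 2 * q for k, q in pmf.items())
std_dev = sqrt(variance)                                      # about 1.5811

# Mean absolute deviation: probability-weighted average distance.
mean_abs_dev = sum(abs(k - mean) * q for k, q in pmf.items()) # about 1.2305
```

The two measures differ (about 1.58 versus about 1.23 here) because squaring gives more weight to values far from the mean.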
Variance of a Random Variable
    • The variance of a random variable X is the expectation of (X – the expectation of X)²
      • That is, E[(X – E(X))²]
      • Alternatively, E(X²) – (E(X))²
    • Variance of the discrete random variable X, where X ~ Binomial Distribution[10, 0.5]
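      • The variance of X = the expectation of (X – the expectation of X)²
      • The expectation of X = 5
      • So the variance is the expectation of (X – 5)²
      • Which is the probability-weighted sum of (X – 5)²
      • In mathematical terms:
          • $\mathrm{Var}(X) = E[(X - 5)^2] = \sum_{k=0}^{10} (k - 5)^2 \, P[X = k]$
      • Which is the sum of:
          • $\sum_{k=0}^{10} (k - 5)^2 \binom{10}{k} (0.5)^{10}$
          • = 2.5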
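    • Variance of the continuous random variable X, where X ~ NormalDistribution[5, 2]
      • $E[(X - E(X))^2] = \int_{-\infty}^{\infty} (x - 5)^2 f(x) \, dx = \sigma^2 = 4$, where f(x) is the probability density of X
    • Alternatively
      • $E(X^2) - (E(X))^2 = 29 - 25 = 4$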
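The two variance formulas can be compared numerically for both example distributions. This Python sketch uses hypothetical helpers `expect_discrete` and `expect_continuous`; the continuous helper approximates the integral with a midpoint rule over μ ± 10σ, an arbitrary truncation chosen for illustration.

```python
from math import comb
from statistics import NormalDist

def expect_discrete(g, pmf):
    """E[g(X)] for a discrete X given as {value: probability}."""
    return sum(g(k) * q for k, q in pmf.items())

def expect_continuous(g, dist, steps=20_000, width=10):
    """Midpoint-rule approximation of E[g(X)] over mean +/- width * stdev."""
    lo = dist.mean - width * dist.stdev
    dx = 2 * width * dist.stdev / steps
    return sum(
        g(lo + (i + 0.5) * dx) * dist.pdf(lo + (i + 0.5) * dx) * dx
        for i in range(steps)
    )

# X ~ Binomial(10, 0.5)
n, p = 10, 0.5
pmf = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}
mean_b = expect_discrete(lambda k: k, pmf)
var_b_def = expect_discrete(lambda k: (k - mean_b) ** 2, pmf)    # E[(X - E(X))^2]
var_b_alt = expect_discrete(lambda k: k**2, pmf) - mean_b**2     # E(X^2) - (E(X))^2

# X ~ Normal(5, 2)
dist = NormalDist(5, 2)
mean_n = expect_continuous(lambda x: x, dist)
var_n_def = expect_continuous(lambda x: (x - mean_n) ** 2, dist)
var_n_alt = expect_continuous(lambda x: x**2, dist) - mean_n**2
```

Both formulas give 2.5 for the binomial example and approximately 4 (that is, σ²) for the normal example.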