Section 3

The Linguistics of Probability (BH Chapter 3)

Random Variables

Formal Definition - A random variable X is a function mapping the sample space S into the real line.

Descriptive Definition - A random variable is a numerical summary of the outcome of an experiment. Its randomness comes from the randomness of which outcome occurs, and each outcome has a certain probability. A discrete random variable can take on only a finite (or countably infinite) number of values. Random variables are usually denoted by capital letters, often X and Y.

Distributions

A distribution describes the probability that a random variable takes on certain values. Some distributions are commonly used in statistics because they can help model real life phenomena.
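To make the two definitions concrete, here is a minimal sketch under an assumed experiment that is not from the notes: two fair coin flips, with X counting the heads. The names `sample_space`, `X`, and `pmf` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Sample space S for two fair coin flips; all four outcomes are equally likely.
sample_space = list(product("HT", repeat=2))

def X(outcome):
    """Random variable: maps an outcome in S to a real number (count of heads)."""
    return outcome.count("H")

# Distribution of X: add up the probability of every outcome that maps to each value.
pmf = {}
for outcome in sample_space:
    value = X(outcome)
    pmf[value] = pmf.get(value, 0) + Fraction(1, len(sample_space))

for value in sorted(pmf):
    print(value, pmf[value])
```

The function `X` itself is deterministic. All of the randomness sits in which outcome of the sample space occurs.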

PMF, CDF, and Independence

Probability Mass Function (PMF) (Discrete Only) gives the probability that a random variable takes on the value x.

$$P_X(x) = P(X=x)$$

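Cumulative Distribution Function (CDF) gives the probability that a random variable takes on the value x or less.

$$F_X(x) = P(X \leq x)$$

Independence of discrete random variables X and Y means that, for all values x and y, the joint probability factors into the product of the individual probabilities.

$$P(X=x, Y=y) = P(X = x)P(Y = y)$$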
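As a rough illustration of these three ideas, the sketch below assumes a fair six-sided die (an example chosen here, not taken from the notes). It stores the PMF as a dictionary, builds the CDF by summing the PMF, and checks the factorization for two dice rolled separately.

```python
from fractions import Fraction
from itertools import product

# PMF of one fair six-sided die: P(X = x) = 1/6 for x = 1, ..., 6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf(x):
    """P(X <= x): sum the PMF over every value that is at most x."""
    return sum(p for value, p in pmf.items() if value <= x)

# Joint PMF of two dice rolled separately: all 36 ordered pairs are equally likely.
joint = {(x, y): Fraction(1, 36) for x, y in product(range(1, 7), repeat=2)}

# The dice are independent if the joint PMF factors for every pair (x, y).
independent = all(joint[(x, y)] == pmf[x] * pmf[y] for (x, y) in joint)

print(cdf(3), independent)
```

Exact fractions are used so that the equality check in `independent` is not affected by floating-point rounding.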

Bernoulli Distribution

The Bernoulli distribution is the simplest case of the Binomial distribution, where we only have one trial, or n = 1. Let us say that X is distributed Bern(p). We know the following:

Story. X “succeeds” (is 1) with probability p, and X “fails” (is 0) with probability 1 − p.

Example. A fair coin flip is distributed $Bern(\frac{1}{2})$.

PMF. The probability mass function of a Bernoulli is:

$$P(X = x) = p^x(1-p)^{1-x}$$
$$P(X = x) = \begin{cases} p, & x = 1 \\ 1-p, & x = 0 \end{cases}$$
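The closed form and the case-by-case form of the Bernoulli PMF describe the same function. A minimal sketch that evaluates the closed form and compares it with the two cases (the function name `bernoulli_pmf` is an assumption for illustration):

```python
def bernoulli_pmf(x, p):
    """P(X = x) for X ~ Bern(p), using the closed form p^x (1 - p)^(1 - x)."""
    if x not in (0, 1):
        return 0.0  # a Bernoulli random variable only takes the values 0 and 1
    return p**x * (1 - p) ** (1 - x)

p = 0.5  # fair coin flip
print(bernoulli_pmf(1, p), bernoulli_pmf(0, p))
```

With x = 1 the factor (1 - p)^0 is 1, leaving p. With x = 0 the factor p^0 is 1, leaving 1 - p.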

Binomial Distribution

Let us say that X is distributed Bin(n, p). We know the following:

Story. X is the number of “successes” that we will achieve in n independent trials, where each trial can be either a success or a failure, each with the same probability p of success.

Example. If LeBron James shoots 10 free throws and each one independently has a $\frac{3}{4}$ chance of going in, then $X$, the number of free throws that he makes, is a Binomial random variable distributed $Bin(10, \frac{3}{4})$.

PMF. The probability mass function of a Binomial is:

$$P(X = x) = {n \choose x} p^x(1-p)^{n-x}$$
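The Binomial PMF can be evaluated directly with the standard library. The sketch below plugs in the free-throw numbers, n = 10 and p = 3/4. The function name `binomial_pmf` is an assumption for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p): choose which x trials succeed, then weight them."""
    if not 0 <= x <= n:
        return 0.0  # outside the support {0, 1, ..., n}
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.75
for x in range(n + 1):
    print(x, round(binomial_pmf(x, n, p), 4))

# The probabilities over the whole support should sum to 1 (up to rounding).
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))
```

Setting n = 1 reduces the formula to the Bernoulli PMF, since the binomial coefficient is 1 for both x = 0 and x = 1.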

Hypergeometric Distribution

Let us say that X is distributed HGeom(w, b, n). We know the following:

Story. In a population of b undesired objects and w desired objects, X is the number of “successes” we will have in a draw of n objects, without replacement.

Example. 1) Let’s say that we have only b Weedles (failure) and w Pikachus (success) in Viridian Forest. We encounter n of the Pokemon in the forest, and X is the number of Pikachus in our encounters. 2) The number of aces that you draw in 5 cards (without replacement). 3) You have w white balls and b black balls, and you draw n balls. X is the number of white balls you will draw in your sample.

PMF. The probability mass function of a Hypergeometric is:

$$P(X = k) = \frac{{w \choose k}{b \choose n-k}}{{w + b \choose n}}$$
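As a sketch of the Hypergeometric PMF in use, take the aces example and assume a standard 52-card deck, so w = 4 aces, b = 48 other cards, and n = 5 cards drawn. The deck size and the function name `hypergeom_pmf` are assumptions made here.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(k, w, b, n):
    """P(X = k) for X ~ HGeom(w, b, n): k desired and n - k undesired objects drawn."""
    if k < 0 or k > w or n - k < 0 or n - k > b:
        return Fraction(0)  # impossible counts have probability 0
    return Fraction(comb(w, k) * comb(b, n - k), comb(w + b, n))

w, b, n = 4, 48, 5  # aces, non-aces, cards drawn
for k in range(n + 1):
    print(k, float(hypergeom_pmf(k, w, b, n)))
```

The numerator counts the ways to pick k of the w desired objects and n - k of the b undesired ones. The denominator counts all ways to pick n objects from the w + b available.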

Geometric Distribution

Let us say that X is distributed Geom(p). We know the following:

Story. X is the number of “failures” that we will achieve before we achieve our first success. Our successes have probability p.

Example. If each pokeball we throw has a $\frac{1}{10}$ probability of catching Mew, the number of failed pokeballs before the catch will be distributed $Geom(\frac{1}{10})$.

PMF. With q = 1 − p, the probability mass function of a Geometric is:

$$P(X = k) = q^kp$$
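A small sketch of the Geometric PMF, using the pokeball example with p = 1/10. Here k counts the failures before the first success, so the support starts at 0. The function name `geometric_pmf` is an assumption for illustration.

```python
def geometric_pmf(k, p):
    """P(X = k) for X ~ Geom(p): k failures in a row, then one success."""
    if k < 0:
        return 0.0  # the number of failures cannot be negative
    q = 1 - p
    return q**k * p

p = 0.1
for k in range(5):
    print(k, geometric_pmf(k, p))

# The support is infinite, so a long partial sum only approaches 1.
partial_total = sum(geometric_pmf(k, p) for k in range(500))
```

Some references instead count the number of trials up to and including the first success, which shifts the support to start at 1. The version here follows the failure-counting story above.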