Section 2

All Probability is Conditional Probability (BH Chapter 2)

"|" is a given - We read "|" as "given", that is, ABA|B means A occurring given that B occurred.

Conditional Probability - Suppose we observe event B and are interested in the probability of event A occurring given this information. Then, provided $P(B) > 0$,

$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
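As a small illustration of the definition, the sketch below computes a conditional probability by counting outcomes. The two-dice sample space, the events A and B, and the helper names `prob` and `cond_prob` are assumptions made for this example only.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
omega = list(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) under equally likely outcomes; event is a predicate on an outcome."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

def cond_prob(a, b):
    """P(A|B) = P(A and B) / P(B), assuming P(B) > 0."""
    return prob(lambda w: a(w) and b(w)) / prob(b)

A = lambda w: w[0] + w[1] == 8   # the sum of the dice is 8
B = lambda w: w[0] % 2 == 0      # the first die is even

p_a = prob(A)                    # 5 of 36 outcomes sum to 8
p_a_given_b = cond_prob(A, B)    # 3 of the 18 outcomes in B sum to 8
print(p_a, p_a_given_b)          # prints: 5/36 1/6
```

Observing B shrinks the sample space from 36 outcomes to 18, which is why the probability of A changes.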

Bayes’ Rule - This is arguably one of the most important concepts and tools you will learn in this course.

$$P(A|B) = \frac{P(B|A)P(A)}{P(B)}$$

Bridging Conditional Probability and Sets

An intuitive way to visualize conditional probability is to think about the intersection of sets. To find the probability of the intersection of two events A and B, we take one of them as our new sample space and find the probability of the other within it.

$$P(A \cap B) = P(B|A)P(A) = P(A|B)P(B)$$

$$P(A_1 \cap A_2 \cap A_3 \cap \cdots \cap A_n) = P(A_1 | A_2 \cap A_3 \cap \cdots \cap A_n) P(A_2 | A_3 \cap \cdots \cap A_n) \cdots P(A_{n-1} | A_n) P(A_n)$$
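The product formula above can be sanity-checked on a small card example. The sketch below assumes a standard 52-card deck and takes the events to be "the k-th card drawn is an ace" for k = 1, 2, 3, drawn without replacement; since intersections commute, the events may be conditioned in whichever order is most convenient. The variable names are illustrative only.

```python
from fractions import Fraction
from math import comb

# Chain rule with three events, conditioning in drawing order:
# P(A1 ∩ A2 ∩ A3) = P(A1) * P(A2 | A1) * P(A3 | A1 ∩ A2)
p_first = Fraction(4, 52)                  # 4 aces among 52 cards
p_second_given_first = Fraction(3, 51)     # 3 aces left among 51 cards
p_third_given_first_two = Fraction(2, 50)  # 2 aces left among 50 cards
p_chain = p_first * p_second_given_first * p_third_given_first_two

# Independent check by counting: 3-card hands made only of aces,
# divided by all possible 3-card hands.
p_count = Fraction(comb(4, 3), comb(52, 3))

print(p_chain, p_count)  # prints: 1/5525 1/5525
```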

Law of Total Probability (LOTP)

A common theme in this course is that it is far easier to solve a problem by breaking it down into smaller, simpler components than by tackling it head-on. LOTP is one such tool. Suppose you want to find the probability of some event B, and you can partition the sample space into disjoint events $A_1, A_2, \ldots, A_n$. Then,

$$P(B) = \sum_{i=1}^n P(B | A_i) P(A_i) = \sum_{i=1}^n P(B \cap A_i)$$

We often use LOTP with Bayes' rule! Specifically, the denominator of Bayes' rule, $P(B)$, is often difficult to calculate outright, so we will instead calculate it with LOTP.
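A minimal sketch of this pattern, using a diagnostic-test setting. All numbers here (1% prevalence, 95% sensitivity, 5% false positive rate) are hypothetical values chosen for illustration, not figures from the course. The partition is {disease, no disease} and B is the event of a positive test.

```python
from fractions import Fraction

# Hypothetical inputs, chosen only for illustration.
prior = Fraction(1, 100)                # P(D): prevalence of the disease
sensitivity = Fraction(95, 100)         # P(+ | D)
false_positive_rate = Fraction(5, 100)  # P(+ | not D)

# LOTP over the partition {D, not D} gives the denominator of Bayes' rule:
# P(+) = P(+ | D) P(D) + P(+ | not D) P(not D)
p_positive = sensitivity * prior + false_positive_rate * (1 - prior)

# Bayes' rule: P(D | +) = P(+ | D) P(D) / P(+)
posterior = sensitivity * prior / p_positive

print(p_positive, posterior)  # prints: 59/1000 19/118
```

Under these assumed numbers the posterior is about 0.16: most positive results come from the much larger healthy group, which is exactly what the LOTP denominator accounts for.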

Extra Conditioning

Incorporating extra information $C$ is a simple extension of Bayes' rule and LOTP:

$$P(A|B, C) = \frac{P(A \cap B | C)}{P(B|C)} = \frac{P(B|A,C)P(A|C)}{P(B|C)}$$

$$P(B|C) = \sum_{i=1}^n P(B | A_i, C) P(A_i | C) = \sum_{i=1}^n P(B \cap A_i | C)$$
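Both identities can be verified numerically on a small example. The sketch below again assumes two fair dice; the events A, B, C, the partition by the value of the first die, and the helpers `prob` and `both` are all choices made for this illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
omega = list(product(range(1, 7), repeat=2))

def prob(event, given=lambda w: True):
    """P(event | given) by counting equally likely outcomes; assumes P(given) > 0."""
    kept = [w for w in omega if given(w)]
    return Fraction(sum(1 for w in kept if event(w)), len(kept))

def both(e, f):
    """The intersection of two events, as a predicate."""
    return lambda w: e(w) and f(w)

A = lambda w: w[0] >= 4          # the first die is at least 4
B = lambda w: w[0] + w[1] >= 8   # the sum is at least 8
C = lambda w: w[1] % 2 == 0      # the second die is even (the extra information)

# Bayes' rule with extra conditioning:
# P(A | B, C) = P(B | A, C) P(A | C) / P(B | C)
lhs = prob(A, given=both(B, C))
bayes_rhs = prob(B, given=both(A, C)) * prob(A, given=C) / prob(B, given=C)

# LOTP with extra conditioning, partitioning on the first die's value:
# P(B | C) = sum over k of P(B | A_k, C) P(A_k | C)
partition = [lambda w, k=k: w[0] == k for k in range(1, 7)]
lotp_rhs = sum(prob(B, given=both(Ak, C)) * prob(Ak, given=C) for Ak in partition)

print(lhs == bayes_rhs, prob(B, given=C) == lotp_rhs)
```

Every probability in the two formulas is computed inside the world where C has occurred, which is the sense in which conditioning on C carries through unchanged.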

Disjoint vs. Independent

Disjoint, or mutually exclusive, events are events that cannot occur simultaneously. That is, observing event A precludes the possibility of also observing event B. We can state this equivalently as $P(A \cap B) = 0$.

Independent events are events such that observing event B yields no information about the possibility of also observing event A. That is, conditioning on observing event B, the probability of observing event A is unchanged:

$$P(A|B) = P(A)$$

Substituting this result into $P(A \cap B) = P(A|B)P(B)$ quickly gives an alternative definition of independence:

$$P(A \cap B) = P(A)P(B)$$
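Disjoint and independent are easy to confuse, so a quick check may help. The sketch below assumes two fair dice and tests both properties by counting; the event names and the helpers `is_disjoint` and `is_independent` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
omega = list(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) under equally likely outcomes."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

def is_disjoint(a, b):
    """Disjoint: P(A ∩ B) = 0."""
    return prob(lambda w: a(w) and b(w)) == 0

def is_independent(a, b):
    """Independent: P(A ∩ B) = P(A) P(B)."""
    return prob(lambda w: a(w) and b(w)) == prob(a) * prob(b)

first_even = lambda w: w[0] % 2 == 0
second_even = lambda w: w[1] % 2 == 0
sum_is_2 = lambda w: sum(w) == 2
sum_is_12 = lambda w: sum(w) == 12

# Independent but not disjoint: the two dice do not influence each other.
print(is_independent(first_even, second_even), is_disjoint(first_even, second_even))
# prints: True False

# Disjoint but not independent: a sum of 2 rules out a sum of 12.
print(is_independent(sum_is_2, sum_is_12), is_disjoint(sum_is_2, sum_is_12))
# prints: False True
```

In general, two disjoint events that each have positive probability cannot be independent, since learning that one occurred tells us the other did not.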

Another form of independence is conditional independence. Two events A and B are said to be conditionally independent given C if $P(A \cap B | C) = P(A|C) P(B|C)$.

However, neither notion implies the other: unconditional independence of A and B does not imply conditional independence given C, and conditional independence given C does not imply unconditional independence.
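Both directions of this claim can be checked with small counterexamples. The two setups below are assumptions chosen for illustration: (1) two fair coin flips, with C the event that the flips match, and (2) a coin picked at random from a fair coin and a hypothetical biased coin with heads probability 9/10, then flipped twice, with C the event that the biased coin was picked.

```python
from fractions import Fraction
from itertools import product

def prob(space, event, given=lambda w: True):
    """P(event | given) on a finite space {outcome: probability}; assumes P(given) > 0."""
    denom = sum(p for w, p in space.items() if given(w))
    numer = sum(p for w, p in space.items() if given(w) and event(w))
    return numer / denom

def independent(space, a, b, given=lambda w: True):
    """Check P(A ∩ B | given) == P(A | given) P(B | given)."""
    joint = prob(space, lambda w: a(w) and b(w), given)
    return joint == prob(space, a, given) * prob(space, b, given)

# Example 1: two fair flips; an outcome is (flip1, flip2).
two_flips = {w: Fraction(1, 4) for w in product("HT", repeat=2)}
A1 = lambda w: w[0] == "H"     # first flip is heads
B1 = lambda w: w[1] == "H"     # second flip is heads
C1 = lambda w: w[0] == w[1]    # the two flips match

# Example 2: pick a coin uniformly, flip it twice; an outcome is (coin, flip1, flip2).
heads_prob = {"fair": Fraction(1, 2), "biased": Fraction(9, 10)}  # hypothetical bias

def flip_p(coin, face):
    """Probability that the given coin shows the given face."""
    return heads_prob[coin] if face == "H" else 1 - heads_prob[coin]

mystery_coin = {
    (coin, f1, f2): Fraction(1, 2) * flip_p(coin, f1) * flip_p(coin, f2)
    for coin in heads_prob for f1 in "HT" for f2 in "HT"
}
A2 = lambda w: w[1] == "H"         # first flip is heads
B2 = lambda w: w[2] == "H"         # second flip is heads
C2 = lambda w: w[0] == "biased"    # the biased coin was picked

# Independent, but not conditionally independent given C1.
print(independent(two_flips, A1, B1), independent(two_flips, A1, B1, given=C1))
# prints: True False

# Conditionally independent given C2, but not independent.
print(independent(mystery_coin, A2, B2, given=C2), independent(mystery_coin, A2, B2))
# prints: True False
```

In the second example, seeing heads on the first flip makes the biased coin more likely, which in turn raises the probability of heads on the second flip; once the coin is known, the flips carry no further information about each other.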