Inequalities and Markov Chains (BH Chapter 10 & 11)
Statistical Inequalities
Markov Chains
What makes such a sequence of random variables a Markov Chain is the Markov Property, which says that to predict where the chain will be at a future time, you only need the present state, not any past information. In other words, given the present, the future and the past are conditionally independent.
Mathematically, this says:
$P(X_{n+1} = j \mid X_0 = i_0, X_1 = i_1, \dots, X_n = i_n) = P(X_{n+1} = j \mid X_n = i_n)$
State Properties
A state is either recurrent or transient.
If you start at a Recurrent State, then you will always return to that state at some point in the future. You can leave, but you will come back with probability 1.
Otherwise you are at a Transient State. There is some positive probability that once you leave you will never return.
A state is either periodic or aperiodic.
If you start at a Periodic State of period k, then the GCD of all the possible numbers of steps it could take to return is k (where k > 1).
Otherwise you are at an Aperiodic State. The GCD of all the possible numbers of steps it could take to return is 1.
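The GCD definition of the period can be checked numerically on small chains. The following is a minimal sketch, not a general-purpose algorithm: the function name period, the search horizon max_steps, and the two example matrices are all assumptions made for illustration.

```python
from math import gcd

def period(Q, i, max_steps=None):
    """GCD of every n <= max_steps for which a return to state i in n steps is possible.

    Returns 0 if no return is seen within the horizon.
    """
    M = len(Q)
    if max_steps is None:
        max_steps = 2 * M * M  # assumed horizon, adequate for small examples
    # reach[j] is True if state j can be reached from i in exactly n steps
    reach = [j == i for j in range(M)]
    g = 0
    for n in range(1, max_steps + 1):
        reach = [any(reach[k] and Q[k][j] > 0 for k in range(M)) for j in range(M)]
        if reach[i]:
            g = gcd(g, n)
    return g

# Deterministic 3-cycle 0 -> 1 -> 2 -> 0: returns happen only at multiples of 3.
cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
# Same cycle, but state 0 may also stay put, so a return in 1 step is possible.
lazy = [[0.5, 0.5, 0], [0, 0, 1], [1, 0, 0]]

print(period(cycle, 0))  # periodic state
print(period(lazy, 0))   # aperiodic state
```

Only the pattern of nonzero entries matters here, which is why the sketch tracks reachability rather than probabilities.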
Transition Matrix
Chain Properties
A chain is irreducible if you can get from any state to any other state in a finite number of steps. An irreducible chain (on a finite state space) must have all of its states recurrent. A chain is periodic if any of its states are periodic, and is aperiodic if none of its states are periodic. In an irreducible chain, all states have the same period.
Stationary Distribution
If a Markov Chain is irreducible, then it has a unique stationary distribution. In addition, all entries of this stationary distribution are non-zero (which could have been inferred from the fact that all states are recurrent).
A Markov Chain is a walk along a discrete state space $\{1, 2, \dots, M\}$. We let $X_t$ denote which element of the state space the walk is on at time t. The Markov Chain is the set of random variables denoting where the walk is at all points in time, $\{X_0, X_1, X_2, \dots\}$. Each $X_t$ takes on values in the state space, so if $X_1 = 3$, then at time 1 we are at state 3.
In words: Given that my history of states has been $i_0, i_1, \dots, i_n$, the distribution of where my next state will be doesn't depend on any of that history besides $i_n$, the most recent state.
Element $q_{ij}$ in the square transition matrix Q is the probability that the chain goes from state i to state j in one step, or more formally:
$q_{ij} = P(X_{n+1} = j \mid X_n = i)$
To find the probability that the chain goes from state i to state j in m steps, take the (i, j)-th element of $Q^m$.
$q_{ij}^{(m)} = P(X_{n+m} = j \mid X_n = i)$
If $X_0$ is distributed according to the row-vector PMF p (i.e. $p_j = P(X_0 = j)$), then the marginal PMF of $X_n$ is $pQ^n$.
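As a concrete check of the m-step formula and the marginal PMF $pQ^n$, here is a minimal sketch in plain Python. The two-state matrix Q, the starting PMF p, and the helper names mat_mul and mat_pow are hypothetical, chosen only for illustration. Exact fractions are used so the results can be compared by hand.

```python
from fractions import Fraction as F

def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def mat_pow(Q, m):
    """Q multiplied by itself m times (m = 0 gives the identity)."""
    size = len(Q)
    R = [[F(int(i == j)) for j in range(size)] for i in range(size)]
    for _ in range(m):
        R = mat_mul(R, Q)
    return R

# Hypothetical two-state chain; each row sums to 1.
Q = [[F(1, 2), F(1, 2)],
     [F(1, 4), F(3, 4)]]

Q2 = mat_pow(Q, 2)       # two-step transition probabilities
p = [[F(1), F(0)]]       # start in the first state with probability 1
p2 = mat_mul(p, Q2)[0]   # marginal PMF of X_2

print(Q2)
print(p2)
```

Since the chain starts in the first state, the PMF of $X_2$ is just the first row of $Q^2$.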
A chain is reversible with respect to s if $s_i q_{ij} = s_j q_{ji}$ for all i, j. A reversible chain started from s is indistinguishable whether it is running forwards in time or backwards in time. Examples of reversible chains include random walks on undirected networks, and any chain with $q_{ij} = q_{ji}$, for which the uniform distribution $s = (1/M, 1/M, \dots, 1/M)$ is stationary.
Reversibility Condition Implies Stationarity - If you have a PMF s on a Markov chain with transition matrix Q, then $s_i q_{ij} = s_j q_{ji}$ for all i, j implies that s is stationary.
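The reason reversibility implies stationarity is a one-line computation. Fix a state j, sum the reversibility condition over i, and use the fact that each row of Q sums to 1:

```latex
\sum_{i} s_i q_{ij} = \sum_{i} s_j q_{ji} = s_j \sum_{i} q_{ji} = s_j
```

The left-hand side is the j-th entry of $sQ$, so $sQ = s$.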
Let $s = (s_1, s_2, \dots, s_M)$ be a valid PMF describing where the Markov Chain is at a certain time. We call s a stationary distribution if it satisfies $sQ = s$.
As a consequence, if $X_t$ has the stationary distribution, then all future $X_{t+1}, X_{t+2}, \dots$ also have the stationary distribution.
Example: The Gambler's Ruin chain is not irreducible, and its stationary distribution is not unique: the chain ultimately gets stuck with one's money at 0 forever or at N forever.
If a Markov Chain is irreducible and aperiodic, then it has a unique stationary distribution s and the chain converges to the stationary distribution:
$\lim_{n \to \infty} P(X_n = i) = s_i$
Example: Imagine a Markov chain which is just a cycle, and hence is periodic. Then, if we start at a known state, $P(X_n = i)$ will be either 0 or 1 deterministically, and won't converge to the stationary distribution, which is uniform across all nodes in the cycle.
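The cycle example can be traced step by step. This is a small sketch using a hypothetical deterministic 3-cycle; the helper name step is an assumption made for illustration.

```python
from fractions import Fraction as F

def step(p, Q):
    """One step of the chain: maps the PMF p of X_n to the PMF pQ of X_{n+1}."""
    return [sum(p[i] * Q[i][j] for i in range(len(Q))) for j in range(len(Q))]

# Deterministic cycle 0 -> 1 -> 2 -> 0 (hypothetical example).
cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]

p = [1, 0, 0]            # start at state 0 with probability 1
history = []
for n in range(6):
    history.append(p)    # history[n] is the PMF of X_n
    p = step(p, cycle)

for n, pmf in enumerate(history):
    print(n, pmf)

# The uniform PMF is stationary, yet the PMFs above never approach it.
uniform = [F(1, 3)] * 3
print(step(uniform, cycle) == uniform)
```

The point mass simply rotates around the cycle, so the sequence of PMFs repeats with period 3 instead of converging.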
If a Markov Chain is irreducible and aperiodic, so that the stationary distribution exists and is unique, then the expected number of steps to return to i starting from i is $1/s_i$, where $s_i$ is the long-run probability of the chain being at state i. To find the stationary distribution, solve $sQ = s$, or equivalently $(Q^T - I)s^T = 0$, together with the constraint that the entries of s sum to 1. The stationary distribution is uniform if the columns of Q sum to 1.
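One way to carry out this calculation is plain Gauss-Jordan elimination on $(Q^T - I)s^T = 0$, with one redundant equation swapped for the normalization that the entries of s sum to 1. The sketch below assumes the solution is unique (for example, an irreducible chain); the function name stationary and the example matrix are hypothetical.

```python
from fractions import Fraction as F

def stationary(Q):
    """Solve sQ = s with sum(s) = 1 exactly. Assumes the solution is unique."""
    M = len(Q)
    # Row i of the augmented system is equation i of (Q^T - I) s^T = 0.
    A = [[F(Q[j][i]) - (1 if i == j else 0) for j in range(M)] + [F(0)]
         for i in range(M)]
    # The M equations are linearly dependent, so replace the last with sum(s) = 1.
    A[M - 1] = [F(1)] * M + [F(1)]
    for col in range(M):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, M) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        # Clear this column in every other row.
        for r in range(M):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    return [A[i][M] for i in range(M)]

# Hypothetical two-state chain.
Q = [[F(1, 2), F(1, 2)],
     [F(1, 4), F(3, 4)]]
s = stationary(Q)
print(s)                    # stationary probabilities
print([1 / si for si in s]) # expected return times 1/s_i
```

For larger chains a numerical linear algebra library would be the usual choice; exact fractions are used here only to keep the arithmetic checkable by hand.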
If you have a collection of nodes with undirected edges between them, and at each step the chain picks one of the edges at its current node uniformly at random and moves along it to the neighboring node, then this is a random walk on an undirected network. The stationary distribution can be easily calculated. Let $d_i$ be the degree of the i-th node, meaning the number of edges connected to this node. Then, we have:
$s_i = \frac{d_i}{\sum_j d_j}$
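For a random walk on an undirected network, the stationary probability of node i is proportional to its degree, $s_i = d_i / \sum_j d_j$. The sketch below checks this on a small hypothetical network with four nodes; the edge list is made up for illustration.

```python
from fractions import Fraction as F

# Hypothetical undirected network on nodes 0..3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
M = 4

# Symmetric adjacency matrix and node degrees.
adj = [[0] * M for _ in range(M)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1
deg = [sum(row) for row in adj]

# From node i, each incident edge is chosen with probability 1/deg[i].
Q = [[F(adj[i][j], deg[i]) for j in range(M)] for i in range(M)]

# Candidate stationary distribution: proportional to degree.
s = [F(d, sum(deg)) for d in deg]

# Check stationarity (sQ = s) and reversibility (s_i q_ij = s_j q_ji).
sQ = [sum(s[i] * Q[i][j] for i in range(M)) for j in range(M)]
reversible = all(s[i] * Q[i][j] == s[j] * Q[j][i]
                 for i in range(M) for j in range(M))

print(s)
print(sQ == s, reversible)
```

Reversibility holds because $s_i q_{ij}$ equals 1 over the total degree whenever i and j share an edge, and 0 otherwise, which is symmetric in i and j.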