Conditional Independence and its properties
Written on February 8th, 2021 by Sergei Semenov
Introduction
Instead of discussing independence between individual events, we will focus on the generalization of independence to sets of random variables.
Let’s start with key definitions.
Definitions
Def (Conditional Independence): \(X \perp Y \mid Z \Leftrightarrow P(X,Y \mid Z) = P(X\mid Z) P(Y \mid Z)\)
Note: we use the \(\perp\) symbol instead of \(\perp \!\!\! \perp\) purely for simplicity.
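To make the definition concrete, here is a minimal numerical sketch (not from the original post): binary variables with the joint \(P(X, Y, Z)\) stored as a NumPy array indexed \([x, y, z]\). The names joint and cond_indep are illustrative, and the toy joint is built so that the definition holds by construction.

```python
import numpy as np

def cond_indep(joint, tol=1e-12):
    """Check X ⊥ Y | Z for a joint P(X, Y, Z) stored as an array [x, y, z]."""
    p_z = joint.sum(axis=(0, 1))               # P(Z)
    p_xy_given_z = joint / p_z                 # P(X, Y | Z)
    p_x_given_z = joint.sum(axis=1) / p_z      # P(X | Z), indexed [x, z]
    p_y_given_z = joint.sum(axis=0) / p_z      # P(Y | Z), indexed [y, z]
    # The definition: P(X, Y | Z) must equal the product P(X | Z) P(Y | Z).
    product = p_x_given_z[:, None, :] * p_y_given_z[None, :, :]
    return np.allclose(p_xy_given_z, product, atol=tol)

# Toy joint with X ⊥ Y | Z by construction: P(x, y, z) = P(z) P(x|z) P(y|z).
p_z = np.array([0.3, 0.7])
p_x_given_z = np.array([[0.2, 0.9], [0.8, 0.1]])   # indexed [x, z]
p_y_given_z = np.array([[0.6, 0.4], [0.4, 0.6]])   # indexed [y, z]
joint = p_x_given_z[:, None, :] * p_y_given_z[None, :, :] * p_z
print(cond_indep(joint))  # True
```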
Proposition: if \((X \perp Y \mid Z)\), then \(P(X \mid Y, Z) = P(X \mid Z)\)
Proof: By the definition of conditional independence, we have:
\[P(X,Y \mid Z) = P(X\mid Z) P(Y \mid Z)\]
On the other hand, we may use the chain rule:
\[P(X,Y \mid Z) = P(X\mid Y,Z) P(Y \mid Z)\]
Comparing the two right-hand sides, it is clear that \(P(X \mid Y, Z) = P(X \mid Z)\) #
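The proposition can be checked on the same toy joint (a sketch under the same illustrative setup as above): if \(X \perp Y \mid Z\), then \(P(X \mid Y, Z)\) must not depend on \(y\), i.e., every \(y\)-slice equals \(P(X \mid Z)\).

```python
import numpy as np

# Same toy joint as above: P(x, y, z) = P(z) P(x|z) P(y|z), so X ⊥ Y | Z.
p_z = np.array([0.3, 0.7])
p_x_given_z = np.array([[0.2, 0.9], [0.8, 0.1]])   # indexed [x, z]
p_y_given_z = np.array([[0.6, 0.4], [0.4, 0.6]])   # indexed [y, z]
joint = p_x_given_z[:, None, :] * p_y_given_z[None, :, :] * p_z

p_yz = joint.sum(axis=0)                   # P(Y, Z), indexed [y, z]
p_x_given_yz = joint / p_yz                # P(X | Y, Z), indexed [x, y, z]
p_x_given_z_chk = joint.sum(axis=1) / p_z  # P(X | Z), indexed [x, z]
# Every y-slice of P(X | Y, Z) should match P(X | Z).
print(np.allclose(p_x_given_yz, p_x_given_z_chk[:, None, :]))  # True
```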
Properties
1. Symmetry: \((X \perp Y \mid Z) \Rightarrow (Y \perp X \mid Z)\)
2. Decomposition:
\((X \perp Y, W \mid Z) \Rightarrow (X \perp Y \mid Z)\) and, symmetrically, \((X \perp Y, W \mid Z) \Rightarrow (X \perp W \mid Z)\)
Proof
\((X \perp Y, W \mid Z) \Rightarrow \\ \Rightarrow P(X, Y, W \mid Z) =P(X \mid Z)\cdot P(Y,W \mid Z)\Leftrightarrow \\ \Leftrightarrow \sum\limits_{W}P(X, Y, W \mid Z) = \sum\limits_{W} P(X \mid Z)\cdot P(Y,W \mid Z)\Leftrightarrow \\ \Leftrightarrow P(X, Y\mid Z) = P(X \mid Z)\sum\limits_{W} P(Y,W \mid Z) \Leftrightarrow \\ \Leftrightarrow P(X, Y\mid Z) = P(X \mid Z)P(Y \mid Z) \Rightarrow \\ \Rightarrow (X \perp Y \mid Z)\) #
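A numerical sketch of decomposition (illustrative setup, axes ordered \([x, y, w, z]\)): build a random joint with \(X \perp Y, W \mid Z\) by construction, sum out \(W\) exactly as in the proof, and confirm \(X \perp Y \mid Z\).

```python
import numpy as np

rng = np.random.default_rng(0)
p_z = np.array([0.4, 0.6])
p_x_given_z = rng.dirichlet(np.ones(2), size=2).T                    # [x, z]
p_yw_given_z = rng.dirichlet(np.ones(4), size=2).T.reshape(2, 2, 2)  # [y, w, z]
# P(x, y, w, z) = P(z) P(x|z) P(y, w|z), so X ⊥ (Y, W) | Z by construction.
joint = p_x_given_z[:, None, None, :] * p_yw_given_z[None, ...] * p_z

p_xyz = joint.sum(axis=2)                   # sum out W, as in the proof
p_xy_given_z = p_xyz / p_z                  # P(X, Y | Z)
p_x_given_z_chk = p_xyz.sum(axis=1) / p_z   # P(X | Z)
p_y_given_z_chk = p_xyz.sum(axis=0) / p_z   # P(Y | Z)
print(np.allclose(p_xy_given_z,
                  p_x_given_z_chk[:, None, :] * p_y_given_z_chk[None, :, :]))  # True
```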
3. Weak union
\((X \perp Y, W \mid Z) \Rightarrow(X \perp Y \mid Z, W)\)
Proof
Using the chain rule (1), the proposition applied to the assumption \((X \perp Y, W \mid Z)\) (2), and the decomposition property together with the proposition, which give \(P(X \mid Z) = P(X \mid W, Z)\) (3), we may write:
\(P(X,Y \mid W, Z) \stackrel{(1)}{=}P(X \mid W, Z, Y) P(Y \mid W, Z) \stackrel{(2)}{=} \\ \stackrel{(2)}{=} P(X \mid Z) P(Y \mid W, Z) \stackrel{(3)}{=} P(X \mid W, Z) P(Y \mid W, Z)\) #
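Weak union can be verified on the same construction (again illustrative): with \(X \perp Y, W \mid Z\) built in, \(P(X, Y \mid W, Z)\) should factor into \(P(X \mid W, Z) P(Y \mid W, Z)\).

```python
import numpy as np

rng = np.random.default_rng(0)
p_z = np.array([0.4, 0.6])
p_x_given_z = rng.dirichlet(np.ones(2), size=2).T                      # [x, z]
p_yw_given_z = rng.dirichlet(np.ones(4), size=2).T.reshape(2, 2, 2)    # [y, w, z]
joint = p_x_given_z[:, None, None, :] * p_yw_given_z[None, ...] * p_z  # [x, y, w, z]

p_wz = joint.sum(axis=(0, 1))            # P(W, Z), indexed [w, z]
p_xy_given_wz = joint / p_wz             # P(X, Y | W, Z)
p_x_given_wz = joint.sum(axis=1) / p_wz  # P(X | W, Z), indexed [x, w, z]
p_y_given_wz = joint.sum(axis=0) / p_wz  # P(Y | W, Z), indexed [y, w, z]
print(np.allclose(p_xy_given_wz,
                  p_x_given_wz[:, None, :, :] * p_y_given_wz[None, ...]))  # True
```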
4. Contraction
\((X \perp W \mid Z, Y) \&(X \perp Y \mid Z) \Rightarrow(X \perp Y, W \mid Z)\)
Proof (the second premise, \((X \perp Y \mid Z)\), is used below to replace \(P(X, Y \mid Z)\) with \(P(X \mid Z)P(Y \mid Z)\)):
\((X \perp W \mid Z, Y) \Rightarrow \\ \Rightarrow P(X, W\mid Z, Y) = P(X\mid Z, Y) \cdot P(W\mid Z,Y)\Leftrightarrow \\ \Leftrightarrow \frac{P(X,W,Y \mid Z)}{P(Y \mid Z)}=\frac{P(X,Y \mid Z)}{P(Y \mid Z)}\frac{P(W,Y \mid Z)}{P(Y \mid Z)}\Leftrightarrow \\ \Leftrightarrow \frac{P(X,W,Y \mid Z)}{P(Y \mid Z)}=\frac{P(X \mid Z)P(Y \mid Z)}{P(Y \mid Z)}\frac{P(W,Y \mid Z)}{P(Y \mid Z)}\Leftrightarrow \\ \Leftrightarrow \frac{P(X,W,Y \mid Z)}{P(Y \mid Z)}=P(X \mid Z)\frac{P(W,Y \mid Z)}{P(Y \mid Z)}\Leftrightarrow \\ \Leftrightarrow P(X,W,Y \mid Z)=P(X \mid Z)P(W,Y \mid Z)\Rightarrow \\ \Rightarrow X \perp W,Y \mid Z\) #
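A sketch of contraction on a random joint (illustrative setup): \(P(x, y, w, z) = P(z)P(x \mid z)P(y \mid z)P(w \mid y, z)\) satisfies both premises by construction, and the conclusion \(X \perp Y, W \mid Z\) can be checked directly.

```python
import numpy as np

rng = np.random.default_rng(1)
p_z = np.array([0.5, 0.5])
p_x_given_z = rng.dirichlet(np.ones(2), size=2).T                         # [x, z]
p_y_given_z = rng.dirichlet(np.ones(2), size=2).T                         # [y, z]
p_w_given_yz = rng.dirichlet(np.ones(2), size=(2, 2)).transpose(0, 2, 1)  # [y, w, z]
# P(x, y, w, z) = P(z) P(x|z) P(y|z) P(w|y,z): both premises hold by design.
joint = (p_x_given_z[:, None, None, :]
         * p_y_given_z[None, :, None, :]
         * p_w_given_yz[None, ...]
         * p_z)                                                           # [x, y, w, z]

p_xyw_given_z = joint / p_z                     # P(X, Y, W | Z)
p_x_given_z_chk = joint.sum(axis=(1, 2)) / p_z  # P(X | Z)
p_yw_given_z = joint.sum(axis=0) / p_z          # P(Y, W | Z)
print(np.allclose(p_xyw_given_z,
                  p_x_given_z_chk[:, None, None, :] * p_yw_given_z[None, ...]))  # True
```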
5. Intersection
\((X \perp Y \mid Z, W) \&(X \perp W \mid Z, Y) \Longrightarrow(X \perp Y, W \mid Z)\)
Note: unlike the four properties above, intersection holds only for strictly positive distributions; the proof below divides by conditional probabilities, which must be nonzero.
Proof:
Using both premises together with the proposition, we may write: \(P(X \mid Z, W) = P(X \mid Z, W, Y) = P(X \mid Z, Y)\Leftrightarrow \\ \Leftrightarrow \frac{P(X, W \mid Z)}{P(W \mid Z)} = \frac{P(X, Y \mid Z)}{P(Y \mid Z)}\Leftrightarrow \\ \Leftrightarrow P(X, W \mid Z)P(Y \mid Z) = P(X, Y \mid Z)P(W \mid Z)\)
Marginalizing over \(W\) gives us \(P(X \mid Z) P(Y \mid Z) = P(X, Y \mid Z)\), i.e., \((X \perp Y \mid Z)\).
Applying the contraction property (see above) to \((X \perp W \mid Z, Y)\) and \((X \perp Y \mid Z)\), we get \((X \perp Y, W \mid Z)\) #
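The positivity requirement is not cosmetic. A standard counterexample (sketched below with illustrative names) takes \(X = Y = W\) deterministically, with \(Z\) constant: both premises hold, yet \(X \not\perp Y, W\).

```python
import numpy as np

# Degenerate joint P(x, y, w) with X = Y = W uniform; Z is constant and omitted.
joint = np.zeros((2, 2, 2))
joint[0, 0, 0] = joint[1, 1, 1] = 0.5

def cond_indep(p_abc):
    """Check A ⊥ B | C for an array indexed [a, b, c], skipping P(c) = 0."""
    p_c = p_abc.sum(axis=(0, 1))
    for c in np.flatnonzero(p_c):
        p_ab = p_abc[:, :, c] / p_c[c]
        if not np.allclose(p_ab, np.outer(p_ab.sum(axis=1), p_ab.sum(axis=0))):
            return False
    return True

print(cond_indep(joint))                     # X ⊥ Y | W: True (first premise)
print(cond_indep(joint.transpose(0, 2, 1)))  # X ⊥ W | Y: True (second premise)
# The conclusion X ⊥ (Y, W) fails: the joint over (X, (Y, W)) is not a product.
p_x_yw = joint.reshape(2, 4)
print(np.allclose(p_x_yw,
                  np.outer(p_x_yw.sum(axis=1), p_x_yw.sum(axis=0))))  # False
```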
Conclusion
The proofs of contraction and intersection were the hardest for me; the solutions were found in [3].
References
1. D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques. Cambridge, MA: MIT Press, 2009.
2. Probabilistic Graphical Models 1: Representation, Conditional Independence
3. Proofs for conditional independence properties
Last update: 09 February 2021