Conditional Independence and Its Properties

Introduction

Instead of discussing independence between events, we will focus on the generalization of independence to sets of random variables.

Let’s start with key definitions.

Definitions

Def (Conditional Independence): \(X \perp Y \mid Z \Leftrightarrow P(X,Y \mid Z) = P(X\mid Z) P(Y \mid Z)\)

Note: We use the \(\perp\) symbol instead of \(\perp \!\!\! \perp\) just for simplicity.
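
To make the definition concrete, here is a minimal numeric sketch (assuming NumPy, binary variables, and an arbitrary seed; the variable names are my own): we build a joint distribution that satisfies \(X \perp Y \mid Z\) by construction and verify the factorization in the definition.

```python
# A minimal numeric sketch of the definition, assuming binary variables
# and a joint built so that X and Y are independent given Z by construction.
import numpy as np

rng = np.random.default_rng(0)
p_z = rng.dirichlet(np.ones(2))                # P(Z)
p_x_given_z = rng.dirichlet(np.ones(2), 2)     # rows: P(X | Z=z)
p_y_given_z = rng.dirichlet(np.ones(2), 2)     # rows: P(Y | Z=z)

# joint[x, y, z] = P(Z=z) P(X=x | Z=z) P(Y=y | Z=z)
joint = np.einsum('z,zx,zy->xyz', p_z, p_x_given_z, p_y_given_z)

p_xy_given_z = joint / joint.sum(axis=(0, 1), keepdims=True)  # P(X, Y | Z)
p_x_given_z_hat = p_xy_given_z.sum(axis=1, keepdims=True)     # P(X | Z)
p_y_given_z_hat = p_xy_given_z.sum(axis=0, keepdims=True)     # P(Y | Z)

# Definition: P(X, Y | Z) = P(X | Z) P(Y | Z)
assert np.allclose(p_xy_given_z, p_x_given_z_hat * p_y_given_z_hat)
```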

Proposition: If \((X \perp Y \mid Z)\), then \(P(X \mid Y, Z) = P(X \mid Z)\).

Proof: By definition of Conditional Independence we have:

\[P(X,Y \mid Z) = P(X\mid Z) P(Y \mid Z)\]

On the other hand, we may use the chain rule:

\[P(X,Y \mid Z) = P(X\mid Y,Z) P(Y \mid Z)\]

Comparing both right-hand sides (and dividing by \(P(Y \mid Z)\), assumed positive), we see that \(P(X \mid Y, Z) = P(X \mid Z)\) #
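
A quick numeric check of the proposition, in the same hedged style as above (NumPy, binary variables, a joint with \(X \perp Y \mid Z\) built in): conditioning on \(Y\) in addition to \(Z\) should not change the conditional distribution of \(X\).

```python
# Numeric check of the proposition: if X is independent of Y given Z,
# then P(X | Y, Z) should equal P(X | Z) for every value of Y.
import numpy as np

rng = np.random.default_rng(1)
p_z = rng.dirichlet(np.ones(2))
p_x_given_z = rng.dirichlet(np.ones(2), 2)
p_y_given_z = rng.dirichlet(np.ones(2), 2)
joint = np.einsum('z,zx,zy->xyz', p_z, p_x_given_z, p_y_given_z)  # X indep. of Y given Z

p_yz = joint.sum(axis=0, keepdims=True)                    # P(Y, Z)
p_x_given_yz = joint / p_yz                                # P(X | Y, Z)
p_xz = joint.sum(axis=1)                                   # P(X, Z)
p_x_given_z_only = p_xz / p_xz.sum(axis=0, keepdims=True)  # P(X | Z)

# P(X | Y, Z) equals P(X | Z) for every value of Y
assert np.allclose(p_x_given_yz, p_x_given_z_only[:, None, :])
```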

Properties

1. Symmetry: \((X \perp Y \mid Z) \Rightarrow (Y \perp X \mid Z)\) (immediate from the definition, since the product \(P(X\mid Z) P(Y \mid Z)\) is symmetric in \(X\) and \(Y\))

2. Decomposition:

\((X \perp Y, W \mid Z) \Rightarrow (X \perp Y \mid Z)\) and, by the same argument, \((X \perp Y, W \mid Z) \Rightarrow (X \perp W \mid Z)\)

Proof

\((X \perp Y, W \mid Z) \Rightarrow \\ \Rightarrow P(X, Y, W \mid Z) =P(X \mid Z)\cdot P(Y,W \mid Z)\Leftrightarrow \\ \Leftrightarrow \sum\limits_{W}P(X, Y, W \mid Z) = \sum\limits_{W} P(X \mid Z)\cdot P(Y,W \mid Z)\Leftrightarrow \\ \Leftrightarrow P(X, Y\mid Z) = P(X \mid Z)\sum\limits_{W} P(Y,W \mid Z) \Leftrightarrow \\ \Leftrightarrow P(X, Y\mid Z) = P(X \mid Z)P(Y \mid Z) \Rightarrow \\ \Rightarrow (X \perp Y \mid Z)\) #
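
The marginalization step above can also be checked numerically; a sketch under the same assumptions as before (NumPy, binary variables, a joint built so that \(X \perp Y, W \mid Z\) holds by construction):

```python
# Decomposition: build a joint with X independent of (Y, W) given Z,
# marginalize out W, and confirm that X is independent of Y given Z.
import numpy as np

rng = np.random.default_rng(2)
p_z = rng.dirichlet(np.ones(2))
p_x_given_z = rng.dirichlet(np.ones(2), 2)                    # P(X | Z)
p_yw_given_z = rng.dirichlet(np.ones(4), 2).reshape(2, 2, 2)  # P(Y, W | Z)

# joint[x, y, w, z] = P(Z) P(X | Z) P(Y, W | Z)
joint = np.einsum('z,zx,zyw->xywz', p_z, p_x_given_z, p_yw_given_z)

p_xyz = joint.sum(axis=2)                                   # sum over W -> P(X, Y, Z)
p_xy_given_z = p_xyz / p_xyz.sum(axis=(0, 1), keepdims=True)
p_x = p_xy_given_z.sum(axis=1, keepdims=True)               # P(X | Z)
p_y = p_xy_given_z.sum(axis=0, keepdims=True)               # P(Y | Z)
assert np.allclose(p_xy_given_z, p_x * p_y)                 # X indep. of Y given Z
```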

3. Weak union

\((X \perp Y, W \mid Z) \Rightarrow(X \perp Y \mid Z, W)\)

Proof

Using the chain rule (1), the proposition applied to the premise \((X \perp Y, W \mid Z)\) (2), and the decomposition property together with the proposition, which give \(P(X \mid Z) = P(X \mid W, Z)\) (3), we may write:

\(P(X,Y \mid W, Z) \stackrel{(1)}{=}P(X \mid W, Z, Y) P(Y \mid W, Z) \stackrel{(2)}{=} \\ \stackrel{(2)}{=} P(X \mid Z) P(Y \mid W, Z) \stackrel{(3)}{=} P(X \mid W, Z) P(Y \mid W, Z)\) #
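
Again a hedged numeric check (NumPy, binary variables, same construction as in the decomposition sketch): with \(X \perp Y, W \mid Z\) built in, \(P(X \mid Y, W, Z)\) should coincide with \(P(X \mid W, Z)\).

```python
# Weak union: with X independent of (Y, W) given Z by construction,
# check that X is independent of Y given (Z, W).
import numpy as np

rng = np.random.default_rng(3)
p_z = rng.dirichlet(np.ones(2))
p_x_given_z = rng.dirichlet(np.ones(2), 2)
p_yw_given_z = rng.dirichlet(np.ones(4), 2).reshape(2, 2, 2)
joint = np.einsum('z,zx,zyw->xywz', p_z, p_x_given_z, p_yw_given_z)  # [x, y, w, z]

p_ywz = joint.sum(axis=0, keepdims=True)                 # P(Y, W, Z)
p_x_given_ywz = joint / p_ywz                            # P(X | Y, W, Z)
p_xwz = joint.sum(axis=1, keepdims=True)                 # P(X, W, Z)
p_x_given_wz = p_xwz / p_xwz.sum(axis=0, keepdims=True)  # P(X | W, Z)

assert np.allclose(p_x_given_ywz, p_x_given_wz)  # holds for every value of Y
```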

4. Contraction

\((X \perp W \mid Z, Y) \;\&\; (X \perp Y \mid Z) \Rightarrow (X \perp Y, W \mid Z)\)

Proof:

\((X \perp W \mid Z, Y) \Rightarrow \\ \Rightarrow P(X, W\mid Z, Y) = P(X\mid Z, Y) \cdot P(W\mid Z,Y)\Leftrightarrow \\ \Leftrightarrow \frac{P(X,W,Y \mid Z)}{P(Y \mid Z)}=\frac{P(X,Y \mid Z)}{P(Y \mid Z)}\frac{P(W,Y \mid Z)}{P(Y \mid Z)}\Leftrightarrow \\ \Leftrightarrow \frac{P(X,W,Y \mid Z)}{P(Y \mid Z)}=\frac{P(X \mid Z)P(Y \mid Z)}{P(Y \mid Z)}\frac{P(W,Y \mid Z)}{P(Y \mid Z)}\Leftrightarrow \\ \Leftrightarrow \frac{P(X,W,Y \mid Z)}{P(Y \mid Z)}=P(X \mid Z)\frac{P(W,Y \mid Z)}{P(Y \mid Z)}\Leftrightarrow \\ \Leftrightarrow P(X,W,Y \mid Z)=P(X \mid Z)P(W,Y \mid Z)\Rightarrow \\ \Rightarrow X \perp W,Y \mid Z\) #
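
A numeric sketch of contraction (assuming NumPy and binary variables): a joint of the form \(P(Z)\,P(X \mid Z)\,P(Y \mid Z)\,P(W \mid Y, Z)\) satisfies both premises by construction, and the conclusion \((X \perp Y, W \mid Z)\) can be verified directly.

```python
# Contraction: joint = P(z) P(x|z) P(y|z) P(w|y,z) satisfies both
# premises by construction; verify the conclusion X indep. of (Y, W) given Z.
import numpy as np

rng = np.random.default_rng(4)
p_z = rng.dirichlet(np.ones(2))
p_x_given_z = rng.dirichlet(np.ones(2), 2)        # P(X | Z)
p_y_given_z = rng.dirichlet(np.ones(2), 2)        # P(Y | Z)
p_w_given_yz = rng.dirichlet(np.ones(2), (2, 2))  # P(W | Y, Z), indexed [y, z, w]

# joint[x, y, w, z] = P(z) P(x|z) P(y|z) P(w|y,z)
joint = np.einsum('z,zx,zy,yzw->xywz', p_z, p_x_given_z, p_y_given_z, p_w_given_yz)

# Conclusion: P(X, Y, W | Z) = P(X | Z) P(Y, W | Z)
p_given_z = joint / joint.sum(axis=(0, 1, 2), keepdims=True)  # P(X, Y, W | Z)
p_x = p_given_z.sum(axis=(1, 2), keepdims=True)               # P(X | Z)
p_yw = p_given_z.sum(axis=0, keepdims=True)                   # P(Y, W | Z)
assert np.allclose(p_given_z, p_x * p_yw)
```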

5. Intersection

\((X \perp Y \mid Z, W) \;\&\; (X \perp W \mid Z, Y) \Rightarrow (X \perp Y, W \mid Z)\)

Proof:

Assuming the distribution is strictly positive (without positivity the intersection property can fail), we may use both premises together with the proposition: \(P(X \mid Z, W) = P(X \mid Z, W, Y) = P(X \mid Z, Y)\Leftrightarrow \\ \Leftrightarrow \frac{P(X, W \mid Z)}{P(W \mid Z)} = \frac{P(X, Y \mid Z)}{P(Y \mid Z)}\Leftrightarrow \\ \Leftrightarrow P(X, W \mid Z)P(Y \mid Z) = P(X, Y \mid Z)P(W \mid Z)\)

Marginalizing both sides over \(W\) gives us \(P(X \mid Z) P(Y \mid Z) = P(X, Y \mid Z)\), i.e. \((X \perp Y \mid Z)\).

Applying the contraction property (see above) to \((X \perp W \mid Z, Y)\) and \((X \perp Y \mid Z)\), we get \((X \perp Y, W \mid Z)\) #
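
The positivity assumption is essential here, as the divisions in the proof suggest. A minimal counterexample sketch (assuming NumPy), with a degenerate distribution where \(X = Y = W\) is a single fair coin and \(Z\) is constant:

```python
# Intersection needs a strictly positive distribution: a classic
# degenerate counterexample with X = Y = W (one fair coin), Z constant.
import numpy as np

joint = np.zeros((2, 2, 2))          # axes: x, y, w  (Z constant, omitted)
joint[0, 0, 0] = joint[1, 1, 1] = 0.5

# X indep. of Y given W, and X indep. of W given Y, both hold:
# conditioning on either copy pins X down completely.
# But X indep. of (Y, W) fails:
p_x = joint.sum(axis=(1, 2))         # P(X)
p_yw = joint.sum(axis=0)             # P(Y, W)
print(np.allclose(joint, p_x[:, None, None] * p_yw[None, :, :]))  # False
```

Here both premises hold while the conclusion does not, so positivity cannot be dropped.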

Conclusion

The proofs of contraction and intersection were the hardest for me; the solutions were found in [3].

References

  1. D. Koller and N. Friedman, Probabilistic graphical models: principles and techniques. Cambridge, MA: MIT Press, 2009.
  2. Probabilistic Graphical Models 1: Representation, Conditional Independence
  3. Proofs for conditional independence properties

Last update: 09 February 2021