Definition 1.1 A junction tree is a connected undirected graph without cycles (a tree), which has vertex that consists of subset of , and satisfies the property that if , then every vertex on the unique path from to contains .
Example 1.1 Given sets , , , , , we can build a junction tree:
Equally, we could use a different ordering: , , , , .
Theorem 1.1 If is a junction tree, then its vertices satisfy the running intersection property. Conversely, if satisfy the running intersection property, they can be arranged into a junction tree.
Proof. Suppose is a junction tree. Pick a leaf and remove it to obtain . By the induction hypothesis, the remaining sets, satisfy the running intersection property. Pick to be the unique neighbor of in . By the junction tree property, and for any , then Since , it is obvious that
Suppose satisfy the running intersection property, and assume is a junction tree. Let . We know for all . If , then every vertex from to contains . If , we know . Assume for some , then , which is a contradction. Hence for all , and thus is a junction tree.
Corollary 1.2 If satisfy the running intersection property, then they satisfy the running intersection property starting with any node .
Definition 1.2 We associate each node in junction tree with a potential, which is a function over the variables in the corresponding set. We say that two potentials are consistent if they have the same margin over their intersection, i.e., for and , where .
Theorem 1.3 Suppose satisfy the running intersection property with separator sets , and let for all ( and , by convention). Then the potentials are all consistent iff and for all .
Proof. () If , then for any and , i.e., the potentials are all consistent.
() If , there is nothing to prove. Otherwise, let , then Since all potentials are consistent and , we have and thus i.e., only has cliques, then and for all by the induction hypothesis.
By the running intersection property, , and hence by consistency
Then and thus which only depends on , then i.e., .
The steps to forming a junction tree:
Moralize.
Drop directions.
Triangulate (add edges to get a decomposable graph).
2. Message Passing
Definition 2.1 Suppose we have potentials , and , where . Then passing a message from to involves:
(margin of over the variables in );
.
We have three important properties:
is unchanged because ;
and are consistent since ;
If and are consistent, so are and , since
Hence, updating preserves the joint distribution and does not upset margins that are already consistent.
3. Junction Tree Algorithm
Given a tree, we can pick any vertex as a root, and direct all edges away from it.
Algorithm 1 Collect and Distribute Steps of the Junction Tree Algorithm.
function Collect(rooted tree , potentials ) let be a topological ordering of ; for in do send message from to ; end for return updated potentials end function function Distribute(rooted tree , potentials ) let be a topological ordering of ; for in do send message from to ; end for return updated potentials end function
Theorem 3.1 Let be a junction tree with potentials . Then after running the junction tree algorithm, all potentials will be consistent.
Proof. Since we pass a message from to and by the property of message passing, and are consistent. By property, the consistency between and is preserved after future updates from . Similarly, distribution ensures and are consistent. Hence, all pairs of potentials are consistent.
In practice, message passing is often done in parallel and we have the convergence after at most iterations, where is the length of the longest path of the tree.
Junction graphs in general (loopy belief propagation) also seem to lead to convergence, but it is still an open area of study.
4. Evidence
Denote evidence set , then In junction tree situation, If there is a clique that contains the node (), then we replace with . After running the junction tree algorithm, we have .