
Gaussian Graphical Model

1. Multivariate Gaussian Distribution

Definition 1.1    $X_V$ has a multivariate Gaussian distribution with parameters $\mu$ and $\Sigma$ if the joint density is
$$f(x_V; \mu, \Sigma) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left[ -\frac{1}{2} (x_V - \mu)^\top \Sigma^{-1} (x_V - \mu) \right], \qquad x_V \in \mathbb{R}^p.$$
Note that we usually work with the log-density
$$\ln f(x_V; \mu, \Sigma) = C - \frac{1}{2} (x_V - \mu)^\top \Sigma^{-1} (x_V - \mu),$$
or, writing $K = \Sigma^{-1}$,
$$\ln f(x_V; \mu, K) = C - \frac{1}{2} (x_V - \mu)^\top K (x_V - \mu).$$
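As a quick sanity check (ours, not part of the original notes), for $p = 1$ the density formula reduces to the familiar univariate normal, which can be verified numerically in R:

```r
# Evaluate the multivariate Gaussian density directly from the formula
# and compare with R's built-in dnorm in the univariate case p = 1.
dmvgauss = function(x, mu, Sigma) {
  p = length(mu)
  K = solve(Sigma)                      # concentration matrix K = Sigma^{-1}
  quad = t(x - mu) %*% K %*% (x - mu)   # (x - mu)' K (x - mu)
  as.numeric(exp(-quad / 2) / ((2 * pi)^(p / 2) * sqrt(det(Sigma))))
}

# Illustrative numbers: mean 0.5, variance 2, evaluated at x = 1.3.
f1 = dmvgauss(1.3, 0.5, matrix(2))
f2 = dnorm(1.3, mean = 0.5, sd = sqrt(2))
all.equal(f1, f2)  # TRUE
```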

Theorem 1.1    Let $X_V$ have a multivariate Gaussian distribution with concentration matrix $K = \Sigma^{-1}$. Then $X_i \perp\!\!\!\perp X_j \mid X_{V \setminus \{i,j\}}$ if and only if $k_{ij} = 0$, where $k_{ij}$ is the corresponding entry of the concentration matrix.
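To see the theorem in action, here is a small R sketch with made-up numbers: a concentration matrix with $k_{12} = 0$ gives zero partial correlation between $X_1$ and $X_2$ given the rest (using the standard identity $\rho_{ij \cdot \text{rest}} = -k_{ij} / \sqrt{k_{ii} k_{jj}}$), even though the marginal covariance $\Sigma_{12}$ is nonzero:

```r
# A positive-definite 3x3 concentration matrix with k_12 = 0
# (illustrative numbers).
K = matrix(c( 2,  0, -1,
              0,  3, -1,
             -1, -1,  4), nrow = 3, byrow = TRUE)
Sigma = solve(K)   # the implied covariance matrix

# Partial correlation of X1 and X2 given X3: -k_12 / sqrt(k_11 k_22).
parcor12 = -K[1, 2] / sqrt(K[1, 1] * K[2, 2])
parcor12       # 0: X1 and X2 are conditionally independent given X3
Sigma[1, 2]    # nonzero: X1 and X2 are still marginally dependent
```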

2. Gaussian Graphical Model (GGM)

If $M$ is a matrix whose rows and columns are indexed by $A \subseteq V$, we write $\{M\}_{A,A}$ to indicate the matrix indexed by $V$ (i.e., it has $|V|$ rows and columns) whose $(A,A)$-entries are given by $M$ and which is zero elsewhere. For example, if $|V| = 3$ then
$$M = \begin{pmatrix} a & b \\ b & c \end{pmatrix} \qquad \text{and} \qquad \{M\}_{12,12} = \begin{pmatrix} a & b & 0 \\ b & c & 0 \\ 0 & 0 & 0 \end{pmatrix},$$
where $12$ is used as an abbreviation for $\{1,2\}$ in the subscript.
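The operator $\{M\}_{A,A}$ is straightforward to implement; a minimal R helper (the name pad is our own):

```r
# {M}_{A,A}: embed M (indexed by A) into a |V| x |V| matrix of zeroes.
pad = function(M, A, V_size) {
  out = matrix(0, V_size, V_size)
  out[A, A] = M
  out
}

# The example from the text: a = 1, b = 2, c = 3, |V| = 3, A = {1, 2}.
M = matrix(c(1, 2,
             2, 3), 2, 2, byrow = TRUE)
P = pad(M, A = c(1, 2), V_size = 3)
P   # M in the top-left 2x2 block, zeroes in row and column 3
```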

Lemma 2.1    Let $G$ be a graph with decomposition $(A, S, B)$, and let $X_V \sim N_p(0, \Sigma)$. Then $p(x_V)$ is Markov w.r.t. $G$ if and only if
$$\Sigma^{-1} = \{(\Sigma_{AS,AS})^{-1}\}_{AS,AS} + \{(\Sigma_{BS,BS})^{-1}\}_{BS,BS} - \{(\Sigma_{S,S})^{-1}\}_{S,S},$$
and $\Sigma_{AS,AS}$ and $\Sigma_{BS,BS}$ are Markov w.r.t. $G_{AS}$ and $G_{BS}$ respectively.

Proof. Since $(A, S, B)$ is a decomposition and $p(x_V)$ is Markov, we have $X_A \perp\!\!\!\perp X_B \mid X_S$, which implies that for all $x_V \in \mathbb{R}^{|V|}$,
$$p(x_V)\, p(x_S) = p(x_A, x_S)\, p(x_B, x_S).$$
Taking logarithms of the Gaussian densities and matching the quadratic forms gives
$$x_V^\top \Sigma^{-1} x_V + x_S^\top (\Sigma_{S,S})^{-1} x_S = x_{AS}^\top (\Sigma_{AS,AS})^{-1} x_{AS} + x_{BS}^\top (\Sigma_{BS,BS})^{-1} x_{BS},$$
i.e.,
$$x_V^\top \Sigma^{-1} x_V + x_V^\top \{(\Sigma_{S,S})^{-1}\}_{S,S} x_V = x_V^\top \{(\Sigma_{AS,AS})^{-1}\}_{AS,AS} x_V + x_V^\top \{(\Sigma_{BS,BS})^{-1}\}_{BS,BS} x_V.$$
Since this holds for all $x_V$, it follows that
$$\Sigma^{-1} = \{(\Sigma_{AS,AS})^{-1}\}_{AS,AS} + \{(\Sigma_{BS,BS})^{-1}\}_{BS,BS} - \{(\Sigma_{S,S})^{-1}\}_{S,S}.$$

The converse holds similarly.
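The identity in Lemma 2.1 can also be checked numerically. A sketch with illustrative numbers, taking $A = \{1\}$, $S = \{2\}$, $B = \{3\}$ and a concentration matrix with $k_{13} = 0$ (so the distribution is Markov w.r.t. the path $1 - 2 - 3$); the helper pad implements $\{M\}_{A,A}$:

```r
# Decomposition (A, S, B) = ({1}, {2}, {3}); K has k_13 = 0.
K = matrix(c( 2, -1,  0,
             -1,  3, -1,
              0, -1,  2), 3, 3, byrow = TRUE)
Sigma = solve(K)

# {M}_{A,A}: embed M into a |V| x |V| matrix of zeroes.
pad = function(M, A, V_size) {
  out = matrix(0, V_size, V_size); out[A, A] = M; out
}

AS = c(1, 2); BS = c(2, 3); S = 2
rhs = pad(solve(Sigma[AS, AS]), AS, 3) +
      pad(solve(Sigma[BS, BS]), BS, 3) -
      pad(solve(Sigma[S, S, drop = FALSE]), S, 3)
all.equal(K, rhs)  # TRUE: the two sides of the identity agree
```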

3. Maximum Likelihood Estimation

Let $X_V^{(1)}, \ldots, X_V^{(n)} \overset{\text{i.i.d.}}{\sim} N_p(0, \Sigma)$. The sufficient statistic for $\Sigma$ is the sample covariance matrix
$$W = \frac{1}{n} \sum_{i=1}^{n} X_V^{(i)} \left( X_V^{(i)} \right)^\top.$$
In addition, $\hat\Sigma = W$ is the MLE for $\Sigma$ under the unrestricted model (i.e., when all edges are present in the graph). Let $\hat\Sigma_G$ denote the MLE for $\Sigma$ under the restriction that the distribution satisfies the Markov property for $G$, and let $\hat K_G$ denote the inverse of $\hat\Sigma_G$.

For a decomposable graph $G$ with cliques $C_1, \ldots, C_k$ and separators $S_2, \ldots, S_k$, the MLE can be written in the form
$$(\hat\Sigma_G)^{-1} = \sum_{i=1}^{k} \{(W_{C_i,C_i})^{-1}\}_{C_i,C_i} - \sum_{i=2}^{k} \{(W_{S_i,S_i})^{-1}\}_{S_i,S_i}.$$
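A sketch of this formula on simulated data (illustrative numbers, not the marks data) for the chain $1 - 2 - 3$, with cliques $C_1 = \{1, 2\}$, $C_2 = \{2, 3\}$ and separator $S_2 = \{2\}$; it also confirms the standard property that $\hat\Sigma_G$ agrees with $W$ on each clique:

```r
set.seed(1)
n = 200; p = 3
# Simulate n i.i.d. rows from N_3(0, Sigma0), Sigma0 chosen arbitrarily.
Sigma0 = matrix(c(1.0, 0.5, 0.3,
                  0.5, 1.0, 0.4,
                  0.3, 0.4, 1.0), 3, 3)
X = matrix(rnorm(n * p), n, p) %*% chol(Sigma0)
W = crossprod(X) / n   # sample covariance (mean known to be zero)

# {M}_{A,A}: embed M into a |V| x |V| matrix of zeroes.
pad = function(M, A, V_size) {
  out = matrix(0, V_size, V_size); out[A, A] = M; out
}

C1 = c(1, 2); C2 = c(2, 3); S2 = 2
K_hat = pad(solve(W[C1, C1]), C1, 3) +
        pad(solve(W[C2, C2]), C2, 3) -
        pad(solve(W[S2, S2, drop = FALSE]), S2, 3)
Sigma_hat = solve(K_hat)

all.equal(Sigma_hat[C1, C1], W[C1, C1])  # TRUE: fitted Sigma matches W on C1
K_hat[1, 3]                              # 0: the missing edge {1,3}
```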

Example 3.1    Whittaker (1990) analysed data on the results of five mathematics tests (analysis, algebra, vectors, mechanics and statistics) taken by 88 students. Some of the entries in the concentration matrix are quite small, suggesting that some conditional independences hold. We therefore fit a graphical model and check whether it gives a good fit.

Figure 1: A graph for the maths test data.

By computation, the fitted concentration matrix $\hat K_G = (\hat\Sigma_G)^{-1}$ is:

              Mechanics   Vectors    Algebra    Analysis   Statistics
Mechanics     0.00524     0.00244    0.00287    0          0
Vectors       0.00244     0.01035    0.00561    0          0
Algebra       0.00287     0.00561    0.02849    0.00755    0.00493
Analysis      0           0          0.00755    0.00982    0.00204
Statistics    0           0          0.00493    0.00204    0.00644

We carry out a likelihood ratio test to see whether this model is a good fit to the data. We test
$$H_0: \text{restricted model} \qquad \text{against} \qquad H_1: \text{unrestricted model}.$$
The test statistic is
$$2\left( l(\hat\Sigma) - l(\hat\Sigma_G) \right) = 0.8958244,$$
which under $H_0$ is asymptotically $\chi^2_{(4)}$, with one degree of freedom for each of the four missing edges. The p-value is $0.925 > 0.05$, so we do not reject $H_0$: the restricted model gives a good fit.

Here is the relevant code in R:

library(ggm)
data(marks)

# Sample covariance matrix (note cov() divides by n - 1, whereas the
# unrestricted MLE divides by n; the test statistic is unaffected).
S = cov(marks)

# Concentration matrix under the restriction: cliques {1,2,3} and {3,4,5}
# (mechanics, vectors, algebra | algebra, analysis, statistics),
# with separator {3} (algebra).
K = matrix(0, 5, 5)
K[1:3, 1:3] = solve(S[1:3, 1:3])
K[3:5, 3:5] = K[3:5, 3:5] + solve(S[3:5, 3:5])
K[3, 3] = K[3, 3] - 1 / S[3, 3]   # subtract the separator term

# MLE of the covariance matrix under the restriction.
Sigma_hat = solve(K)

# Likelihood ratio test statistic and p-value (df = 4, one per missing edge).
tr = function(x) sum(diag(x))
n = nrow(marks)
test_sta = -n * ((log(det(S)) - log(det(Sigma_hat)))
                 - tr(S %*% (solve(S) - solve(Sigma_hat))))
p = pchisq(q = test_sta, df = 4, lower.tail = FALSE)