Single Graphical Lasso with Latent Variables

Single Graphical Lasso with Latent Variables#

This tutorial demonstrates how to estimate a sparse inverse covariance matrix while accounting for latent variables using the Sparse + Low-Rank (SLR) Graphical Lasso model. This approach decomposes the inverse covariance matrix into a sparse component (direct dependencies) and a low-rank component (unobserved latent effects).

Purpose: Separates direct sparse interactions from systematic low-rank structure

Key characteristics:

Decomposes precision matrix into sparse + low-rank components
Sparse component: direct microbial interactions
Low-rank component: systematic variation (environmental gradients, batch effects)
Uses both L1 penalty (λ₁) and nuclear norm penalty (μ₁)

Interpretation:

Sparse component: True direct microbial associations
Low-rank component: Global patterns affecting multiple taxa simultaneously
Better separation of biological signal from confounding factors
More conservative edge detection compared to SGL

Step 1: Estimate a Sparse + Low-Rank Model#

We begin by fitting a model that includes both sparse and low-rank components to account for hidden (latent) structure:

# slr model
qiime gglasso solve-problem \
    --p-n-samples 50 \
    --p-lambda1-min 0.001 \
    --p-lambda1-max 1 \
    --p-mu1-min 0.50118723 \
    --p-mu1-max 0.79432823 \
    --p-n-lambda1 50 \
    --p-n-mu1 50 \
    --p-gamma 0.01 \
    --p-latent True \
    --i-covariance-matrix data/atacama-table-corr.qza \
    --o-solution data/atacama-solution-slr.qza \
    --verbose

Explanation:

--p-n-samples 50: Number of individuals used to compute the input covariance matrix.
--p-lambda1-min: Lower-bound for the sparsity penalty (λ₁).
--p-lambda1-max: Upper-bound for the sparsity penalty (λ₁).
--p-mu1-min: Lower-bound for the low-rank penalty (μ₁).
--p-mu1-max: Upper-bound for the low-rank penalty (μ₁).
--p-n-lambda1: Number of grid points between the min and max lambda values.
--p-n-mu1: Number of grid points for each penalty parameter.
--p-gamma 0.01: Controls the model selection criterion (e.g., eBIC).
--p-latent True: Enables the latent variable modeling (SLR).
--i-covariance-matrix: Input covariance matrix in QIIME 2 format.
--o-solution: Output artifact containing the estimated sparse + low-rank decomposition of the inverse covariance matrix.

Step 2: Visualize the Estimated Network#

The estimated network, including effects of latent variables, can be visualized using:

# visualize the results
qiime gglasso summarize \
    --i-solution data/atacama-solution-slr.qza \
    --p-label-size 25pt \
    --o-visualization data/slr-summary.qzv

Explanation:

Generates an interactive QIIME 2 visualization of the inferred network structure.
--p-label-size 25pt: Sets the font size of node labels in the network plot.
The output .qzv file can be viewed using QIIME 2 View.

Single Graphical Lasso with Latent Variables

Contents

Single Graphical Lasso with Latent Variables#

Step 1: Estimate a Sparse + Low-Rank Model#

Step 2: Visualize the Estimated Network#