Sample similarity analyses

Different sample similarity metrics and methods are applied to the dimension-reduced expression module data and metadata. Applying downstream analyses to aggregated data rather than single-gene data has been shown to increase representativeness and reduce noise.

Correlation Network

The correlation network represents module (p.1) and metagene (p.2) data as a graph in which cells are nodes, and two nodes are connected if their mutual correlation exceeds a given threshold (epsilon-neighborhood).
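The thresholding rule can be sketched in a few lines of Python. This is a minimal illustration only, not the pipeline's own implementation: it assumes Pearson correlation, a data matrix with features (modules or metagenes) in rows and cells in columns, and a hypothetical threshold of 0.5. The function name correlation_network is likewise made up for this example.

```python
import numpy as np

def correlation_network(data, threshold=0.5):
    """Return a boolean cell-by-cell adjacency matrix.

    data: array of shape (n_features, n_cells); assumed layout.
    threshold: hypothetical correlation cutoff (epsilon).
    """
    # np.corrcoef treats rows as variables, so transpose to correlate cells.
    corr = np.corrcoef(data.T)
    adjacency = corr > threshold
    # A cell is not considered its own neighbor.
    np.fill_diagonal(adjacency, False)
    return adjacency

# Three toy cells: the first two rise together, the third falls.
toy = np.array([
    [1.0, 2.0, 4.0],
    [2.0, 4.0, 3.0],
    [3.0, 6.0, 2.0],
    [4.0, 8.5, 1.0],
])
adjacency = correlation_network(toy, threshold=0.5)
```

Raising the threshold yields a sparser graph, lowering it a denser one; cells with no correlation above the cutoff remain isolated nodes.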

Correlation Spanning Tree

The correlation spanning tree represents module (p.1) and metagene (p.2) data as a graph in which cells are nodes connected by a spanning tree that maximizes the mutual correlation between connected nodes.
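One way to build such a tree is a maximum spanning tree over the cell-by-cell correlation matrix. The sketch below uses Prim's algorithm with Pearson correlation; both choices are assumptions made for illustration, since the text does not name the algorithm or the correlation measure. The data layout (features in rows, cells in columns) and the function name are also hypothetical.

```python
import numpy as np

def correlation_spanning_tree(data):
    """Return the edges (parent, child) of a maximum-correlation spanning tree.

    data: array of shape (n_features, n_cells); assumed layout.
    """
    corr = np.corrcoef(data.T)
    n = corr.shape[0]
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True                      # start growing the tree from cell 0
    best_corr = corr[0].copy()             # best correlation of each cell to the tree
    best_parent = np.zeros(n, dtype=int)   # tree node achieving that correlation
    edges = []
    for _ in range(n - 1):
        # Pick the cell outside the tree with the highest correlation to it.
        candidates = np.where(in_tree, -np.inf, best_corr)
        j = int(np.argmax(candidates))
        edges.append((int(best_parent[j]), j))
        in_tree[j] = True
        # The new node may offer a better attachment for the remaining cells.
        improved = (corr[j] > best_corr) & ~in_tree
        best_corr[improved] = corr[j][improved]
        best_parent[improved] = j
    return edges

# Four toy cells: cells 0 and 1 rise, cells 2 and 3 fall.
toy = np.array([
    [1.0, 1.0, 5.0, 5.0],
    [2.0, 2.0, 4.0, 4.0],
    [3.0, 3.0, 3.0, 3.0],
    [4.0, 4.0, 2.0, 2.0],
    [5.0, 6.0, 1.0, 0.0],
])
edges = correlation_spanning_tree(toy)
```

Unlike the thresholded network, the tree always connects all cells with exactly n - 1 edges, so no cell is left isolated.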

Supervised and clustered heatmaps

Heatmaps show module (p.1-2) and metagene (p.3-4) expression data with supervised and hierarchically clustered sample ordering.
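The two orderings can be illustrated as follows. This sketch assumes that "supervised" means sorting samples by a known group label and that the clustered ordering comes from average-linkage hierarchical clustering on correlation distance; neither detail is stated in the text, and the function names are hypothetical. It computes only the column order, which can then be used to rearrange the data matrix before plotting.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list

def supervised_sample_order(labels):
    """Order samples by group label, keeping the original order within groups."""
    return np.argsort(np.asarray(labels), kind="stable")

def clustered_sample_order(data, method="average"):
    """Order samples by the leaves of a hierarchical clustering dendrogram.

    data: array of shape (n_features, n_samples); assumed layout.
    """
    # Cluster samples (columns) using correlation distance between them.
    tree = linkage(data.T, method=method, metric="correlation")
    return leaves_list(tree)

# Four toy samples: samples 0 and 2 rise, samples 1 and 3 fall.
toy = np.array([
    [1.0, 5.0, 1.0, 5.0],
    [2.0, 4.0, 2.0, 4.0],
    [3.0, 3.0, 3.0, 3.0],
    [4.0, 2.0, 4.0, 2.0],
    [5.0, 1.0, 6.0, 0.0],
])
labels = ["B", "A", "B", "A"]

by_group = toy[:, supervised_sample_order(labels)]
by_cluster = toy[:, clustered_sample_order(toy)]
```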

Independent Component Analysis

Independent component analysis (ICA) distributes cells along axes that are as statistically independent as possible, in a manner similar to principal component analysis, which uses axes of greatest variance. Unlike PCA, however, ICA does not restrict the axes to be orthogonal. ICA is applied to module (p.1-2) and metagene (p.3-4) expression data, and the first three components are shown in 3D and pairwise 2D scatterplots.
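A three-component ICA embedding of this kind might be computed as in the sketch below. It assumes scikit-learn's FastICA as the estimator, which the text does not specify, and a data matrix with features in rows and cells in columns. The function name and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.decomposition import FastICA

def ica_embedding(data, n_components=3, seed=0):
    """Return cell coordinates on the independent components.

    data: array of shape (n_features, n_cells); assumed layout.
    Result has shape (n_cells, n_components).
    """
    ica = FastICA(n_components=n_components, random_state=seed, max_iter=1000)
    # scikit-learn expects samples in rows, so cells become rows here.
    return ica.fit_transform(data.T)

# Synthetic example: 30 cells driven by 3 non-Gaussian hidden signals,
# observed through 20 module-like features with a little noise.
rng = np.random.default_rng(0)
sources = rng.laplace(size=(30, 3))
mixing = rng.normal(size=(3, 20))
modules = (sources @ mixing).T + 0.01 * rng.normal(size=(20, 30))

components = ica_embedding(modules)  # one row per cell, one column per component
```

The three columns of the result correspond to the axes of a 3D scatterplot; any two of them give one of the pairwise 2D views.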

t-SNE

t-distributed stochastic neighbor embedding (t-SNE) is a nonlinear dimensionality reduction technique projecting cells into a two-dimensional coordinate system. It is applied to module (p.1) and metagene (p.2) data.
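A two-dimensional t-SNE projection can be sketched with scikit-learn's TSNE class. The choice of library, the perplexity value, and the PCA initialization are assumptions for illustration and are not taken from the text; the function name and the random input data are hypothetical. Note that the perplexity must be smaller than the number of cells.

```python
import numpy as np
from sklearn.manifold import TSNE

def tsne_embedding(data, perplexity=5.0, seed=0):
    """Return 2D t-SNE coordinates for each cell.

    data: array of shape (n_features, n_cells); assumed layout.
    perplexity: hypothetical setting; must be below the number of cells.
    """
    tsne = TSNE(n_components=2, perplexity=perplexity, init="pca",
                random_state=seed)
    # scikit-learn expects samples in rows, so cells become rows here.
    return tsne.fit_transform(data.T)

# Random stand-in data: 20 features measured in 30 cells.
rng = np.random.default_rng(0)
modules = rng.normal(size=(20, 30))

coords = tsne_embedding(modules)  # one (x, y) pair per cell
```

Because t-SNE is stochastic, fixing the random seed keeps the layout reproducible between runs; distances between well-separated groups in the resulting plot should not be over-interpreted.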