Data for Genomic signatures of germline gene expression

This webpage provides access to processed expression data for a wide variety of human germline and somatic tissues. We used these data in our paper:

McVicker G, Green P. 2010. Genomic signatures of germline gene expression. Genome Res 20:1503-11

Note that we collected and processed these data, but did not generate them ourselves. If you use these data, please cite the original studies that generated them (see below).

We collected the expression data from 12 studies, which used either the Affymetrix hgu133plus2 or hgu133A microarray. The intensity data were background adjusted, normalized and summarized together using the RMA algorithm (Bolstad et al. 2003; Irizarry et al. 2003a). Expression values were assigned to UCSC known genes using Affymetrix's probeset annotation file, as described in detail in the paper.

If you would like additional data from the paper or have any questions, please contact Graham McVicker at (first_initial)(last_name)@uchicago.edu.

File                               Description
expr_sample_summary.txt.gz         A summary of each expression experiment including the primary reference, microarray type, and tissue sample
combined.rma.gene.expr.txt.gz      Combined expression data from both microarray types, using only probes common to both platforms
hu133a.rma.gene.expr.txt.gz        Expression data from only the hu133a microarray platform
hu133plus2.rma.gene.expr.txt.gz    Expression data from only the hu133plus2 microarray platform. This dataset has expression values for more genes but fewer tissues. Note that the column headings in this file require some relabelling to be consistent with those from the other files.
expr.entropy.txt.gz                Shannon entropy of each gene's expression across tissues (after merging replicates). Genes with low entropy have more tissue-specific patterns of gene expression (see also Schug et al. 2005)
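
To illustrate what the entropy values measure, here is a minimal Python sketch of Shannon entropy for one gene across tissues. It is not the code used for the paper. It assumes each tissue has a single non-negative expression value (replicates already merged) and that the values are converted to relative abundances by dividing by their sum. The function name expression_entropy is hypothetical, and the choice of scale (log or linear intensities) is an assumption that this page does not specify.

```python
import math


def expression_entropy(values):
    """Shannon entropy, in bits, of one gene's expression across tissues.

    `values` holds one non-negative expression value per tissue.
    Assumption: relative abundance is p_t = value_t / sum(values).
    """
    if any(v < 0 for v in values):
        raise ValueError("expression values must be non-negative")
    total = sum(values)
    if total <= 0:
        raise ValueError("at least one expression value must be positive")

    entropy = 0.0
    for v in values:
        if v > 0:  # a zero value contributes nothing (0 * log 0 is taken as 0)
            p = v / total
            entropy -= p * math.log2(p)
    return entropy


# A gene expressed equally in four tissues versus one expressed in a single tissue.
uniform_gene = expression_entropy([1.0, 1.0, 1.0, 1.0])
specific_gene = expression_entropy([5.0, 0.0, 0.0, 0.0])
```

Under these assumptions the evenly expressed gene reaches the maximum entropy for four tissues (log2 of 4), while the gene expressed in a single tissue has entropy zero, which matches the reading that low entropy indicates tissue-specific expression.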


Primary references for expression data