Description
The effects of drugs, disease and other perturbations on mRNA levels are studied using gene expression microarrays or RNA-seq, with the goal of understanding the molecular effects arising from the perturbation. Previous comparisons of reproducibility across laboratories have been limited in scale and focused on a single model. The use of model systems, such as cultured primary cells or cancer cell lines, assumes that mechanistic insights derived from them would also be observed in in vivo studies. We examined the concordance of compound-induced transcriptional changes using data from several sources: rat liver and rat primary hepatocytes (RPH) from DrugMatrix (DM) and Open TG-GATEs (TG), primary human hepatocytes (HPH) from TG, and mouse liver and HepG2 results from the Gene Expression Omnibus (GEO) repository. Gene expression changes for treatments were normalized to controls and analyzed with three methods: (1) gene-level analysis of 9071 genes highly expressed in rat liver, (2) gene set analysis (GSA) using canonical pathways and gene ontology sets, and (3) weighted gene co-expression network analysis (WGCNA). Co-expression networks performed better than genes or GSA on a quantitative metric when comparing treatment effects within rat liver and between rat and mouse liver. Genes and modules performed similarly in Connectivity Map-style analyses, where the goal is to identify similar treatments among a collection of reference profiles. Comparisons between rat liver and RPH, and those among RPH, HPH and HepG2 cells, reveal low concordance for all methods. We investigate differences in the baseline state of cultured cells in the context of drug-induced perturbations in rat liver and highlight the striking similarity between toxicant-exposed cells in vivo and untreated cells in vitro.
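As a rough illustration of the two ideas mentioned above (normalizing treatment profiles to controls, and a Connectivity Map-style lookup of similar treatments among reference profiles), the following Python sketch may help. It is not the pipeline used in the study: the log2-ratio normalization, the use of Pearson correlation as the similarity score, the function names `log_ratio_vs_control` and `rank_references`, and the toy numbers are all assumptions made for this example.

```python
import numpy as np

def log_ratio_vs_control(treated, control):
    """Per-gene log2 ratio of mean treated vs. mean control expression.

    Both inputs are (samples x genes) arrays of positive intensities.
    Assumption: a simple log2 ratio of means stands in for the study's
    (unspecified here) normalization to controls.
    """
    treated = np.asarray(treated, dtype=float)
    control = np.asarray(control, dtype=float)
    return np.log2(treated.mean(axis=0)) - np.log2(control.mean(axis=0))

def rank_references(query, references):
    """Rank reference profiles by Pearson correlation with the query.

    `references` maps a treatment name to a per-gene profile.
    Assumption: Pearson correlation is used as the similarity score;
    other Connectivity Map-style scores exist.
    Returns (name, score) pairs, most similar first.
    """
    names = list(references)
    scores = [float(np.corrcoef(query, references[n])[0, 1]) for n in names]
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    return [(names[i], scores[i]) for i in order]

# Toy data: 2 replicates x 4 genes (hypothetical values).
treated = [[2.0, 4.0, 8.0, 8.0], [2.0, 4.0, 8.0, 8.0]]
control = [[4.0, 4.0, 4.0, 8.0], [4.0, 4.0, 4.0, 8.0]]
query = log_ratio_vs_control(treated, control)

references = {
    "similar_compound": [-0.9, 0.1, 1.1, 0.0],
    "opposite_compound": [1.0, 0.0, -1.0, 0.0],
}
ranked = rank_references(query, references)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

In this toy case the reference whose per-gene changes point in the same direction as the query ranks first and the inverted profile ranks last. The same ranking step could in principle be applied to module-level scores instead of gene-level log ratios.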