Description
Reproducibility in molecular and cellular studies is fundamental to scientific discovery. To establish the reproducibility of a well-defined long-term neuronal differentiation protocol, we repeated the cellular and molecular comparison of the same two iPSC lines across five distinct laboratories. Although variability within individual laboratories was acceptable, we detected poor cross-site reproducibility of the differential gene expression signature between these two lines. Factor analysis identifies the laboratory as the largest source of variation, along with several variation-inflating confounds such as passaging effects and progenitor storage. Single-cell transcriptomics shows that substantial cellular heterogeneity underlies the inter-laboratory variability and biases differential gene expression inference. Factor analysis-based normalization of the combined dataset can remove the nuisance technical effects, enabling robust hypothesis-generating studies. Our study shows that multi-center collaborations can expose systematic biases and identify critical factors to be standardized when publishing novel protocols, contributing to increased cross-site reproducibility.
Overall design: RNA-seq profiles of 57 bulk human iPSC-derived neuron samples, differentiated across five laboratories, were generated in triplicate at two time points and sequenced on one lane of a HiSeq 4000 with 75 bp paired-end reads. For single-cell RNA-seq, .... single cells from two of the five laboratories at the later time point were isolated by FACS into 96-well plates and sequenced on one lane of a HiSeq 4000 with 75 bp paired-end reads.