Accession: SRP097877

An RNA-seq dataset for studies of gene expression variation in the MAGIC line resource of Arabidopsis thaliana

Organism: Arabidopsis thaliana
Samples: 199 downloadable samples
Technology: Illumina HiSeq 2000

Submitter Supplied Information

Description
To understand the population genetics of structural variants (SVs), and their effects on phenotypes, we developed an approach to mapping SVs, particularly transpositions, segregating in a sequenced population, that avoids calling SVs directly. The evidence for a potential SV at a locus is indicated by variation in the counts of short reads that map anomalously to the locus. These SV traits are treated as quantitative traits and mapped genetically, analogously to a gene expression study. Association between an SV trait at one locus and genotypes at a distant locus indicates the origin and target of a transposition. Using ultra-low-coverage (0.3x) population sequence data from 488 recombinant inbred Arabidopsis genomes, we identified 6,502 segregating SVs. Remarkably, 25% of these were transpositions. Whilst many SVs cannot be delineated precisely, PCR validated 83% of 44 predicted transposition breakpoints. We show that specific SVs may be causative for quantitative trait loci for germination, fungal disease resistance and other phenotypes. Further, we show that the phenotypic heritability attributable to sequence anomalies differs from, and in the case of time to germination and bolting exceeds, that due to standard genetic variation. Gene expression within SVs is also more likely to be silenced or dysregulated, as inferred from RNA-seq data collected from a subset of just over 200 of the MAGIC lines. This approach is generally applicable to large populations sequenced at low coverage, and complements the prevalent strategy of SV discovery in fewer individuals sequenced at high coverage.

Overall design: 209 samples consisting of different inbred lines from the Multiparent Advanced Generation Inter-Cross (MAGIC) population in the reference plant, Arabidopsis thaliana.
For each sample, RNA was collected from the aerial shoot at the 4th true leaf stage, and Illumina mRNA-seq libraries were constructed (a single library was constructed for each line; that is, each MAGIC line is represented by one biological replicate). From these non-stranded libraries, paired-end 100 bp Illumina RNA-seq reads were generated for each sample and used to quantify gene expression in each MAGIC line. The resulting expression phenotypes are suitable for describing the impacts of genetic variation in the MAGIC line founders on the control of gene expression.
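
The description treats the count of anomalously mapped reads at a locus as a quantitative trait and looks for the marker whose genotype best explains it. The Python sketch below illustrates that idea only; it is not the submitters' pipeline. It assumes biallelic 0/1 marker codes (the real MAGIC lines descend from multiple founders), uses a plain squared Pearson correlation as the association score, and all names (`pearson_r`, `map_sv_trait`) and the toy data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector carries no association signal
    return sxy / sqrt(sxx * syy)


def map_sv_trait(trait, genotypes):
    """Return (best_marker, r_squared) for one SV trait.

    trait: anomalous read counts, one value per inbred line.
    genotypes: dict mapping (chromosome, position) -> list of 0/1 allele
               codes per line (a biallelic simplification, assumed here).
    """
    scores = {m: pearson_r(trait, g) ** 2 for m, g in genotypes.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy data: anomalous read counts observed at a hypothetical locus on chr1.
trait = [0, 5, 0, 6, 1, 7]
genotypes = {
    ("chr1", 1000): [0, 0, 1, 1, 0, 1],   # marker near the trait locus
    ("chr4", 52000): [0, 1, 0, 1, 0, 1],  # marker on another chromosome
}

best_marker, r2 = map_sv_trait(trait, genotypes)
print(best_marker, round(r2, 3))
```

In this toy case the strongest association is with the marker on a different chromosome from the trait locus, which is the pattern the description interprets as a candidate transposition. A real analysis would also need a significance threshold and a model suited to multi-founder haplotypes.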
PubMed ID
Total Samples
209
Submitter’s Institution
No associated institution
