"…at this point we haven’t found anything that we can’t sequence with the Genome Analyzer. Wherever the biology leads us, that’s where we’ll go…"
Dr. Brian D. Gregory is interested in understanding how the interplay between epigenetic modification and small RNA expression affects development in Arabidopsis. Using the Genome Analyzer, Dr. Gregory and colleagues generated the most detailed and integrated epigenome map to date for any species. They published their results in Cell in April 2008, and in the spirit of collaboration have made their data publicly available through AnnoJ, their custom browser.
Our group was first interested in looking at the methylome at single-base-pair resolution. Unlike other next-gen sequencing systems, the Genome Analyzer works really well for sequencing a three-base genome, which is essentially what you get after bisulfite conversion. Then we decided that because there’s so much small RNA-directed DNA methylation, we should layer that type of regulation onto the study as well.
So we sequenced the small RNAome. And then we wanted to answer the ultimate question: how does DNA methylation affect gene expression? So we sequenced the full-length mRNA component of the transcriptome as well. With all this data we could really start to look at functionality of these epigenetic marks and how they lead to differences in regulation of actual gene expression. We could really get to the heart of how these pathways affect the biology of the organism. Without the Genome Analyzer, we couldn’t have data sets with this much depth.
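For readers wondering why bisulfite-converted DNA behaves like a three-base genome: bisulfite treatment converts unmethylated cytosines so that they are read as T, while methylated cytosines are protected and still read as C. The short Python sketch below illustrates that idea on a toy sequence. It is not the group's actual pipeline. The function names, the example sequence, and the methylated positions are all hypothetical, and it models a single strand with no sequencing errors or incomplete conversion.

```python
def bisulfite_convert(seq, methylated=frozenset()):
    """Simulate bisulfite conversion of one DNA strand.

    Unmethylated C is read as T after conversion; a C whose 0-based
    position is listed in `methylated` is protected and stays C.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference, converted_read):
    """Return positions where a reference C is still read as C.

    Assumes the read is already aligned to the reference base for base
    (a simplification; real reads must first be mapped).
    """
    return [
        i
        for i, (ref_base, read_base) in enumerate(zip(reference.upper(), converted_read))
        if ref_base == "C" and read_base == "C"
    ]


# Hypothetical toy reference with two methylated cytosines.
reference = "ACGTCCGATCGA"
methylated_positions = {1, 5}

converted = bisulfite_convert(reference, methylated_positions)
called = call_methylation(reference, converted)

# With no methylation at all, only three letters remain on this strand.
fully_converted = bisulfite_convert(reference)
alphabet = set(fully_converted)
```

Comparing the converted read with the reference recovers the methylated positions one base at a time, which is the sense in which the methylome can be read at single-base resolution. In a largely unmethylated region almost every C disappears, so the reads are effectively written in A, G, and T only.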
This system is making sequencing available to biologists who never could have done this type of genomic study before. Because we can inexpensively sequence a genome with a couple of runs in a couple of days, we can ask deeper and more genome-wide types of questions.
It’s safe to say that at this point we haven’t found anything that we can’t sequence with the Genome Analyzer. Wherever the biology leads us, that’s where we’ll go and with this system we can go there at single-base resolution. We no longer have to ask whether the resolution is going to be good enough to make a study worthwhile.
I find the system to be quite user friendly. The sample prep kits are extremely straightforward; I have never had a failure. The sequencing is definitely walk-away. You prep the samples, load them onto the system, hit start, and three days later you have gigabases of genome-wide data.
Our read lengths average about 45 to 46 base pairs, the data quality is very good, and we are getting tons of it. The two small RNA sets that we published were, at the time, the largest sets available. As the Genome Analyzer technology keeps getting better, the data sets are going to get even better.
As far as data analysis goes, it becomes easier as you get used to working with these large data sets. The biggest headache we had was getting used to working with and moving these data sets, but once you have them in a usable form the meta-analysis is just whatever biology you want to take out of your data sets.
There is so much data that you’re obviously not going to be able to get everything that’s there out of it. You’re going to take your piece of the pie, but then the data set is there and it’s available to the rest of the community as well. We generated tons and tons of sequencing data, enough to get two publications out within a year.
Kevin Shianna, Ph.D., has built the Institute for Genome Sciences and Policy Genotyping Facility at Duke University into one of the highest throughput academic genotyping facilities in the United States. An early adopter of Illumina's GoldenGate Genotyping technology, Dr. Shianna now relies on Illumina's broad portfolio of genetic analysis tools to support his institution's research.