
Genotyping by Sequencing Frequently Asked Questions (FAQ)
What is GBS?
What is GBS useful for?
How can I sign up for the next GBS workshop?
How much DNA is required for GBS?
What quality of DNA is required for GBS?
Can I run a single bar coded sample or less than the full set in a lane?
Is a gel extraction based size selection used in this protocol?
Why do you use ApeKI as opposed to other enzymes?
What enzyme do you recommend for a new species?
Can the bar codes you developed be used with other enzymes?
Is there a software tool available to design bar coded adapters for GBS?
Are the adapters phosphorylated as is done in the original Illumina chemistry?
Why are the adapters not y-adapters like Illumina's?
Is it possible for the adapters to be cut off the genomic fragment during the heat inactivation of the ligase?
During the sample preparation of Plate out adapters, what is the purpose of 'Cover with Airpore tape', to prevent contamination?
Could I use an Eppendorf concentrator instead of a Thermo SpeedVac Concentrator?
Could I use paraffin oil, which is often used in PCR amplification, to replace the rubber sealing mat to prevent loss during the DNA Digestion with ApeKI?
If I don’t have BioMek or TECAN robotics, could I plate the adapters, DNA and other materials with a pipette or a multichannel pipette instead?
In the 96plex GBS Protocol for processing maize, the DNA concentration is 10ng/ul, but in the Supplement for optimizing GBS for other species, the DNA concentration is 100ng/ul. Why the difference?
In the Validation on Experion part of the protocol, you mention that it is important not to have adapter dimers. Can I perform a gel electrophoresis to remove the dimers?
Since the restriction site of ApeKI is GCWGC, where W is A or T, when you order the single stranded primers for the barcoded adapters and the common adapter, should W be both A, or both T, or randomly A or T?
Can greater than 96 multiplexing be done?
Do I need to order a special barcode adapter and both common adapters for both of the ends of the read I'm sequencing or just one end of the read since this will be paired end reads?
How does GBS differ from RADseq?
Q: What is GBS?
A: Genotyping by sequencing (GBS) is a simple, highly multiplexed system for constructing reduced representation libraries for the Illumina next-generation sequencing platform. It was developed in the Buckler lab by Rob Elshire. It generates large numbers of single nucleotide polymorphisms (SNPs) for use in genetic analyses. Key components of this system are reduced sample handling, fewer PCR and purification steps, no size fractionation, and inexpensive barcoding. We use restriction enzymes to reduce genome complexity and avoid the repetitive fraction of the genome.
Q: What is GBS useful for?
A: GBS has a wide range of applications, including breeding, population studies, germplasm characterization, and trait mapping in diverse organisms. Future applications may allow plant breeders to conduct genomic selection on novel germplasm or species without first having to develop any molecular tools, or conservation biologists to determine population structure without prior knowledge of the genome or diversity in the species.
Q: How can I sign up for the next GBS workshop?
A: The lab part of GBS is technically quite simple so currently our workshops focus on the biological and bioinformatics aspects of GBS. The next workshop will be February, 2012. You can register here. If you have other questions about our workshops, contact Theresa Fulton.
Q: How much DNA is required for GBS?
A: We use 100ng of high quality DNA per sample in production. We're testing using 40ng. We are finding that people frequently overestimate the concentration of their DNA. We will re-quantify your DNA upon arrival.
Q: What quality of DNA is required for GBS?
A: Your DNA must be very high molecular-weight (unsheared) and RNA-free. Here is our preferred protocol for Qiagen DNeasy columns, which we know work well.
Q: Can I run a single bar coded sample or less than the full set in a lane?
A: You cannot run a single bar coded sample by itself. The initial bases will all be the same and the Illumina software will not work properly, if at all. Fewer than the whole set can be run, but they must be chosen to have equal base representation and the cut site must be modulated. See our paper for more on this.
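One way to screen a reduced bar code set before loading a lane is to tabulate base composition at each of the first read positions. The sketch below is only an illustration, not our production tool: the bar codes in the usage note are made up, and the `CAGC` remnant appended after each bar code is an assumption about what the read shows after an ApeKI cut (the other variant would be `CTGC`).

```python
from collections import Counter

def position_base_fractions(barcodes, cut_site_remnant="CAGC", n_positions=4):
    """Fraction of A/C/G/T at each of the first n_positions of the read.

    Each read is modelled as bar code + cut site remnant, so shorter bar codes
    expose the cut site earlier (this is how the cut site gets "modulated").
    The remnant value is an assumption for illustration.
    """
    prefixes = [b + cut_site_remnant for b in barcodes]
    fractions = []
    for pos in range(n_positions):
        counts = Counter(p[pos] for p in prefixes if len(p) > pos)
        total = sum(counts.values())
        fractions.append({base: counts[base] / total for base in "ACGT"})
    return fractions

def missing_bases(barcodes, **kwargs):
    """List (position, base) pairs where a base never appears in the set."""
    return [
        (pos, base)
        for pos, frac in enumerate(position_base_fractions(barcodes, **kwargs))
        for base, value in frac.items()
        if value == 0
    ]

# Hypothetical bar codes, for illustration only.
single = ["ACGT"]
rotated = ["ACGT", "CGTA", "GTAC", "TACG"]
print(missing_bases(single))   # every position lacks three of the four bases
print(missing_bases(rotated))  # no base is missing at any position
```

A single bar code fails this check at every position, which is the situation described above. A real screen would also look at how even the fractions are, not only whether a base is absent.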
Q: Is a gel extraction based size selection used in this protocol?
A: No. Size selection occurs during PCR due to smaller fragments amplifying more efficiently. The fragment size range in libraries resulting from this protocol is suitable for Illumina sequencing.
Q: Why do you use ApeKI as opposed to other enzymes?
A: We had data from an enzyme with the same recognition site in our previous work. This enzyme produces many fragments in the low copy fraction of the maize genome suitable for diversity analysis. We are conducting experiments with other enzymes for applications that do not require so many markers with the aims of lowering cost and increasing the level of multiplexing possible.
Q: What enzyme do you recommend for a new species?
A: We are recommending PstI for all vertebrates, wheat, barley, and some maize. This recommendation is based on genome size and heterozygosity. Species with large genomes and / or heterozygous genomes will have more complete data sets when sequencing a smaller library. PstI produces fewer distinct fragments than ApeKI resulting in a smaller library and better depth of coverage of the library in the sequence data.
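The difference in fragment number can be explored with a simple in silico count of recognition sites. The sketch below is a rough illustration under stated assumptions: it scans one strand of a sequence you supply (both sites are palindromic, so one strand is enough), uses GCWGC for ApeKI as given in this FAQ and CTGCAG for PstI, and ignores methylation sensitivity and fragment size. The function name and the toy sequence are hypothetical.

```python
import re

# Minimal IUPAC table: only the codes needed for these two sites.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]"}

def count_sites(sequence, recognition_site):
    """Count occurrences of a recognition site, including overlapping ones."""
    pattern = "".join(IUPAC[base] for base in recognition_site)
    # A zero-width lookahead lets overlapping matches be counted.
    return len(re.findall(f"(?={pattern})", sequence.upper()))

# Toy sequence, for illustration only.
toy = "TTGCAGCTGCAGAAGCTGCTT"
print("ApeKI sites:", count_sites(toy, "GCWGC"))
print("PstI sites:", count_sites(toy, "CTGCAG"))
```

On a linear molecule the number of fragments is one more than the number of cut sites. In random sequence with equal base frequencies, a 5-base site with one two-fold degenerate position is expected about eight times as often as a fully specified 6-base site, which is consistent with PstI giving the smaller library; real genomes will deviate from this.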
Q: Can the bar codes you developed be used with other enzymes?
A: In general, no. These bar codes are specifically designed for use with the ApeKI cut site composition. If you choose a different enzyme, you will need to develop a set of bar codes designed for that enzyme's cut site composition. The adapters have been used with PasI.
Q: Is there a software tool available to design bar coded adapters for GBS?
A: Yes there is. Thomas van Gurp in the Netherlands has developed a web form based adapter generator. We thank him for this helpful addition to the toolbox.
Q: Are the adapters phosphorylated as is done in the original Illumina chemistry?
A: No, the adapters are not phosphorylated.
Q: Why are the adapters not y-adapters like Illumina's?
A: Our adapter design is somewhat simpler than the y-adapters and more efficient in annealing, and it achieves cluster densities and sequence yields that meet specification.
Q: Is it possible for the adapters to be cut off the genomic fragment during the heat inactivation of the ligase?
A: No. The adapters have been designed such that they do not recreate the enzyme recognition site and will therefore not be cleaved at that stage.
Q: During the sample preparation of Plate out adapters, what is the purpose of 'Cover with Airpore tape', to prevent contamination?
A: Yes, the purpose is to prevent contamination and allow the water to evaporate through the tape.
Q: Could I use an Eppendorf concentrator instead of a Thermo SpeedVac Concentrator? Would it work the same?
A: Our speedvac has a swinging rotor for plates. Any concentrator with a similar design should work fine.
Q: Could I use paraffin oil, which is often used in PCR amplification, to replace the rubber sealing mat to prevent loss during the DNA Digestion with ApeKI?
A: I would not recommend using paraffin oil. It will make things more difficult and it could interfere with further reactions. You could use adhesive plate seals if you want. We have done that with success.
Q: If I don’t have BioMek or TECAN robotics, could I plate the adapters, DNA and other materials with a pipette or a multichannel pipette instead?
A: You can do these transfers by hand. The main concerns are carryover, and that differences in pipetting accuracy will result in variation in the number of reads per sample.
Q: In the 96plex GBS Protocol for processing maize, the DNA concentration is 10ng/ul, but in the Supplement for optimizing GBS for other species, the DNA concentration is 100ng/ul. Why the difference?
A: In the 96plex protocol, you should use 100ng total DNA (10ul of 10ng/ul DNA). The samples are pooled prior to the PCR step and there is plenty of template for the PCR to work effectively. Our supplement is based on conducting the PCR on individual samples and for that we use 200ng (2ul of 100ng/ul) of total DNA. If less than 200ng is used in the titration experiment, then you will not have enough template for the PCR to produce a discernible library at 18 cycles.
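The two input amounts above are just volume times concentration. The small sketch below reproduces that arithmetic and also shows how to work out the volume needed once a sample has been re-quantified; the function names and the 8ng/ul example are hypothetical.

```python
def dna_mass_ng(volume_ul, concentration_ng_per_ul):
    """Total DNA mass (ng) delivered by a given volume."""
    return volume_ul * concentration_ng_per_ul

def volume_for_mass_ul(target_mass_ng, concentration_ng_per_ul):
    """Volume (ul) needed to deliver a target mass at a measured concentration."""
    return target_mass_ng / concentration_ng_per_ul

print(dna_mass_ng(10, 10))         # 96plex protocol: 100 ng per sample
print(dna_mass_ng(2, 100))         # titration supplement: 200 ng per sample
print(volume_for_mass_ul(100, 8))  # sample that re-quantifies at 8 ng/ul: 12.5 ul
```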
Q: In the Validation on Experion part of the protocol, you mention that it is important not to have adapter dimers. Can I perform a gel electrophoresis to remove the dimers? I realize that some of the DNA of interest will be lost too.
A: You could use gel electrophoresis, but the best way is to perform the titration experiment described in the supplement to determine the correct ratio of DNA to adapters such that you have a good library but no adapter dimers. Then use the determined amount of adapters in the multiplexed protocol.
Q: Since the restriction site of ApeKI is GCWGC, where W is A or T, when you order the single stranded primers for the barcoded adapters and the common adapter, should W be both A, or both T, or randomly A or T? For example, could I design the barcoded adapter sequence like CTGxxxxAGATCG~~~? And, could one strand of the common adapter be CAGAGATCGGAAGAGCGGTTCAGCAGGAATGCCGAG?
A: When I order the single stranded oligos for the adapters, I specify W in the positions above. The oligo synthesis company provides a roughly even mix of A/T in that position. I would not recommend doing what you have suggested, unless you would like to reduce the number of fragments in the resulting library for some reason.
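To see what an order with W in it actually contains, the degenerate position can be expanded into its explicit variants. This is a minimal sketch; the function name is hypothetical and only the W code is handled. Any other character, including placeholders, passes through unchanged.

```python
from itertools import product

def expand_degenerate(oligo, codes=None):
    """Return every explicit sequence encoded by an oligo with degenerate bases."""
    codes = codes or {"W": "AT"}  # W = A or T
    choices = [codes.get(base, base) for base in oligo]
    return ["".join(variant) for variant in product(*choices)]

print(expand_degenerate("CWG"))  # ['CAG', 'CTG']
```

An oligo ordered with W is a mix of both variants, so it can pair with either overhang. An oligo ordered with a fixed A or T is only one of the two, which is why fixing the base would be expected to reduce the number of fragments in the library.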
Q: Can greater than 96 multiplexing be done? I've come across the idea of (but have not seen anything published with) using the 12 Illumina barcodes along with 96 custom barcodes in combination for 1000+ barcodes total. Is this something which would be possible with GBS in its current form? Or do you do higher levels of multiplexing simply with more barcodes?
A: With some careful thought, it would be possible to design a 2nd set of bar codes in the Illumina style to achieve what you propose. The main concern is that the Illumina bar coding method uses a 2nd read to get the bar code. The 2nd read (this is also true of paired end runs) loses up to 30% of the data because there are inefficiencies in the cluster regeneration process.
If I were going to need to multiplex many more samples and planned to process tens of thousands of samples, I would just design more bar codes.
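The combinatorial scheme raised in the question is a simple cross product of the two bar code sets. The sketch below uses placeholder names rather than real sequences and only illustrates the count; it does not model the data lost in the 2nd read.

```python
from itertools import product

def combine_barcodes(index_barcodes, inline_barcodes):
    """Pair every index bar code with every in-line bar code."""
    return list(product(index_barcodes, inline_barcodes))

# Placeholder names, not real bar code sequences.
index_set = [f"index_{i:02d}" for i in range(1, 13)]    # 12 Illumina-style
inline_set = [f"inline_{i:02d}" for i in range(1, 97)]  # 96 in-line

pairs = combine_barcodes(index_set, inline_set)
print(len(pairs))  # 12 * 96 = 1152 sample identifiers
```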
Q: Do I need to order a special barcode adapter and both common adapters for both of the ends of the read I'm sequencing or just one end of the read since this will be paired end reads?
A: The adapters that we designed are compatible with both paired end and single end reads. You must have both bar coded and common adapters for clusters to form properly on the flow cell. The second strand sequence must be complementary to the first strand so that they anneal.
Q: How does GBS differ from RADseq?
A: GBS differs from RADseq in some substantial ways. For a good comparison of these types of methods, see Davey et al. here:
http://www.nature.com/nrg/journal/v12/n7/full/nrg3012.html