In order to obtain good sequencing results, you MUST examine your sequencing chromatogram. This page explains how to interpret a DNA sequencing chromatogram. If you are using just the text data, you could be publishing data that is completely invalid!

Interpretation of Sequencing Chromatograms

Automated DNA sequencers generate a four-color chromatogram showing the results of the sequencing run, as well as a computer program's best guess at interpreting that data - a text file of sequence data. That computer program, however, does make mistakes, and it is the client's responsibility to manually double-check the interpretation of the primary data. Predictable errors occur near the beginning and again at the end of any sequencing run. Other errors can show up in the middle, invalidating individual base calls or entire swaths of data.

This document explains how to examine the normal DNA sequencing chromatogram, describing common issues and how to interpret them.

STEP I - Get a General Sense of How Clean the Sequence Is

How clear are the nucleotide peaks, in general?

You should see evenly-spaced peaks, each with only one color. Peak heights may vary 3-fold, which is normal. 'Noise' (baseline) peaks may be present, but with good template and primer they will be quite minimal.

Here's an example of an excellent sequence: note the evenly-spaced peaks and the lack of baseline 'noise'.

The example below has a little baseline noise, but the 'real' peaks are still easy to call, so there's no problem with this sample:

Now we have an example that has too much baseline noise. Note the multicolored peaks at 271, 273, 279, and the oddly-spaced interstitial peaks near 291 & 301. Also, it is impossible to determine what the real nucleotide is at 310.

STEP II - Check for Mis-Called Nucleotides

Are there obvious errors in the basecalling?

Sometimes the computer will mis-call a nucleotide when a human would have identified a different nucleotide. Most often, this occurs when the basecaller calls a specific nucleotide when the peak really was ambiguous and should have been called as 'N'. Occasionally, the computer will call an 'N' when a human would be confident in making a more specific basecall. Such mis-calls can occur even in the most error-free regions of the gel. Quickly scan the gel for extremely small peaks, 'N' calls, and any mis-spaced peaks or nucleotides.

One good way to detect artifacts or errors in a sequencing chromatogram is to scan through it, looking for mis-spaced peaks. At the same time, watch for mis-spaced letters in the text sequence along the top. Nucleotides that have been erroneously inserted into a sequence will often appear to be oddly spaced relative to their neighboring bases, often too close.

Some sequencers have predictable errors in base spacing. A common one is a G-A dinucleotide, which leaves a little extra space between them. Often, it is ignored by the basecaller, as in this example at right:

**Note the extra space between the letters G and A (nt 271 and 272) corresponding to the mis-spaced peaks just below them. No harm done in this case; the sequence is fine.

A single peak position within a trace may have two peaks of different colors instead of just one. This is common when sequencing a PCR product derived from diploid genomic DNA, where polymorphic positions will show both nucleotides simultaneously.

**Note that the basecaller may list that base position as an 'N', or it may simply call the larger of the two peaks. Realize, too, that it's easy for a human to miss these. If you want to be sure you've detected all of the polymorphic positions, you should be using a computer program to scan your chromatograms!

Here is a great example of a PCR amplicon from genomic DNA, with a clear heterozygous single-nucleotide polymorphism (SNP). In this case, one allele carries a C, while the other has a T. Both peaks are present, but at roughly half the height they would show if they were homozygous.

**Note that the peak was called an 'N' by the basecaller. A comparison of text sequences would probably notify you of the presence of a SNP at this particular location.

STEP III - Loss of Resolution Later in the Gel

Even normal chromatograms stop providing accurate data after some distance along the sequence:

As the gel progresses, it loses resolution. This is normal; peaks broaden and shift, making it harder to make them out and call the bases accurately. The sequencer will continue attempting to "read" this data, but errors become more and more frequent.

Below are three snapshots representing data from progressively later regions in a normal chromatogram:

This is a typical example of data from a very good sample analyzed by an ABI3130xl DNA Analyzer. In this case, it is pGEM3 DNA sequenced with the T7 primer.
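
The advice to let a computer program scan for 'N' calls and for positions showing two peaks at once can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the function names, the toy data, the idea that per-base peak heights for the four channels have already been extracted from the chromatogram file, and the 0.3 ratio threshold, which is arbitrary and would need tuning on real traces. It is not the method used by any particular basecaller.

```python
from typing import Dict, List

BASES = "ACGT"


def find_n_calls(called: str) -> List[int]:
    """Return 1-based positions where the basecaller emitted 'N'."""
    return [i for i, base in enumerate(called.upper(), start=1) if base == "N"]


def find_double_peaks(peaks: List[Dict[str, float]],
                      min_ratio: float = 0.3) -> List[int]:
    """Return 1-based positions whose second-tallest channel is at least
    min_ratio times the tallest one.

    Such positions may be heterozygous, or may simply be noisy; they are
    candidates for manual inspection, not confirmed polymorphisms.
    """
    flagged = []
    for i, heights in enumerate(peaks, start=1):
        # Rank the four channel heights from tallest to shortest.
        ranked = sorted((heights.get(b, 0.0) for b in BASES), reverse=True)
        top, second = ranked[0], ranked[1]
        if top > 0 and second >= min_ratio * top:
            flagged.append(i)
    return flagged


# Hypothetical toy data: four called bases and their channel heights.
called = "GANT"
peaks = [
    {"A": 20, "C": 10, "G": 900, "T": 15},
    {"A": 850, "C": 12, "G": 30, "T": 8},
    {"A": 10, "C": 420, "G": 15, "T": 450},  # C and T both present
    {"A": 5, "C": 20, "G": 10, "T": 880},
]

print(find_n_calls(called))      # [3]
print(find_double_peaks(peaks))  # [3]
```

In this toy case both checks flag position 3, matching the situation described above where a C/T heterozygous position is called as 'N'. A ratio test also catches positions where the basecaller simply called the larger of the two peaks, which a search for 'N' alone would miss.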