SARS-CoV-2 Lineage Variant Summary

Primers that do not overlap with variants associated with these lineages are shown in blue below a schematic SARS-CoV-2 genome. Primers that overlap enough sufficiently frequent variants to be potentially impacted (see below) are shown in orange. Relevant variants are shown in gray directly below the genome, above the primer sets.

The score listed for a variant is the percentage of sequences in the selected lineage set in which it occurs. The score listed for a primer is determined by the scoring algorithm described below.

Calculating variant impact:

Detailed methods for initial processing and analysis of sequences and primer sets are found in the Nextflow pipelines in our GitHub repository.


To distinguish primers that are likely to be genuinely impacted from those affected only by a rare variant or a sequencing error, impact is determined by both the number of times each variant appears in the data and the number, length, and position of the variants overlapping a primer. To exclude sequencing errors and other low-frequency variants that are unlikely to have a significant impact on primers, a variant is considered in the calculation only if it occurs in at least 1% of sequences (and in no fewer than 10 sequences) in the dataset after filtering by lineage. Primers are then scored based on the length and position of overlapping variants and labeled as affected if the resulting score is sufficiently high. To focus on recent mutations in SARS-CoV-2, only variants from the last 180 days are considered.
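
The frequency filter described above can be sketched roughly as follows. This is a minimal illustration, not the code from the Nextflow pipelines: the Observation record, the function name frequent_variants, and its parameter names are hypothetical. It also assumes that the caller supplies total_sequences as the number of lineage-filtered sequences in the same 180-day window, which the text above does not spell out.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Observation:
    """One variant call in one sequence (hypothetical record layout)."""
    variant: str       # variant label, e.g. "C241T"
    sequence_id: str   # identifier of the sequence carrying the variant
    collected: date    # collection date of that sequence


def frequent_variants(observations, total_sequences, today,
                      min_fraction=0.01, min_count=10, window_days=180):
    """Return {variant: percent of sequences} for variants passing the filter.

    A variant is kept only if, within the last `window_days`, it occurs in
    at least `min_count` distinct sequences AND in at least `min_fraction`
    of `total_sequences` (assumed to be counted after lineage filtering).
    """
    if total_sequences <= 0:
        return {}
    cutoff = today - timedelta(days=window_days)

    # Collect the distinct sequences that carry each variant in the window.
    sequences_by_variant = {}
    for obs in observations:
        if obs.collected < cutoff:
            continue
        sequences_by_variant.setdefault(obs.variant, set()).add(obs.sequence_id)

    kept = {}
    for variant, sequence_ids in sequences_by_variant.items():
        count = len(sequence_ids)
        if count >= min_count and count / total_sequences >= min_fraction:
            kept[variant] = 100.0 * count / total_sequences
    return kept
```

The returned percentages correspond to the variant score described earlier (the percentage of sequences in which a variant occurs). Requiring both thresholds means that in a small dataset a variant seen in fewer than 10 sequences is dropped even if it exceeds 1%.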


The composite score for a primer is the sum of the scores of all variants that overlap it. Each variant score depends on the variant's length, whether it is an indel or a mismatch, and its position in the primer. Because longer variants are more likely to disrupt primer binding, an exponentially increasing length penalty is added. To account for variant position and indels, a flat multiplier is applied to the score for the relevant bases. Indels and variants 3-5 nt in from (but not at) the 3' end of a primer have the highest multipliers, as these traits are empirically associated with primer disruption. Variants ≤3 nt from the 5' end of the primer, or ≤2 nt from the 3' end, have multipliers that reduce the score relative to the default used for any other position, because we consider variants at these positions less likely to affect polymerase binding. A 5' variant creates a flap at the 5' end, which interferes less with amplification. For variants at the 3' end, we assume that the 3'-5' exonuclease activity of commonly used polymerases can remove mismatched bases. Although we attempt to model the observed effects of variants, the model is likely imperfect. Proposed variant impacts should be assessed experimentally, as we did for SARS-CoV-2 with early Omicron and BQ.1-associated variation, and more generally for LAMP applications.
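
To make the structure of the scoring concrete, here is a rough sketch of one way such a scheme could be written. It is illustrative only. Every numeric value (the multipliers, the base of the length penalty, the "affected" threshold) is a placeholder chosen for the example, since the text above does not give the actual values, and all names are hypothetical. The sketch also assumes that distance 1 means the terminal base of the primer and that every base of an indel receives the highest multiplier regardless of position.

```python
from dataclasses import dataclass

# Placeholder values, NOT taken from the described method.
DEFAULT_MULT = 1.0        # any position not covered by a special rule
HIGH_MULT = 2.0           # indels, and bases 3-5 nt in from the 3' end
LOW_MULT = 0.5            # bases <=3 nt from the 5' end or <=2 nt from the 3' end
LENGTH_BASE = 2.0         # base of the exponential length penalty
AFFECTED_THRESHOLD = 3.0  # composite score at which a primer is flagged


@dataclass(frozen=True)
class Variant:
    start: int      # 0-based offset of the first overlapped base from the 5' end
    length: int     # number of primer bases the variant overlaps
    is_indel: bool  # True for insertions/deletions, False for mismatches


def position_multiplier(offset, primer_len):
    """Multiplier for one mismatched base at `offset` from the 5' end."""
    dist5 = offset + 1           # 1 = 5'-terminal base (assumed convention)
    dist3 = primer_len - offset  # 1 = 3'-terminal base (assumed convention)
    if dist5 <= 3 or dist3 <= 2:
        return LOW_MULT
    if 3 <= dist3 <= 5:
        return HIGH_MULT
    return DEFAULT_MULT


def variant_score(variant, primer_len):
    """Per-base multipliers summed, plus an exponential length penalty."""
    per_base = 0.0
    for offset in range(variant.start, variant.start + variant.length):
        if variant.is_indel:
            per_base += HIGH_MULT
        else:
            per_base += position_multiplier(offset, primer_len)
    # The penalty is zero for a single base and grows exponentially with length.
    length_penalty = LENGTH_BASE ** (variant.length - 1) - 1.0
    return per_base + length_penalty


def primer_score(variants, primer_len):
    """Composite score: the sum of all overlapping variant scores."""
    return sum(variant_score(v, primer_len) for v in variants)


def is_affected(variants, primer_len, threshold=AFFECTED_THRESHOLD):
    """Flag a primer whose composite score reaches the (placeholder) threshold."""
    return primer_score(variants, primer_len) >= threshold
```

Under these placeholder values, a single mismatch at the 3'-terminal base of a 20 nt primer scores below a mismatch in the middle of the primer, while a 3 nt indel in the middle scores well above either. That ordering follows the description above, but the magnitudes here are arbitrary.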