3.2 ASFV genome characterization from the Dominican Republic 2021 outbreak
As of November 2021, 73 ASFV samples from the 2021 outbreak in the DR have been characterized by whole genome sequencing on the GridION and PromethION platforms (Oxford Nanopore Technologies) (Supplementary File 1). These 73 samples were obtained from 18 provinces in the DR (Figure 1). All 73 samples contained ASFV sequences that were classified as genotype II on the basis of p72 and appeared to share a high degree of genetic similarity with European strains isolated from 2016 to 2018 (Figure 2). All sequences from the DR contained four single nucleotide polymorphisms (SNPs) relative to the Georgia 2007/1 genome (NC_044959.2) at positions NC_044959.2:7059 (C->T, MGF 110-1L Trp197Leu), NC_044959.2:44576 (A->G, MGF 505-9R Lys323Glu), NC_044959.2:134514 (T->C, NP419L Asn414Thr), and NC_044959.2:170862 (T->A, I267L Ile195Ile) (Supplementary File 2). Additionally, sequences from the DR did not contain the SNP at position NC_044959.2:26425 (T->C, MGF 360-10L Asn329Thr) that is characteristic of Asian ASFV genomes.
Although the DR sequences were most closely related by sequence identity to European ASFV sequences, they appeared to have diverged from the publicly described European ASFV genomes. The DR sequences did not contain any of the additional SNPs described in European ASFV sequences since 2016 (other than the four indicated above), and they contained at least 8 distinct SNPs that were not found in any publicly available ASFV sequence (Supplementary File 2). Additionally, the 73 sequences from the DR clustered into two genetically distinct groups that were differentiated by a SNP at position NC_044959.2:90280 (G->A, C962R Glu73Glu) (Supplementary File 3). We refer to these two groups as genetic cluster 1, containing the variant allele at NC_044959.2:90280, and genetic cluster 2, containing the reference allele at NC_044959.2:90280.
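The cluster definition above reduces to a rule on a single reference position. As a minimal sketch only (the sample names and the per-sample dictionary of base calls are hypothetical and do not represent the analysis pipeline used in this work), the assignment could be expressed as:

```python
# Illustrative sketch: data layout and sample names are assumptions.
CLUSTER_POS = 90280   # NC_044959.2 coordinate of the cluster-defining SNP
REF_ALLELE = "G"      # reference allele (Georgia 2007/1)
VAR_ALLELE = "A"      # variant allele (C962R Glu73Glu, synonymous)


def assign_cluster(calls):
    """Return 1 or 2 from a dict of {1-based reference position: called base}.

    Returns None when the position has no call or an unexpected base.
    """
    base = calls.get(CLUSTER_POS)
    if base == VAR_ALLELE:
        return 1  # genetic cluster 1 carries the variant allele
    if base == REF_ALLELE:
        return 2  # genetic cluster 2 carries the reference allele
    return None


# Hypothetical samples, for illustration only
samples = {
    "sample_A": {90280: "A"},
    "sample_B": {90280: "G"},
    "sample_C": {},  # position not covered
}
clusters = {name: assign_cluster(calls) for name, calls in samples.items()}
```

Returning None for uncovered positions keeps low-coverage samples from being assigned to cluster 2 by default, since cluster 2 is defined by the reference allele.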
Genetic cluster 1 was the larger cluster (68/73 samples), spanning 17 of the 18 provinces with sequenced samples from May 14 to October 4, 2021 (Figure 1). Genetic cluster 2 (5/73 samples) was characterized only in Santiago and Elías Piña provinces early in the outbreak, from May 13 to July 20, and had not since been detected in the DR by USDA NVSL FADDL. The two genetic clusters differed by at least 5 SNPs but shared the common genetic backbone present in all sequences obtained from the DR. This suggests that the two clusters represent distinct lineages descended from a common ancestor. By the time the first genomes from the DR were sequenced, dating to May 13 and May 14, 2021, cluster 1 had accumulated 1 SNP and cluster 2 had accumulated 4 SNPs relative to the putative common ancestral genome. The putative common ancestor was not found among the samples sequenced from the outbreak.
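The reasoning behind the putative common ancestor can be illustrated with set operations on variant positions. The sketch below is an assumption-laden illustration: the backbone is limited to the four positions listed in this section, and the four cluster 2 SNPs are represented by placeholder labels because their positions are not given here.

```python
# Illustrative sketch, not the authors' analysis.
# Backbone SNP positions named in the text (NC_044959.2 coordinates)
backbone = {7059, 44576, 134514, 170862}

# Cluster 1 carries the backbone plus the variant allele at 90280
cluster1 = backbone | {90280}

# Placeholder labels: the cluster 2 SNP positions are not listed in this section
cluster2 = backbone | {"c2_snp_a", "c2_snp_b", "c2_snp_c", "c2_snp_d"}

# Variants shared by both clusters approximate the common ancestral genome
ancestor = cluster1 & cluster2

# Variants private to each cluster arose after divergence from that ancestor
private1 = cluster1 - ancestor
private2 = cluster2 - ancestor

# Pairwise SNP distance between clusters is the symmetric difference
distance = len(cluster1 ^ cluster2)
```

Under these assumptions the distance is the sum of the private SNP counts (1 + 4 = 5), which matches the statement that the clusters differ by at least 5 SNPs.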
To verify the accuracy of the variant calls from Nanopore sequencing data, a subset of 14 samples from genetic cluster 1 was additionally sequenced on an Illumina MiSeq platform (Supplementary File 1). All 11 SNPs defining the mutational backbone of the DR sequences, plus the defining SNP for genetic cluster 1, were called identically on the Oxford Nanopore and Illumina platforms. For the samples characterized by both platforms, there were no discrepancies in any SNP calls, and Illumina sequencing did not identify additional SNPs beyond those identified in the Nanopore sequencing data. Based on these results, the variant calls seen in the DR were considered verified by two independent sequencing platforms, and all characterization of additional samples was performed using the Oxford Nanopore GridION and PromethION platforms.
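A cross-platform concordance check of this kind can be sketched as a comparison of two call sets. The example below is illustrative only: the function name and the (position, reference, alternate) tuple layout are assumptions, and the calls shown are the five SNPs whose positions are given in this section rather than the full call set of any sample.

```python
# Illustrative sketch: function name and data layout are assumptions.
def compare_calls(nanopore, illumina):
    """Compare two iterables of (position, ref, alt) SNP calls."""
    nanopore, illumina = set(nanopore), set(illumina)
    return {
        "shared": sorted(nanopore & illumina),
        "nanopore_only": sorted(nanopore - illumina),
        "illumina_only": sorted(illumina - nanopore),
    }


# SNPs named in the text; their pairing with one sample is hypothetical
nanopore_calls = [
    (7059, "C", "T"),
    (44576, "A", "G"),
    (90280, "G", "A"),
    (134514, "T", "C"),
    (170862, "T", "A"),
]
illumina_calls = list(nanopore_calls)

result = compare_calls(nanopore_calls, illumina_calls)

# Platforms agree when neither reports a call the other lacks
concordant = not result["nanopore_only"] and not result["illumina_only"]
```

Comparing full (position, ref, alt) tuples rather than positions alone means that a call at the same site with a different alternate base is counted as a discrepancy.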
Additionally, within the subset of genetic cluster 1 samples characterized by high-accuracy Illumina data, several insertions and deletions were identified in one or more ASFV genomes from the DR (Table 1). Sequence data from this work have been deposited in the NCBI SRA database under BioProject accession PRJNA768333.