This string of 30,000 letters (the As, Ts, Cs and Gs of the genetic code) marked day one in the race to understand the genetics of this newly discovered coronavirus.
Now, a further 100,000 coronavirus genomes sampled from COVID-19 patients in over 100 countries have joined Wuhan-1. Geneticists around the world are mining the data for answers.
Where did SARS-CoV-2 come from? When did it start infecting humans? How is the virus mutating – and does it matter? SARS-CoV-2 genomics, much like the virus itself, went big and went global.
The term 'mutation' tends to conjure up images of dangerous new viruses with enhanced abilities sweeping across the planet.
And while mutations constantly emerge and sometimes sweep – early mutations in SARS-CoV-2 have made their way around the world as the virus spread almost unnoticed – mutations are a perfectly natural part of any organism, including viruses. The vast majority have no impact on a virus's ability to transmit or cause disease.
A mutation just means a difference: a letter change in the genome. While the SARS-CoV-2 population was genetically almost invariant when it jumped into its first human host in late 2019, over 13,000 of these changes are now found in the 100,000 SARS-CoV-2 genomes sequenced to date.
Yet any two viruses from any two patients anywhere in the world differ on average by only ten letters. This is a tiny fraction of the total 30,000 characters in the virus's genetic code and means that all SARS-CoV-2 in circulation can be considered part of a single clonal lineage.
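To make the arithmetic concrete, here is a minimal Python sketch of how an average pairwise difference can be counted and what ten letters out of 30,000 amounts to. The three short sequences are invented for illustration and are not real viral data; the function names are likewise hypothetical, and real analyses work on full, properly aligned genomes.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count the positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_difference(seqs: list[str]) -> float:
    """Average number of differing letters over every pair of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


# Toy aligned sequences (made up for illustration, not real genomes).
toy = ["ATCGATCGAT", "ATCGATCGAA", "ATCGTTCGAT"]
mean_diff = mean_pairwise_difference(toy)

# Figures quoted in the article.
GENOME_LENGTH = 30_000
AVERAGE_DIFFERENCES = 10
fraction = AVERAGE_DIFFERENCES / GENOME_LENGTH

print(f"toy mean pairwise difference: {mean_diff:.2f} letters")
print(f"ten letters out of 30,000: {fraction:.4%} of the genome")
```

On the article's figures, two viruses picked at random are identical at roughly 2,999 out of every 3,000 positions.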
It will take some time for the virus to acquire substantial genetic diversity. SARS-CoV-2 mutates fairly slowly for a virus, with any lineage acquiring a couple of changes every month, two- to six-fold fewer than influenza viruses acquire over the same period.
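As a rough back-of-the-envelope check, the rate can be turned into yearly and per-letter figures. The sketch below assumes that "a couple of changes" means exactly two per month, which is a reading of the text rather than a measured value; the influenza range simply applies the two- to six-fold factor to that assumption.

```python
GENOME_LENGTH = 30_000   # letters in the genome, as quoted in the article
CHANGES_PER_MONTH = 2    # assumption: "a couple of changes" read as two

# Changes a single lineage would accumulate over a year at this pace.
changes_per_year = CHANGES_PER_MONTH * 12

# The same rate expressed per letter of the genome per year.
per_site_per_year = changes_per_year / GENOME_LENGTH

# Implied monthly range for influenza, two- to six-fold higher.
influenza_range = (2 * CHANGES_PER_MONTH, 6 * CHANGES_PER_MONTH)

print(f"changes per lineage per year: {changes_per_year}")
print(f"changes per letter per year: {per_site_per_year:.1e}")
print(f"implied influenza changes per month: {influenza_range[0]} to {influenza_range[1]}")
```

Under that assumption, a lineage picks up about 24 changes a year, or a little under one change per thousand letters.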
Still, mutations are the bedrock on which natural selection can act. Most commonly mutations will render a virus non-functional or have no effect whatsoever. Yet the potential for mutations to affect transmissibility of SARS-CoV-2 in its new human hosts exists.
As a result, there have been intense efforts to determine which, if any, of the mutations identifiable since the first SARS-CoV-2 genome was sequenced in Wuhan may significantly alter viral function.
An infamous mutation in this context is an amino acid change in the SARS-CoV-2 spike protein, the protein that gives coronaviruses their characteristic crown-like projections and allows them to attach to host cells.
This single character change in the viral genome – termed D614G – has been shown to increase virus infectivity in cells grown in the lab, though with no measurable impact on disease severity. This mutation is also found almost systematically alongside three other mutations, and all four are now found in about 80 percent of sequenced SARS-CoV-2 genomes, making them the most frequent set of mutations in circulation.
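The name D614G follows the usual shorthand for amino acid changes: the original amino acid (D, aspartic acid), its position in the protein (614), and the replacement (G, glycine). The Python sketch below unpacks that label and shows how a frequency such as "80 percent" would be counted. The helper names and the five toy genomes are hypothetical, chosen only to illustrate the idea.

```python
import re
from typing import NamedTuple


class AminoAcidChange(NamedTuple):
    original: str      # one-letter code of the original amino acid
    position: int      # position within the protein
    replacement: str   # one-letter code of the new amino acid


# One capital letter, a run of digits, one capital letter, e.g. "D614G".
_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_change(label: str) -> AminoAcidChange:
    """Split a label such as 'D614G' into its three parts."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a valid amino acid change label: {label!r}")
    return AminoAcidChange(match.group(1), int(match.group(2)), match.group(3))


def frequency(changes_per_genome: list[set[str]], label: str) -> float:
    """Fraction of genomes whose recorded changes include the given label."""
    carriers = sum(label in changes for changes in changes_per_genome)
    return carriers / len(changes_per_genome)


change = parse_change("D614G")

# Five made-up genomes, four of which carry the change.
toy_genomes = [{"D614G"}, {"D614G"}, set(), {"D614G"}, {"D614G"}]
share = frequency(toy_genomes, "D614G")

print(change)
print(f"share of toy genomes carrying D614G: {share:.0%}")
```

A count like this only says how common a mutation is; it cannot say why it became common.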
The challenge with D614G, as with other mutations, is disentangling whether they have risen in frequency because they happened to be present in viruses responsible for seeding early successful outbreaks, or whether they truly confer an advantage to their carriers.
Simply carried along
D614G is not the only mutation found at high frequency. A string of three mutations in the protein shell of SARS-CoV-2 is also increasingly appearing in sequencing data and is now found in a third of viruses.
A single change at position 57 of the Orf3a protein, a known immunogenic region, occurs in a quarter of viruses.
Other mutations exist in the spike protein, while myriad others seem to be induced by the activity of our own immune response. At the same time, there remains no consensus that these, or any others, are significantly changing virus transmissibility or virulence.
Most mutations are simply carried along as SARS-CoV-2 continues to successfully spread.
But replacements are not the only small edits that may affect SARS-CoV-2. Deletions in the SARS-CoV-2 accessory genes Orf7b/Orf8 have been shown to reduce the virulence of SARS-CoV-2, potentially eliciting milder infections in patients.
A similar deletion may have behaved in the same way in SARS-CoV-1, the related coronavirus responsible for the SARS outbreak in 2002-04. Progression towards a less virulent SARS-CoV-2 would be welcome news, though deletions in Orf8 have been present from the early days of the pandemic and do not seem to be increasing in frequency.
While adaptive changes may yet occur, all the available data at this stage suggest we are facing the same virus as at the start of the pandemic.
Chris Whitty, chief medical officer for England, was right to pour cold water on the idea that the virus has mutated into something milder than the one that caused the UK to impose a lockdown in March.
Possible decreases in symptom severity seen over the summer are probably a result of younger people being infected, containment measures (such as social distancing) and improved treatment rather than changes in the virus itself.
However, while SARS-CoV-2 has not significantly changed to date, we continue to expand our tools to track and trace its evolution, ready to keep pace.