Scientists have pinpointed precise regions in the human genome where DNA is most likely to develop a mutation.
At spots where RNA polymerase 'opens' your DNA to read and copy instructions – known as transcription start sites – your genome is especially vulnerable to damage and the occasional imperfect repair that can lead to permanent changes. Scientists call these locations 'mutation hotspots', and they may be crucial for understanding genetic disease.
"These sequences are extremely prone to mutations and rank among the most functionally important regions in the entire human genome, together with protein-coding sequences," says geneticist Donate Weghorn of the Centre for Genomic Regulation in Spain.
Genetic mutation often occurs when damaged DNA is unable to repair itself correctly, resulting in a small but irreversible change to the genome.
Most mutations are benign and have no effect on health or development. Rarely, mutations can be beneficial. Advantageous mutations drive evolution and life's ability to adapt to changing circumstances.

But harmful mutations can cause real problems – and even be passed on to offspring.
An estimated 300 million people around the world are affected by rare genetic disorders. Understanding the human genome's susceptibility to mutation is vital for developing accurate models for studying these disorders.
Genome damage increases dramatically during a process called transcription, where your DNA is copied into messenger molecules called RNA.
Think of your genome as a cookbook, and a gene as a recipe in that cookbook. The RNA polymerase opens the cookbook to copy a recipe onto a post-it note – the RNA – before the book snaps shut. Damage can occur as a result of these actions, up to hundreds of thousands of times per cell per day.
Weghorn and her colleagues wanted to know if this extra wear and tear at the places where transcription begins leads to a heightened rate of imperfect repair – the kind of repair that leaves permanent genetic mutations behind.
To test this idea, the researchers dug into enormous human genome datasets, tracking extremely rare variants (ERVs) across nearly 15,000 genes in more than 220,000 individuals. These are heritable mutations that have persisted for several generations.
They also examined data from 10 "trio" studies. These are studies in which the genomes of a father, mother, and their shared child are sequenced to identify mutations that the child didn't inherit, called de novo mutations (DNMs). In other words, the mutation occurred randomly in the sperm, egg or after they fused.
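The trio logic can be illustrated with a minimal sketch. It assumes each genome has already been reduced to a set of variants; the variant tuples and the function name `de_novo_candidates` are made up for illustration and are not taken from the study's pipeline.

```python
from typing import Set, Tuple

# Illustrative variant key: (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def de_novo_candidates(
    child: Set[Variant], mother: Set[Variant], father: Set[Variant]
) -> Set[Variant]:
    """Return variants present in the child but absent from both parents."""
    return child - mother - father


# Toy data, invented for this example.
mother = {("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T")}
father = {("chr1", 1000, "A", "G"), ("chr3", 42, "G", "A")}
child = {
    ("chr1", 1000, "A", "G"),  # inherited (both parents carry it)
    ("chr3", 42, "G", "A"),    # inherited from the father
    ("chr7", 77, "T", "C"),    # seen in neither parent: a de novo candidate
}

candidates = de_novo_candidates(child, mother, father)
print(candidates)
```

Real pipelines add many quality checks on top of this set difference, since a sequencing error in the child, or a missed call in a parent, would otherwise look like a new mutation.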
In the ERV data, the researchers found a very strong and consistent mutation hotspot around transcription start sites. To refer back to the cookbook analogy, it's as though the chef tore the page when opening it, or dropped sauce on the page, and repairing that damage obscured or altered part of the ingredient list.
However, in the DNM studies, that hotspot mysteriously vanished. If it really existed, it should have appeared in new mutations as well as inherited ones.
The answer came from 11 previously published studies on mosaic mutations, which arise during the first rounds of cell division after fertilization and so end up in only some of the body's cells. They occur in every human; we all carry at least one cell with a mosaic mutation.
When the team looked at the mosaic data, the missing hotspot reappeared – in exactly the same place as the hotspot in the ERV analysis.
Early embryonic mutations were clustering at transcription start sites, but because mosaic mutations are patchy, they can look like sequencing noise, and many DNM study pipelines automatically exclude them.
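A rough sketch shows how such an exclusion might happen. It assumes a pipeline that filters on the fraction of sequencing reads supporting a variant: an inherited variant on one chromosome copy is expected in roughly half the reads, while a mosaic variant appears in fewer. The cutoff value, the read counts, and the names below are hypothetical, not taken from any specific study.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Call:
    name: str
    alt_reads: int    # reads supporting the variant
    total_reads: int  # all reads covering the site

    @property
    def allele_fraction(self) -> float:
        return self.alt_reads / self.total_reads


# Hypothetical cutoff; real pipelines choose their own thresholds.
MIN_ALLELE_FRACTION = 0.3


def split_calls(
    calls: List[Call], min_af: float = MIN_ALLELE_FRACTION
) -> Tuple[List[Call], List[Call]]:
    """Separate calls into those kept and those discarded by the cutoff."""
    kept = [c for c in calls if c.allele_fraction >= min_af]
    discarded = [c for c in calls if c.allele_fraction < min_af]
    return kept, discarded


# Toy read counts, invented for this example.
calls = [
    Call("germline_like", 24, 50),  # about half the reads
    Call("mosaic_like", 6, 50),     # present in only some cells
    Call("noise_like", 1, 50),      # probably a sequencing error
]

kept, discarded = split_calls(calls)
print([c.name for c in kept])
print([c.name for c in discarded])
```

With a single cutoff like this, the mosaic-like call lands in the same discard pile as the likely sequencing error, which is the blind spot in question.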
"There is a blind spot in these studies," Weghorn says.
"To get around this, one could look at the co-occurrence patterns of mutations to help detect the presence of mosaic mutations. Or look at the data again and revisit discarded mutations that occur near the transcription starts of genes most strongly affected by the hotspot."
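The second suggestion, revisiting discarded mutations near transcription starts, might look something like the sketch below. It assumes positions on a single chromosome, a sorted list of transcription start sites, and a hypothetical 100-base window; none of these values come from the study.

```python
from bisect import bisect_left
from typing import List


def near_tss(position: int, tss_sorted: List[int], window: int = 100) -> bool:
    """True if position lies within `window` bases of any transcription start site.

    `tss_sorted` must be sorted in ascending order. The window size is an
    illustrative assumption.
    """
    i = bisect_left(tss_sorted, position)
    # Only the nearest start site on each side needs checking.
    neighbours = tss_sorted[max(i - 1, 0): i + 1]
    return any(abs(position - t) <= window for t in neighbours)


# Toy coordinates, invented for this example.
tss_positions = [1000, 5000]
discarded_positions = [1050, 3000, 4950, 6000]

to_revisit = [p for p in discarded_positions if near_tss(p, tss_positions)]
print(to_revisit)
```

Calls flagged this way would still need to be re-examined individually; proximity to a start site alone does not show that a discarded call is a real mosaic mutation.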
By looking at all three datasets together, the researchers were able to unravel the mechanisms behind the vulnerability in transcription start sites.
The beginning of a gene is a busy, fragile, and complex place where the RNA polymerase frequently pauses to briefly unravel the DNA. This machinery can misfire or leave the DNA exposed just a little too long, leading to damage that scars rather than heals cleanly.
It's a missing piece of the puzzle about where DNA mutations come from – and one that may improve studies of genetic conditions relying on de novo mutation data.
The research has been published in Nature Communications.
