A Staggering 21TB of Source Code Was Just Buried in The Arctic For an Unknown Future

22 JULY 2020

If doomsday comes, know this: precautions have been taken. On an isolated Arctic archipelago, the Svalbard Global Seed Vault – aka Norway's 'Doomsday Vault' – holds over 1 million seed samples in a fortress-like bunker designed to be the most invulnerable seed bank in the world.

Svalbard protects more than just seeds, though. On the same remote mountain, an abandoned coal mine now exists as another vital safe-house: the Arctic World Archive, preserving the world's data of today for an uncertain tomorrow. And the facility just received a contribution that's truly mind-boggling in scope.

GitHub, often billed as the world's largest host of open source code, has successfully transported all of its active public code repositories (as of February 2020) to the Arctic World Archive, as part of the company's ongoing efforts to establish the GitHub Arctic Code Vault.

"Our mission is to preserve open source software for future generations by storing your code in an archive built to last a thousand years," Julia Metcalf, GitHub's director of strategic programs, explains on the company's blog.

The project, first announced last year, already saw one shipment to Svalbard in late 2019, with a deposit of 6,000 of the platform's most significant repositories of open source code.

The new shipment, painstakingly managed during the shutdowns and border closures of the coronavirus pandemic, goes even further, preserving a massive haul amounting to 21 terabytes of data, written onto 186 reels of a digital archival film called piqlFilm.

This purpose-built media – designed to last for 500 years, with simulations suggesting it should last twice as long – is now stored 250 metres deep, in a steel-walled container inside a sealed chamber in the Arctic World Archive.

The film, composed of silver halides on polyester, looks like a miniaturised print of QR codes, except every frame squeezes in some 8.8 million microscopic pixels, and each reel runs for roughly 1 kilometre (about 3,500 ft), such is the gargantuan size of the data being stored.

"It can withstand extreme electromagnetic exposure and has undergone extensive longevity and accessibility testing," the piql company claims.

It's hoped that this extremely long-life media – in conjunction with the Archive's natural isolation and engineered security – will give the world's open source software the best chance of seeing a distant future where it may one day be needed by upcoming generations.

"It is easy to envision a future in which today's software is seen as a quaint and long-forgotten irrelevancy, until an unexpected need for it arises," the GitHub Archive website explains.

"Like any backup, the GitHub Archive Program is also intended for currently unforeseeable futures as well."

In those unforeseeable futures, it's hard to know exactly what future humans will make of the archive's coded contents, or how they might be able to access and use them.

For that reason, the vault will also contain a separate, human-readable reel, called the Tech Tree, explaining the technical history and cultural context of the archive's contents.

The Tech Tree won't just throw future humans into the world of 21st century open source code; it will also serve as a primer on what these programs are and what kind of technology they run on.

"It will also include works which explain the many layers of technical foundations that make software possible: microprocessors, networking, electronics, semiconductors, and even pre-industrial technologies," Metcalf explains.

"This will allow the archive's inheritors to better understand today's world and its technologies, and may even help them recreate computers to use the archived software."