After a two-day outage following a cyberattack that temporarily disabled the digital library and Wayback Machine last week, the nonprofit Internet Archive has relaunched in a ‘purely read-only’ state, serving as a hub for news about the attack and a repository for the data discovered so far. A data breach on 9 October and a subsequent distributed denial-of-service (DDoS) attack pushed the site offline that day. A user authentication database containing as many as 31 million unique records was also reported stolen.
The Internet Archive is up and running again in a ‘provisional, read-only state’, according to founder Brewster Kahle, who said it appears safe to resume but that the service could be suspended again if further maintenance is needed.
Even so, the Wayback Machine lets you search the 916 billion web pages archived to date, although you cannot currently submit a live web page to be captured. Kahle and his team have been restoring services only gradually. In the past few days, for instance, the team has brought back its Archive.org email accounts and the National Library crawlers. Staff say the top-level services are being kept offline deliberately: not so that other work can get done (‘we couldn’t really turn the lights off’) but to give ‘Internet Archive staff the ability to dig into and harden all services’.
Following a pop-up message from a (possibly fake) hacker claiming that the archive had suffered a ‘catastrophic security breach’, the breach-notification service Have I Been Pwned confirmed last week that data had indeed been stolen. The hack apparently exposed records tied to more than 31 million unique email addresses, including screen names, hashed passwords and other internal data for those affected.
The outage at the Internet Archive came just a few weeks after Google began adding links to Wayback Machine versions of websites to its search results pages. Earlier this year, Google removed links to its own cache of pages, but the Wayback Machine links still offer a quick route to earlier or archived versions of websites.