A Shanghai police database with a vast trove of personal data that was seized by a hacker or group had been left online, unsecured, for months, security researchers said, in what is probably the largest known breach of Chinese government computer systems.
The leak, which came to light after an anonymous user posted in an online forum offering to sell the personal information of as many as one billion Chinese citizens, exposes the privacy risks of the Chinese government’s vast surveillance and security apparatus.
The authorities in China collect vast amounts of data on citizens by tracking their movements, scouring their social media posts, and recording their DNA and other biological markers. Yet even as the state amasses ever greater amounts of personal data, it has sometimes been lax in erecting safeguards, such as by parking it on unprotected servers. Shortly after the Shanghai database was advertised, another anonymous user posted in an online forum offering to sell a separate police database from the central Chinese province of Henan, claiming to have information on 90 million citizens.
Chinese citizens have in recent years expressed growing demands for personal privacy and data protection from companies. This leak, if it became widely known within China, would most likely fuel public resistance to the collection of private data by the government as well. But news about the leak has been swiftly censored and removed from the Chinese internet and social media platforms, a sign that the government recognizes the explosive nature of the apparent breach. As of Thursday, hashtags such as “Shanghai data leak,” “data leak of one billion citizens” and “data leak” remained blocked on Sina Weibo, a popular Chinese microblogging service.
“It’s left a big black eye for the Chinese public security world, and by extension the Chinese government,” said Paul Triolo, senior vice president for China at Albright Stonebridge Group, a strategy firm. “It’s not surprising they’ve gone into full censorship mode given how sensitive this issue is for the public.”
While large data leaks are not uncommon, the Shanghai police database stands out both for its scale and the highly sensitive nature of some of the information included, security researchers said.
Two cybersecurity researchers said that they had separately verified the anonymous user’s claims that the database included over 23 terabytes of data covering as many as a billion individuals, noting that one of the leaked files appeared to contain nearly 970 million records. They did not rule out the possibility of duplicate entries.
One of them, Vinny Troia, founder of Shadowbyte, a threat intelligence company, said that he first stumbled across the database months ago. Data from LeakIX, an online platform that trawls the internet for exposed databases, shows that the server was accessible as early as April 2021. The revelation that the Shanghai database had long been unsecured was reported earlier by CNN.
The New York Times confirmed parts of a sample of 750,000 records that the anonymous user, who goes by the name ChinaDan, released to prove the authenticity of the data. In addition to addresses and ID numbers, the database also included information on “key persons” identified by the police as requiring heightened surveillance, as well as police reports. In one case, a grandfather was reported to the police for raping his 3-year-old granddaughter. In another, a person was investigated for petitioning on Tiananmen Square in Beijing. The sample also included the names and passport numbers of American citizens who violated the terms of their visas in China.
Nine people reached by The New York Times by telephone confirmed their names and details. None of the people contacted said they had previously heard about the data leak.
Some seemed unfazed about having their personal information exposed. One man, whose record of a complaint to the police that his daughter had been raped by her work manager was among the data posted in the sample set, confirmed the accuracy of the record when reached by phone. But he said that the episode was in the past, and it didn’t matter if the information was public.
Others expressed frustration and resignation. Many Chinese have grown accustomed to surveillance, censorship and frequent telemarketing calls, accepting that such intrusions were the cost of convenience and safety. Still, they said, there needed to be safeguards.
“It’s alarming because these are the files of ordinary people,” said May Peng, a saleswoman in Shanghai whose details were also in the sample set. She confirmed that as the data showed, she had filed a police report in 2017 when her electric scooter was stolen. “They should be better protected.”
The government has kept silent on the matter. The Cyberspace Administration of China did not respond to a faxed request for comment. Shanghai’s public security bureau declined to respond to questions about the database.
The government’s refusal to acknowledge the leak stands in contrast to common practice in other countries, where companies and government agencies are often obligated to alert affected users if their information has been leaked.
Mr. Troia and another researcher, Bob Diachenko, owner of SecurityDiscovery.com, a cybersecurity consultancy, said that the Shanghai data had been stored securely on a closed-off network until someone set up a gateway that essentially punched a hole through the firewall. They said that creating such portals was common practice among developers as a way to gain easy access to a database, but that such gateways should be password protected.
The gateway to the Shanghai database did not have a password.
Mr. Troia said he first came across the unsecured trove of files last December or January, and that it stood out for its vast size. He said he downloaded and reviewed a small sample of the files at the time.
Mr. Diachenko said that his team had determined that the database was accessible from as early as April this year until mid-June, when someone copied and destroyed the data and left a ransom note demanding 10 Bitcoin, worth about $200,000 at current prices, for recovery of the information. Security researchers say that it is common for malicious actors to hijack exposed databases and try to extort the data owners with ransom demands.
It’s unclear if anyone has paid for and downloaded the entire database. The Times reached out to the anonymous user this week but did not receive a response.
Security researchers say that the vast amount of personal information contained in the Shanghai database could put the individuals whose data was exposed at risk of extortion, blackmail or fraud.
“The more complete profile you have of a person, the more dangerous it is,” Mr. Diachenko said. “The possibilities are endless.”