Data analytics and AI progress in China, plans to dominate the market by 2030

Bloomberg, 14 Aug 2017

Xu Li’s software scans more faces than maybe any on earth. He has the Chinese police to thank.

Xu runs SenseTime Group Ltd., which makes artificial intelligence software that recognizes objects and faces, and counts China’s biggest smartphone brands as customers. In July, SenseTime raised $410 million, a sum it said was the largest single round for an AI company to date. That feat may soon be topped, probably by another startup in China.

The nation is betting heavily on AI. Money is pouring in from China’s investors, big internet companies and its government, driven by a belief that the technology can remake entire sectors of the economy, as well as national security. A similar effort is underway in the U.S., but in this new global arms race, China has three advantages: a vast pool of engineers to write the software, a massive base of 751 million internet users to test it on, and, most importantly, staunch government support that includes handing over gobs of citizens’ data, something that makes Western officials squirm.

Data is key because that’s how AI engineers train and test algorithms to adapt and learn new skills without human programmers intervening. SenseTime built its video analysis software using footage from the police force in Guangzhou, a southern city of 14 million. Most Chinese mega-cities have set up institutes for AI that include some data-sharing arrangements, according to Xu. “In China, the population is huge, so it’s much easier to collect the data for whatever use-scenarios you need,” he said. “When we talk about data resources, really the largest data source is the government.”

This flood of data will only rise. China just enshrined the pursuit of AI into a kind of national technology constitution. A state plan, issued in July, calls for the nation to become the leader in the industry by 2030. Five years from then, the government claims the AI industry will create 400 billion yuan ($59 billion) in economic activity. China’s tech titans, particularly Tencent Holdings Ltd. and Baidu Inc., are getting on board. And the science is showing up in unexpected places: Shanghai’s courts are testing an AI system that scours criminal cases to judge the validity of evidence used by all sides, ostensibly to prevent wrongful prosecutions.

“Data access has always been easier in China, but now people in government, organizations and companies have recognized the value of data,” said Jiebo Luo, a computer science professor at the University of Rochester who has researched China. “As long as they can find someone they trust, they are willing to share it.”

The AI-MATHS machine took the math portion of China’s annual university entrance exam in Chengdu. Photographer: AFP via Getty Images

Every major U.S. tech company is investing deeply as well. Machine learning — a type of AI that lets driverless cars see, chatbots speak and machines parse scores of financial information — demands computers learn from raw data instead of hand-cranked programming. Getting access to that data is a permanent slog. China’s command-and-control economy, and its thinner privacy concerns, mean that country can dispense video footage, medical records, banking information and other wells of data almost whenever it pleases.

Xu argued this is a global phenomenon. “There’s a trend toward making data more public. For example, NHS and Google recently shared some medical image data,” he said. But that example does more to illustrate China’s edge.

DeepMind, the AI lab of Google parent Alphabet Inc., has labored for nearly two years to access medical records from the U.K.’s National Health Service for a diagnostics app. The agency began a trial with the company using 1.6 million patient records. Last month, the top U.K. privacy watchdog declared the trial violates British data-protection laws, throwing its future into question.

Go player Lee Se-Dol, right, in a match against Google’s AlphaGo, during the DeepMind Challenge Match in March 2016. Photographer: Google via Getty Images

Contrast that with how officials handled a project in Fuzhou. Government leaders from that southeastern Chinese city of more than seven million people held an event on June 26. Venture capital firm Sequoia Capital helped organize the event, which included representatives from Dell Inc., International Business Machines Corp. and Lenovo Group Ltd. A spokeswoman for Dell characterized the event as the nation’s first “Healthcare and Medical Big Data Ecology Summit.”

The summit involved a vast handover of data. At the press conference, city officials shared 80 exabytes worth of heart ultrasound videos, according to one company that participated. With the massive data set, some of the companies were tasked with building an AI tool that could identify heart disease, ideally at rates above medical experts. They were asked to turn it around by the fall.

“The Chinese AI market is moving fast because people are willing to take risks and adopt new technology more quickly in a fast-growing economy,” said Chris Nicholson, co-founder of Skymind Inc., one of the companies involved in the event. “AI needs big data, and Chinese regulators are now on the side of making data accessible to accelerate AI.”

Representatives from IBM and Lenovo declined to comment. Last month, Lenovo Chief Executive Officer Yang Yuanqing said he will invest $1 billion into AI research over the next three to four years.

Along with health, finance can be a lucrative business in China. In part, that’s because the country has far less stringent privacy regulations and concerns than the West. For decades the government has kept a secret file on nearly everyone in China called a dang’an. The records run the gamut from health reports and school marks to personality assessments and club records. This dossier can often decide a citizen’s future: whether they can score a promotion or be allowed to reside in the city where they work.

U.S. companies that partner in China stress that AI efforts, like those in Fuzhou, are for non-military purposes. Luo, the computer science professor, said most national security research efforts are relegated to select university partners. However, one stated goal of the government’s national plan is for a greater integration of civilian, academic and military development of AI.

The government also revealed in 2015 that it was building a nationwide database that would score citizens on their trustworthiness, which in turn would feed into their credit ratings. Last year, China Premier Li Keqiang said 80 percent of the nation’s data was in public hands and would be opened to the public, with an unspecific pledge to protect privacy. The raging popularity of live video feeds — where Chinese internet users spend hours watching daily footage caught by surveillance video — shows the gulf in privacy concerns between the country and the West. Embraced in China, the security cameras also reel in mountains of valuable data.

Some machine-learning researchers dispel the idea that data can be a panacea. Advanced AI operations, like DeepMind, often rely on “simulated” data, co-founder Demis Hassabis explained during a trip to China in May. DeepMind has used Atari video games to train its systems. Engineers building self-driving car software frequently test it this way, simulating stretches of highway or crashes virtually.

“Sure, there might be data sets you could get access to in China that you couldn’t in the U.S.,” said Oren Etzioni, director of the Allen Institute for Artificial Intelligence. “But that does not put them in a terrific position vis-a-vis AI. It’s still a question of the algorithm, the insights and the research.”

Historically, the country has been a lightweight in those regards. It’s suffered through a “brain drain,” a flight of academics and specialists out of the country. “China currently has a talent shortage when it comes to top tier AI experts,” said Connie Chan, a partner at venture capital firm Andreessen Horowitz. “While there have been more deep learning papers published in China than the U.S. since 2016, those papers have not been as influential as those from the U.S. and U.K.”

But China is gaining ground. The country is producing more top engineers, who craft AI algorithms for U.S. companies and, increasingly, Chinese ones. Chinese universities and private firms are actively wooing AI researchers from across the globe. Luo, the University of Rochester professor, said top researchers can get offers of $500,000 or more in annual compensation from U.S. tech companies, while Chinese companies will often double that.

Meanwhile, China’s homegrown talent is starting to shine. A popular benchmark in AI research is the ImageNet competition, an annual challenge to devise a visual recognition system with the lowest error rate. Like last year, this year’s top winners were dominated by researchers from China, including a team from the Ministry of Public Security’s Third Research Institute.

Relentless pollution in metropolises like Beijing and Shanghai has hurt Chinese companies’ ability to nab top tech talent. In response, some are opening shop in Silicon Valley. Tencent recently set up an AI research lab in Seattle.

Qi Lu. Photographer: David Paul Morris/Bloomberg

Baidu managed to pull a marquee name from that city. The firm recruited Qi Lu, one of Microsoft’s top executives, to return to China to lead the search giant’s push into AI. He touted the technology’s potential for enhancing China’s “national strength” and cited a figure that nearly half of the bountiful academic research on the subject globally has ethnically Chinese authors, using the Mandarin word “huaren” (华人), a term for ethnic Chinese that echoes government rhetoric.

“China has structural advantages, because China can acquire more and better data to power AI development,” Lu told the cheering crowd of Chinese developers. “We must have the chance to lead the world!”