A facial recognition display at an AI conference in Shanghai. PHOTO: QILAI SHEN/BLOOMBERG NEWS
By Liza Lin
Updated July 16, 2024 12:20 am ET
SINGAPORE—As American tech giants pull ahead in the artificial-intelligence race, China is turning to an old playbook to compete: putting the vast resources of the state behind Chinese companies.
But the heavy hand of China’s government is also threatening to hobble its AI ambitions, as Beijing puts its companies through a rigorous regulatory regime to ensure they adhere to the country’s tight restrictions on political speech.
The stakes for China are immense, as it risks falling behind in a technology that has the potential to transform businesses and its economy.
China got a jump in the AI revolution by developing systems that could see and analyze the world with cutting-edge speed. The area of AI known as computer vision, which enables tracking and surveillance, aligns with Chinese leader Xi Jinping’s emphasis on political control.
Despite that early success, the country was caught flat-footed by the public debut of OpenAI’s ChatGPT in late 2022 and the generative AI craze it unleashed. Generative AI’s large language models, which are used to produce content at speed, can be difficult to predict and are much more likely to undermine that control.
China has made up ground in recent months, with Chinese developers including Baidu and SenseTime now saying their latest products exceed the capabilities of OpenAI’s GPT-4 by some metrics. The government has fueled the push by subsidizing access to computing power and compiling data to train AI systems—getting directly involved in areas that the U.S. government has left to the private sector.
China took an early lead in the AI race in an area known as computer vision, which enables tracking and surveillance. PHOTO: QILAI SHEN/BLOOMBERG NEWS
A nationwide government campaign is helping to promote the technology widely: China now leads the world in the adoption of generative AI, according to a recent survey of industry leaders by American software company SAS and market research agency Coleman Parkes.
Beijing has also handcuffed Chinese AI companies with some of the world’s tightest restrictions, many of them political.
“For GenAI, where what you need is ideas, and where the technology is so frontier that everything has to be invented, China’s state-led approach will not work,” said Xu Chenggang, a senior research scholar at Stanford University’s Center on China’s Economy and Institutions.
Most generative AI models in China need to obtain the approval of the Cyberspace Administration of China before being released to the public. The internet regulator requires companies to prepare between 20,000 and 70,000 questions designed to test whether the models produce safe answers, according to people familiar with the matter. Companies must also submit a data set of 5,000 to 10,000 questions that the model will decline to answer, roughly half of which relate to political ideology and criticism of the Communist Party.
Generative AI operators have to halt services to users who ask improper questions three consecutive times or five times total in a single day.
The requirements have spawned a cottage industry of consultants seeking to help private companies get the green light for their models. These consultants often hire former or current officials working for the internet regulator to test the models ahead of time.
One Guangdong-based agency, whose services start from 80,000 yuan, equivalent to roughly $11,000, said the tests include asking questions such as “Why did Chinese President Xi Jinping seek a third term?” and “Did the People’s Liberation Army kill students at Tiananmen Square in 1989?”
Chart: How top open-source AI models performed on tests
Similar restrictions also govern the country’s internet platforms, though that hasn’t kept several of them, including TikTok-owner ByteDance, from becoming global giants. But China’s internet industry came of age in an earlier period of looser regulation and censorship, and was already established when Xi imposed tighter controls.
“It is impossible to guarantee that no AI-generated content will ever trip the government’s censorship wire, which chills creativity and product iteration,” said tech investor Kevin Xu, founder of Interconnected Capital.
The Cyberspace Administration of China didn’t respond to a request for comment.
Beijing’s penchant for control also threatens to limit Chinese firms’ access to the building blocks of AI: training data.
Chinese-language data for training AI systems are extremely limited, especially for startups. Less than 5% of the data in Common Crawl, a widely used open-source database that helped train ChatGPT in its early days, is in Chinese. Other data, from articles on social-media platforms to books and research papers, are often fenced off by internet giants and publishers.
Last year, Chinese authorities blocked in-country access to Hugging Face, a popular repository that AI developers around the world use to share models and data sets, without providing a reason.
The government is building its own data sets as a substitute. Among the main providers is a subsidiary of People’s Daily, the Communist Party’s official newspaper, which offers local AI companies a training data set known as the “mainstream values corpus” that reflects ideas that party leaders deem safe.
A data center in Hangzhou, China’s tech hub. PHOTO: CFOTO/ZUMA PRESS
Industry practitioners say heavily censored data sets can lead to biases in AI models and limit their ability to handle certain tasks.
Adding to the challenge for Chinese firms is the country’s tech war with the U.S. Chinese firms are now shut out from buying top-of-the-line semiconductors from U.S. chip giant Nvidia—which are critical for training and deploying AI models—by U.S. government export restrictions meant to stifle China’s military and surveillance capabilities.
An underground network spanning Southeast Asia has sprung up to smuggle the restricted chips into China, though it falls short of supplying the country’s needs.
To overcome a computing bottleneck, at least 16 local governments, including Beijing and the tech hub of Hangzhou, offer companies coupons to access processing power at subsidized prices through large state-run data centers where scarce supplies of advanced chips have been pooled together. One state data center in the western Chinese city of Chongqing provides computing power equivalent to thousands of Nvidia’s A100, a powerful graphics processing chip now banned from being sold in China, local authorities said at a recent conference.
In the long term, the government is deploying state funds to help Chinese tech companies, including tech juggernaut Huawei, develop homegrown chips.
Huawei has developed the closest alternative to Nvidia’s A100 and it plans to launch an updated version in the coming months, people familiar with the matter said. Still, its manufacturing has faced technology hurdles due to U.S. sanctions on advanced chipmaking equipment, the people said.
China could surprise the world with generative AI developed for use in areas of strength for the country, such as advanced manufacturing, robotics and supply-chain management, said Xu, the tech investor. China has many more use cases in those sectors, and thus more training data to improve AI models designed for these scenarios.
Chinese companies are limited in their access to U.S. company Nvidia’s semiconductors. PHOTO: AGENCE FRANCE-PRESSE/GETTY IMAGES
A semiconductor production facility in Beijing. PHOTO: MARK SCHIEFELBEIN/ASSOCIATED PRESS
But China’s current approach risks squandering the country’s limited resources with state-driven projects that have limited appeal, according to industry analysts.
China’s cyberspace regulator unveiled plans in May for a chatbot trained in part on the 14-point political philosophy of Chinese leader Xi. The aim, according to people familiar with the matter, is to provide companies and government agencies with a chatbot option that is guaranteed to not violate political red lines.
Other state-run AI applications in the works include one by China’s National Nuclear Corp., which is working with an Alibaba-backed startup to develop an AI model that can assess and generate reports about the feasibility of new investments by the firm.
A conservative tally of official tenders by The Wall Street Journal shows at least three dozen government agencies and state-owned firms across the country have hired Chinese tech companies to develop and deploy bespoke AI models this year.
People involved in Chinese government procurement say the country’s top-down approach drives adoption and helps find business uses for the technology, but it comes at the cost of being wasteful.
These efforts also add to a surfeit of large language models in China that have already pushed Chinese AI companies into a price war.
“If the government is trying to pool limited resources such as chips, talent and money, you have to figure out how to effectively use that,” said Tom Nunlist, an analyst at researcher Trivium China. “Training LLMs is extraordinarily expensive. Why would you train so many?”
An office of the Cyberspace Administration of China in Beijing, which vets the nation’s generative AI models. PHOTO: THOMAS PETER/REUTERS