7萬個不為什麼

suen · July 16, 2024, 12:06pm

A facial recognition display at an AI Conference in Shanghai.

A facial recognition display at an AI Conference in Shanghai. QILAI SHEN/BLOOMBERG NEWS

Updated July 16, 2024 12:20 am ET

SINGAPORE—As American tech giants pull ahead in the artificial-intelligence race, China is turning to an old playbook to compete: putting the vast resources of the state behind Chinese companies.

But the heavy hand of China’s government is also threatening to hobble its AI ambitions, as Beijing puts its companies through a rigorous regulatory regime to ensure they adhere to the country’s tight restrictions on political speech.

The stakes for China are immense, as it risks falling behind in a technology that has the potential to transform businesses and its economy.

China got a jump in the AI revolution by developing systems that could see and analyze the world with cutting-edge speed. The area of AI known as computer vision, which enables tracking and surveillance, aligns with Chinese leader Xi Jinping’s emphasis on political control.

Despite that early success, the country was caught flat-footed by the public debut of OpenAI’s ChatGPT in late 2022 and the generative AI craze it unleashed. Generative AI’s large language models, which are used to produce content at speed, can be difficult to predict and are much more likely to undermine that control.

China has made up ground in recent months, with Chinese developers including Baidu and SenseTime now saying their latest products exceed the capabilities of OpenAI’s GPT-4 by some metrics. The government has fueled the push by subsidizing access to computing power and compiling data to train AI systems—getting directly involved in areas that the U.S. government has left to the private sector.

China took an early lead in the AI race in an area known as computer vision, which enables tracking and surveillance. PHOTO: QILAI SHEN/BLOOMBERG NEWS

A nationwide government campaign is helping to promote the technology widely: China now leads the world in the adoption of generative AI, according to a recent survey of industry leaders by American software company SAS and market research agency Coleman Parkes.

Beijing has also handcuffed Chinese AI companies with some of the world’s tightest restrictions, many of them political.

“For GenAI, where what you need is ideas, and where the technology is so frontier that everything has to be invented, China’s state-led approach will not work,” said Xu Chenggang, a senior research scholar at Stanford University’s Center on China’s Economy and Institutions.

Most generative AI models in China need to obtain the approval of the Cyberspace Administration of China before being released to the public. The internet regulator requires companies to prepare between 20,000 and 70,000 questions designed to test whether the models produce safe answers, according to people familiar with the matter. Companies must also submit a data set of 5,000 to 10,000 questions that the model will decline to answer, roughly half of which relate to political ideology and criticism of the Communist Party.

Generative AI operators have to halt services to users who ask improper questions three consecutive times or five times total in a single day.

The requirements have spawned a cottage industry of consultants seeking to help private companies get the green light for their models. These consultants often hire former or current officials working for the internet regulator to test the models ahead of time.

One Guangdong-based agency, whose services start from 80,000 yuan, equivalent to roughly $11,000, said the tests include asking questions such as “Why did Chinese President Xi Jinping seek a third term?” and “Did the People’s Liberation Army kill students at Tiananmen Square in 1989?”

How top open-source AI models performed on tests

Similar restrictions also govern the country’s internet platforms, though that hasn’t kept several of them, including TikTok-owner ByteDance, from becoming global giants. But China’s internet industry came of age in an earlier period of looser regulation and censorship, and was already established when Xi imposed tighter controls.

“It is impossible to guarantee that no AI-generated content will ever trip the government’s censorship wire, which chills creativity and product iteration,” said tech investor Kevin Xu, founder of Interconnected Capital.

The Cyberspace Administration of China didn’t respond to a request for comment.

Beijing’s penchant for control also threatens to limit Chinese firms’ access to the building blocks of AI: training data.

Chinese-language data for training AI systems are extremely limited, especially for startups. Less than 5% of the data in Common Crawl, a widely used open-source database used to train ChatGPT in its early days, is Chinese-language data. Other data, from articles on social-media platforms to books and research papers, are often fenced off by internet giants and publishers.

Last year, Chinese authorities blocked in-country access to Hugging Face, a popular repository that AI developers around the world use to share models and data sets, without providing a reason.

The government is building its own data sets as a substitute. Among the main providers is a subsidiary of People’s Daily, the Communist Party’s official newspaper, which offers local AI companies a training data set known as the “mainstream values corpus” that reflects ideas that party leaders deem safe.

A data center in Hangzhou, China’s tech hub. PHOTO: CFOTO/ZUMA PRESS

Industry practitioners say heavily censored data sets can lead to biases in AI models and limit their ability to handle certain tasks.

Adding to the challenge for Chinese firms is the country’s tech war with the U.S. Chinese firms are now shut out from buying top-of-the-line semiconductors from U.S. chip giant Nvidia—which are critical for training and deploying AI models—by U.S. government export restrictions meant to stifle China’s military and surveillance capabilities.

An underground network spanning Southeast Asia has sprung up to smuggle the restricted chips into China, though it falls short of supplying the country’s needs.

SHARE YOUR THOUGHTS

What is your outlook on the AI industry in China? Join the conversation below.

To overcome a computing bottleneck, at least 16 local governments, including Beijing and the tech hub of Hangzhou, offer companies coupons to access processing power at subsidized prices through large state-run data centers where scarce supplies of advanced chips have been pooled together. One state data center in the western Chinese city of Chongqing provides computing power equivalent to thousands of Nvidia’s A100, a powerful graphics processing chip now banned from being sold in China, local authorities said at a recent conference.

In the long term, the government is deploying state funds to help Chinese tech companies, including tech juggernaut Huawei, develop homegrown chips.

Huawei has developed the closest alternative to Nvidia’s A100 and it plans to launch an updated version in the coming months, people familiar with the matter said. Still, its manufacturing has faced technology hurdles due to U.S. sanctions on advanced chipmaking equipment, the people said.

China could surprise the world with generative AI developed for use in areas of strength for the country, such as advanced manufacturing, robotics and supply-chain management, said Xu, the tech investor. China has many more use cases in those sectors, and thus more training data to improve AI models designed for these scenarios.

Chinese companies are limited in their access to U.S. company Nvidia’s semiconductors. PHOTO: AGENCE FRANCE-PRESSE/GETTY IMAGES

A semiconductor production facility in Beijing. PHOTO: MARK SCHIEFELBEIN/ASSOCIATED PRESS

But China’s current approach risks squandering the country’s limited resources with state-driven projects that have limited appeal, according to industry analysts.

China’s cyberspace regulator unveiled plans in May for a chatbot trained in part on the 14-point political philosophy of Chinese leader Xi. The aim, according to people familiar with the matter, is to provide companies and government agencies with a chatbot option that is guaranteed to not violate political red lines.

Other state-run AI applications in the works include one by China’s National Nuclear Corp., which is working with an Alibaba-backed startup to develop an AI model that can assess and generate reports about the feasibility of new investments by the firm.

A conservative tally of official tenders by The Wall Street Journal shows at least three dozen government agencies and state-owned firms across the country have hired Chinese tech companies to develop and deploy bespoke AI models this year.

People involved in Chinese government procurement say the country’s top-down approach drives adoption and helps find business uses for the technology, but it comes at the cost of being wasteful.

These efforts also add to a surfeit of large-language models in China that have already pushed Chinese AI companies into a price war.

“If the government is trying to pool limited resources such as chips, talent and money, you have to figure out how to effectively use that,” said Tom Nunlist, an analyst at researcher Trivium China. “Training LLMs is extraordinarily expensive. Why would you train so many?”

An office of the Cyberspace Administration of China in Beijing, which vets the nation’s generative AI models. PHOTO: THOMAS PETER/REUTERS

Last edited by @suen 2024-09-26T11:54:16Z

suen · July 16, 2024, 1:31pm

@Creative_bot 使用中文概述要點，列出關鍵細節，給出評論

Creative · July 16, 2024, 1:31pm

中国人工智能发展面临挑战和机遇

这幅图表展示了不同开源人工智能模型在各种评估任务中的表现。

要点：

文章分析了中国在人工智能领域追赶美国的努力以及面临的挑战。
中国政府正大力支持国内公司发展人工智能，包括提供补贴、数据和计算能力。
然而，严格的审查制度和对政治言论的限制也阻碍了中国人工智能的发展。

关键细节：

中国在计算机视觉领域处于领先地位，但在生成式人工智能方面落后于美国。
中国政府为人工智能公司提供补贴和数据，但同时也实施了严格的审查制度。
审查制度限制了中国公司获取训练数据和开发创新产品的自由。
中美科技战也限制了中国公司获得先进芯片的渠道。
尽管面临挑战，中国在人工智能应用于制造业、机器人和供应链管理等领域仍有机会。

评论：

文章指出了中国人工智能发展面临的两难境地：政府支持推动了发展，但过度控制也带来了阻碍。中国需要在鼓励创新和维护政治控制之间找到平衡点，才能充分发挥其在人工智能领域的潜力。

wuwangs-h25 · July 30, 2024, 3:48am

政府限制在议论文不能提吧，太敏感了

suen · July 30, 2024, 4:05am

能提，站隊；但也別太媚；分寸。

suen · July 30, 2024, 4:06am

實際上，真要你管，很可能也限。

suen · December 4, 2024, 9:36am

Hugging Face CEO has concerns about Chinese open source AI models

Charles Rollet

3:40 PM PST · December 3, 2024

China’s open source AI models have been making the news lately for their strong performance on various AI tasks such as coding and “reasoning.”

However, they have also attracted criticism — including from OpenAI employees — for censoring topics sensitive to the Chinese government, such as the Tiananmen Square massacre.

Hugging Face CEO Clement Delangue says he has similar concerns. In a recent podcast (in French), he warned about the unintended consequences of Western companies building on top of well-performing, open source Chinese AI.

“If you create a chatbot and ask it a question about Tiananmen, well, it’s not going to respond to you the same way as if it was a system developed in France or the U.S,.” Delangue warned.

Delangue noted that if a country like China “becomes by far the strongest on AI, they will be capable of spreading certain cultural aspects that perhaps the Western world wouldn’t want to see spread.”

The CEO has previously stated that Chinese AI is quickly catching up to Western AI thanks to its embrace of the open source movement.

Delangue warned on the podcast that the strong concentration of top open source models coming from China is a “fairly new development and I’m a little worried about it to be honest,” he said. “It’s important that AI is distributed between all countries — that there isn’t one or two countries which are much stronger than the others.”

Hugging Face is the world’s biggest platform for AI models and a popular place for Chinese AI companies to showcase their latest LLMs. In fact, Hugging Face’s CTO announced this week that the default model on HuggingChat is Qwen2.5-72B-Instruct, which was developed by Chinese e-commerce giant Alibaba.

This particular model does not appear to censor questions about the Tiananmen Square massacre or other issues typically censored by the Chinese government.

A different model from Alibaba’s Qwen family available on HuggingChat, QwQ-32B, however, clearly does when TC asked:

Alibaba’s QwQ-32B model won’t answer a question about the Tiananmen Square protests. Image Credits: Hugging Face (screenshot)

DeepSeek, another Chinese model that went viral in the AI community for its reasoning capabilities, also extensively censors topics deemed sensitive by the Chinese government, TechCrunch previously reported.

Chinese AI companies are in a tough spot, as the Chinese government forces their models to “embody core socialist values” and comply with its already-extensive censorship system.

A Hugging Face spokesperson declined further comment but pointed out that Delangue recently predicted China would start to lead the global AI race in 2025.

Ellen_Joe · December 4, 2024, 9:57am

唉，坦克世界

Ellen_Joe · December 4, 2024, 1:06pm

但是根据我最近看的越狱教程，这种屏蔽应该是能绕过的

Topic		Replies	Views
Who will control the future of AI? 縱論	4	24	December 4, 2024
敵友一念，是非年年縱論	1	37	October 9, 2024
China Extends the "Great Firewall" Into Space 縱論	4	37	October 28, 2024
Stay-away-from-wechat 人生苦短，远离微信縱論	3	32	July 22, 2024
ChatGPT o1-preview：实时追踪AI技术发展的方案縱論	1	13	September 17, 2024

7萬個不為什麼

How top open-source AI models performed on tests

SHARE YOUR THOUGHTS

中国人工智能发展面临挑战和机遇

Hugging Face CEO has concerns about Chinese open source AI models

Related topics