Despite U.S. chip bans, China’s AI ecosystem has exceeded expectations. It has garnered particular attention from international developers as Alibaba’s open-source Qwen series has been widely adopted and discussed in the community.
China has built a largely separate AI ecosystem for two main reasons: 1) to lessen its dependency on the West and 2) to operate within the Great Firewall’s censorship constraints. That is not to say that innovation is stifled.
There is an extremely vibrant set of players in China right now across the AI ecosystem, and today we will dive deep into the role Alibaba plays in the space.
| Company | Infrastructure Layer | Model Layer | Application Layer |
| --- | --- | --- | --- |
| Alibaba | Alibaba Cloud offers robust cloud infrastructure with support for open-source models and extensive AI services. | Qwen-72B and Qwen-1.8B are advanced LLMs developed by Alibaba Cloud, with multimodal processing capabilities. | Dingtalk, an enterprise chat platform; Alimama, an AI-driven ad-optimization toolset for SMEs selling on Tmall and Taobao. |
| Tencent | Tencent enhances its AI capabilities through its Intelligent High-Performance Network, optimizing GPU usage for LLM training. | Hunyuan is Tencent’s in-house LLM aimed at enterprise applications, with a focus on efficiency and cost-effectiveness. | Tencent’s AI services include personalized news feeds and chatbot solutions across its existing apps. |
| Huawei | Huawei Cloud provides high-performance infrastructure tailored for AI applications, with a focus on technological self-reliance. | Pangu 3.0 consists of foundational, industry-specific, and scenario-specific models designed for diverse applications across sectors. | Huawei’s LLMs are used in industries such as finance and healthcare to support digital transformation efforts. |
| ByteDance | ByteDance leverages its cloud infrastructure to support the deployment of its LLMs, emphasizing cost efficiency in AI services. | Doubao is a family of LLMs launched by ByteDance, designed for various applications with aggressive pricing strategies. | Applications like the Doubao chatbot and other generative AI tools are aimed at enhancing user interaction and content generation. |
| Baidu | Baidu Cloud provides comprehensive infrastructure for AI model training and deployment, focusing on technological advancements in AI. | Ernie is Baidu’s flagship LLM, which has seen significant improvements in training efficiency and application performance over time. | Baidu’s applications use Ernie for enhanced search, conversational agents, and other AI-driven solutions. |
Joe Tsai speaks about Alibaba’s AI strategy, which focuses on supporting AI infrastructure by leveraging the company’s existing cloud business.
Alibaba’s AI Playbook
Alibaba invests in AI in five major ways, organized into a twin strategy.
End-to-end tech stack strategy:
- Building its proprietary LLM, Qwen, and offering it to AI builders
- Providing cloud computing services
- Designing chips tailored for AI workloads
Ecosystem Strategy:
- Implementing AI into its existing consumer-facing applications
- Funding AI companies across the ecosystem
Alibaba is easily the most internationally recognized Chinese tech company, with a leading cloud business and its own proprietary LLM technology. In China, Baidu and Huawei each have their own models and cloud services, but Baidu’s data-focused strategy has centered on its autonomous driving technology, while Huawei has concentrated on compute and hardware, with its LLM seen more as a “nice to have” add-on for enterprise clients.
In contrast, Alibaba has repeatedly said that it aims to “make AI accessible to all.” At the 2024 Apsara Conference, Alibaba CEO Eddie Wu emphasized that the company is committed to supporting the open-source ecosystem across the stack, from chips, servers, and networks to storage and data centers.
Proprietary LLM: Tongyi Qianwen (Qwen)
At the forefront of Alibaba’s AI offerings is Tongyi Qianwen, a large language model akin to a “super chatbot.” This advanced model is capable of understanding and generating text, making it suitable for a wide range of applications, including article generation, conversational responses, and customer support.
The Qwen series combines large scale, strong performance across benchmarks, multimodal features, and a commitment to accessibility for a wide range of users. Alibaba has made the technology publicly available, allowing other businesses to use it for free to enhance their customer service capabilities.
“It is the most competitive Chinese LLM when compared to the likes of GPT-4/GPT-4o in terms of its overall performance,” said Leo Jiang, founder of GroundAI and former Huawei Chief Digital Officer.
He added that what makes Qwen special is that it comes in two formats: “its API-driven LLM service offers quicker time to market and cost-effectiveness, whereas its open-source version gives more control and privacy to its clients.”
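To make the API-driven format concrete, here is a minimal sketch of calling a hosted Qwen model through Alibaba Cloud’s OpenAI-compatible endpoint. The endpoint URL, the qwen-plus model name, and the environment variable are assumptions based on Alibaba Cloud’s public Model Studio documentation and may differ by region or account; treat this as an illustration rather than official guidance.

```python
# Minimal sketch: calling a hosted Qwen model via an OpenAI-compatible endpoint.
# Assumptions: the compatible-mode URL and the "qwen-plus" model name follow
# Alibaba Cloud's public docs at the time of writing and may vary by region/account.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],  # assumed env var holding your API key
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

response = client.chat.completions.create(
    model="qwen-plus",
    messages=[
        {"role": "system", "content": "You are a helpful customer-support assistant."},
        {"role": "user", "content": "My order arrived damaged. What are my options?"},
    ],
)
print(response.choices[0].message.content)
```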
Alibaba launched its large language model Tongyi Qianwen, often referred to as Qwen, in 2023; it is now at its 2.5 iteration. The Qwen models, including Qwen-72B and Qwen-1.8B, are notable for their diverse parameter sizes, ranging from 1.8 billion to 72 billion parameters, and their multimodal capabilities, which allow them to process not just text but also audio and visual data.
This flexibility is enhanced by their training on over 3 trillion tokens, enabling them to outperform many other open-source models across various benchmarks, including multitask accuracy and code generation capabilities.
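For readers who want to try the open-source format directly, the sketch below loads one of the published Qwen checkpoints from Hugging Face with the transformers library. The specific checkpoint name (Qwen/Qwen2.5-7B-Instruct) and the dtype/device settings are illustrative assumptions; pick whichever parameter size fits your hardware.

```python
# Minimal sketch: running an open-weight Qwen checkpoint locally with transformers.
# The checkpoint name and dtype/device settings are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-7B-Instruct"  # assumed published checkpoint on Hugging Face
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the key points of this meeting transcript: ..."},
]
# Build the chat-formatted prompt, tokenize it, and generate a reply.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)

# Strip the prompt tokens so only the newly generated answer is decoded.
answer = tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(answer)
```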
Qwen has positioned itself as an all-around AI assistant, with five key application use cases:
1) real-time meeting transcription and summaries
2) processing lengthy content and providing summaries that require complicated comprehension
3) AI PowerPoint presentation creation
4) real-time simultaneous translation
5) video chat with an AI agent that can help with problem-solving.
Source: Alibaba
The uniqueness of Qwen lies in its impressive technology and strong commitment to open-source principles: Alibaba makes various versions of its models available on platforms like Hugging Face and ModelScope. Some observers have been puzzled as to why the company chose to open up its models after pouring so much capital into AI, essentially giving its prized asset away for free. However, the company has been adamant about making the technology accessible to all, emphasizing that this approach fosters a collaborative environment where developers can experiment and innovate together. Monetization can come later, and Alibaba will surely find ways to do so; for now, it has emerged as a key player in democratizing access to advanced AI technologies.
Alibaba has largely trained its open-source AI models on publicly available data from across its applications, such as its e-commerce marketplace app Taobao, a huge competitive advantage given Taobao’s more than 930 million monthly active users. By opening up its proprietary models, it has also raised a debate about whether open-source AI models, which are usually more transparent and cost-effective, are also more prone to abuse.
In particular, companies with fewer than 100 million monthly active users can use these models for free, promoting wider adoption across industries. By supporting the growth of the open-source community, Alibaba has aimed to empower users to effectively harness AI capabilities while reducing reliance on proprietary technologies.
ChinAI’s Jeff Ding translated the well-circulated AItechtalk article on why Qwen is the world’s most popular open-source large model right now, which noted that “per Hugging Face data, the Qwen series/bloodline of models has reached more than 50,000. That is, developers around the world have trained more than 50,000 derivative models based on the Qwen series base, second only to the Llama series of about 70,000. This data is the most convincing indicator for judging the ecosystem-level influence of a model.”
Impressively, over the past year the Qwen models have garnered significant interest across sectors, including automotive, gaming, and scientific research. The models have been downloaded over 40 million times since their introduction. Additionally, the lightweight Qwen-1.8B model is designed for deployment on edge devices such as smartphones, making it an attractive option for applications requiring lower computational resources.
The most recent comprehensive upgrade, Qwen2.5, brings a larger parameter scale, more powerful comprehension of photos and videos, a large-scale audio language model, and continued open-source releases. Not only have the models improved markedly, but the cost of the strong inference capabilities needed for complex tasks has also been reduced for both Qwen-Plus and Qwen-Turbo.
Looking ahead, CEO Eddie Wu noted that while AI development has progressed rapidly, AGI (Artificial General Intelligence) is still in its early stages. He emphasized the importance of collaboration and highlighted that the API inference cost for Tongyi Qianwen has dropped by 97% year-on-year, a key factor contributing to its growing popularity. This is echoed by Leo Jiang, the former Huawei executive, who noted that the Qwen models offer higher accuracy and factuality than most other China-based models and can be customized for enterprise use cases that prioritize accurate outputs and aim to minimize hallucinations. In addition, Qwen’s biggest edge right now is that it provides developers with a powerful yet cost-effective alternative.
How to Best Utilize Qwen?
Qwen stands out as both a competitive and commercially viable large language model (LLM). Its widespread adoption in the open-source community ensures broader validation and support, while its deployment is backed by world-class infrastructure from Alibaba Cloud. These factors make Qwen a strong choice for enterprises. Below are the four key steps to guide your Qwen enterprise deployment.
- Define Business Objectives and Use Cases: Focus on high-impact use cases, such as automating customer support, enhancing data analysis, or improving content generation.
- Data Preparation and Infrastructure Setup: Assess and prepare the data required for training and fine-tuning the Qwen model. This includes cleaning, structuring, and ensuring the availability of relevant datasets, as 60–70% of the overall cost typically lies in this layer.
- Pilot Project and Iterative Evaluation: Start with a small-scale pilot project, compare outcomes against predefined KPIs, and iterate quickly for improvements (a minimal evaluation sketch follows this list).
- Scale Up and Integration: Fully integrate Qwen into your existing workflows to harness its full potential, while establishing a governance structure to monitor and optimize its performance.
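As a rough illustration of the pilot step, the sketch below runs a handful of customer-support prompts against a hosted Qwen endpoint and records latency and response length as simple KPI proxies. The endpoint URL, model name, prompts, and threshold are hypothetical placeholders; substitute whatever your pilot actually measures.

```python
# Hypothetical pilot-evaluation loop: send sample prompts to a Qwen endpoint and
# log simple proxies (latency, response length) for comparison against pilot KPIs.
# Endpoint URL, model name, and threshold are placeholders, not Alibaba defaults.
import os
import time
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],                      # assumed env var
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # assumed endpoint
)

pilot_prompts = [
    "A customer asks how to return a damaged item. Draft a reply.",
    "Summarize this complaint in one sentence: the parcel was late and the box was crushed.",
]
MAX_LATENCY_S = 3.0  # hypothetical KPI threshold

for prompt in pilot_prompts:
    start = time.time()
    resp = client.chat.completions.create(
        model="qwen-turbo",  # assumed lower-cost tier for piloting
        messages=[{"role": "user", "content": prompt}],
    )
    latency = time.time() - start
    answer = resp.choices[0].message.content
    print(f"latency={latency:.2f}s (target <= {MAX_LATENCY_S}s), chars={len(answer)}")
    print(answer[:200], "\n---")
```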
Alibaba Cloud
“AI and the cloud business are like the left hand and the right hand,” said Joe Tsai on a podcast with Nicolai Tangen, head of Norway’s sovereign wealth fund. As mentioned earlier, anyone can use Alibaba’s LLM through APIs or go directly to its open-source models. However, anyone who wants to deploy Qwen needs cloud computing power, and Alibaba Cloud is there to provide it.
Currently, 80% of China’s technology companies and half of the country’s large-model companies run on Alibaba Cloud, a scale that is simply unmatched. Tsai reiterated that, as the largest cloud provider in APAC, Alibaba has a huge advantage in garnering data and trial usage for Tongyi Qianwen. This positive cycle allows the two businesses, sitting at different AI layers, to continuously feed into each other.
In addition, the company has created ModelScope, China’s largest open-source model community, which hosts many other open-source models. When developers use those models, they also need compute power, which has become a main driver of Alibaba’s cloud revenue.
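Since developers who pull models from ModelScope still need compute to run them, here is a rough sketch of that path: download a Qwen checkpoint with the modelscope library, then run it with vLLM, one of the frameworks Qwen2.5 is documented to support. The model ID and sampling settings are assumptions for illustration, and a GPU large enough for the chosen checkpoint is required.

```python
# Rough sketch: fetch an open-weight Qwen checkpoint from ModelScope and run it
# with vLLM for batched offline inference. Model ID and sampling parameters are
# illustrative assumptions; the chat template is skipped here for brevity.
from modelscope import snapshot_download
from vllm import LLM, SamplingParams

# Download the weights from the ModelScope hub to a local cache directory.
model_dir = snapshot_download("Qwen/Qwen2.5-7B-Instruct")  # assumed ModelScope model ID

llm = LLM(model=model_dir)
params = SamplingParams(temperature=0.7, max_tokens=256)

prompts = ["Explain in two sentences what an open-source LLM is."]
for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```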
By providing cloud infrastructure to startups, the tech giant hedges its bets and gains firsthand exposure to the best consumer-facing applications. Providing the infrastructure also gives the company access to a diverse pool of data across domains and use cases, which it could potentially leverage to fine-tune its own models if given permission, and it makes talent acquisition and exposure to innovation in the field more accessible.
Alibaba’s AI Applications
So let’s take a look at the application front. Alibaba has integrated AI into its own operations extensively, utilizing it for product recommendations on its e-commerce platforms, intelligent customer service, AI-empowered advertisement targeting, and AI-driven solutions in cloud services. Additionally, it is exploring ways to use AI to enhance logistics efficiency and other use cases. Today, let’s look at a few of the more mature ones.
Artificial Intelligence Online Serving (AI OS) is a platform developed by the company’s search engineering team. AI OS integrates personalized search, recommendation, and advertising, supporting various business scenarios across Alibaba’s platforms, mostly focusing on marketplace apps such as Taobao. The technology, originally focused on Taobao’s search capabilities, has expanded to include deep learning technologies and various engines for search and recommendation.
Dingtalk is enterprise chat software, similar to Slack. All Dingtalk products have been AI-enabled through an embedded AI agent for enterprise and personal use, launched at the beginning of 2024. The agent is a virtual assistant that can examine data analytics and is equipped with memory, planning, and execution capabilities.
Users interact with the agent through a chatbot interface similar to ChatGPT. The company’s suggested use cases include deploying the agent as sales, IT, HR, administrative, financial, or procurement staff, helping companies automate many of the repetitive, tedious tasks in the management process.
Meanwhile, Alimama is a platform that helps brands optimize ads on Alibaba’s e-commerce marketplaces, Tmall and Taobao. A relatively little-known business unit of Alibaba, it was actually founded early on, in 2007, as a digital marketing platform for businesses selling on Taobao and Tmall. Its AI-empowered multimedia LMA model was launched in April this year and has now been fully applied to business-facing applications. The tools include AI sales agents capable of handling client inquiries and performing basic ad-design tasks to enhance efficiency and quality. Additionally, Alimama offers sales analytics for budgeting and pricing, inventory management tools to boost ROI, and cost-effective text-to-image and text-to-video generation services for advertisements. The company claims to have served over one million merchants on the platform and to have significantly reduced advertising production costs.
Investing to Capture All Possibilities (Opportunities)
Alibaba has actively acquired and invested in several promising AI companies across the layers, particularly AI chip developers and LLM developers. These strategic moves aim to expand Alibaba’s opportunities in the rapidly evolving AI landscape.
In 2024 alone, Alibaba has led major funding rounds for multiple AI firms: a $1 billion investment in Moonshot AI, which has seen its valuation soar to approximately $2.5 billion; a $691 million round for Baichuan, raising its valuation to around $2.8 billion; and a commitment of more than $600 million to MiniMax. That covers three of the four so-called “tigers.”
The four most valuable AI startups in China have been nicknamed the “Four AI (small) Tigers.” All of them were founded within the last three to five years and have already achieved monumental success: Moonshot is reportedly valued at $3 billion, MiniMax at $2 billion, Zhipu AI has raised nearly $800 million, and Baichuan is said to be valued near $2 billion.
Alibaba’s Chips: T-Head
Last, and often overlooked, are Alibaba’s efforts in hardware development. News flash: Huawei is not the only Chinese big tech company developing chips.
Alibaba’s chip venture, T-Head, is making significant strides in the development of RISC-V architecture as part of China’s broader push for semiconductor self-sufficiency amid ongoing U.S. trade restrictions. T-Head has focused on creating high-performance chips that can support various applications, including artificial intelligence (AI), big data analysis, and online transactions.
One of T-Head’s notable products is the Zhenyue 510, a controller chip designed for enterprise solid-state drives (SSDs). Launched at Alibaba’s Apsara cloud computing conference, this chip promises to enhance performance in Alibaba Cloud’s data centers by providing a 30% reduction in latency for input and output operations compared to existing solutions. This innovation is critical as it allows Alibaba to optimize its cloud services and improve efficiency in handling large-scale data processing tasks.
As China continues to navigate restrictions on U.S. technology, T-Head’s focus on RISC-V represents a strategic move toward greater independence in chip design and manufacturing.
What we know is that Alibaba has taken a holistic approach to its AI strategy. It spans a comprehensive technology stack and has positioned the company as a key player in the ecosystem, foundations that further propel the Qwen models. Built on infrastructure-level scalability, down to the chip level, the Qwen models are designed to support diverse applications across Alibaba’s extensive e-commerce, app, and investment ecosystem. This strategic focus not only enhances the models’ capabilities but also ensures their relevance and effectiveness in enterprise use cases that prioritize accuracy and minimize hallucinations. Alibaba has successfully positioned itself as one of the most important players, if not THE most important, in China’s AI ecosystem.
Sources: interviews, industry reports, expert insights, company announcements, investor relations material, transcripts from the Apsara Conference, and Alizila.
Links
- Alibaba Cloud official link to Qwen
- Qwen2.5-LLM Instructions, last updated September 2024
- GitHub Qwen2.5: a series of large language models available in a variety of parameter scales (from 0.5B to 72B), with improved capabilities in long-text generation, instruction following, and structured-data understanding, and support for 29 languages. Suitable applications include code generation, text generation, and complex data processing. Qwen2.5 offers features such as quantization, inference, and local deployment, and is compatible with various computational frameworks, including Hugging Face, ModelScope, and vLLM.
- GitHub Qwen-VL: a large-scale vision-language model that supports both image and text inputs and has multilingual conversation capabilities, excelling in Chinese and English image-text recognition. The model supports high-resolution image processing and fine-grained recognition, outperforming most open-source models.
- GitHub Qwen-Audio: capable of processing various audio inputs (such as human speech, natural sounds, and music) and generating text outputs. The model is suitable for tasks such as audio recognition, audio description, scene classification, and emotion recognition.
- GitHub Qwen2.5-Math: supports solving mathematical problems in both Chinese and English and integrates Chain of Thought (CoT) and Tool-Integrated Reasoning (TIR).
- GitHub Qwen2.5-Coder: the latest open-source programming model, supporting a 128K context window and covering 92 programming languages.
Author Bio
Grace writes about AI x Energy, AI x Geopolitics, and AI x Big Tech on Substack at AI Proem.
She also writes commentaries for Fortune, The Diplomat, and other international publications on AI, tech, and corporate governance. In her past life as a journalist, Grace reported for CNBC on Asia tech and business out of Singapore, and her work has also been published in the SCMP, S&P Global Market Intelligence, Yahoo Finance, and USA Today.
Top 5 reads from Grace:
- AI arms race is far from over: chips are only half the game, and infrastructure is the other
- Why Data Centers Can’t Go Full Renewable—Yet
- Big Tech Earnings: All Hands on Deck for AI
- Baidu, Alibaba, Tencent: AI Showdown
- A Whole (New) Nuclear World
Next big deep dive to come: Huawei vs. Nvidia and Google