Jack Ma-backed Ant Group develops cost-effective AI training on Chinese-made chips

Ant Group, backed by Jack Ma, has reportedly developed AI training techniques that use Chinese-made chips and cut training costs by about 20%. Sources indicate Ant used domestic chips from Alibaba and Huawei, among others, to train models with the Mixture of Experts (MoE) machine learning approach, and that the results were comparable to those produced with Nvidia's H800 chips. While Ant still uses Nvidia hardware for some AI development, it now relies primarily on alternatives from AMD and Chinese manufacturers for its latest models.


This development places Ant in competition with both Chinese and US companies, particularly following DeepSeek's demonstration of cost-effective model training. Ant's efforts highlight the Chinese tech industry's drive to find domestic alternatives to advanced Nvidia semiconductors, many of which are restricted under US export controls.

A recent Ant research paper claims its models outperformed Meta Platforms' models on specific benchmarks, a result that could advance Chinese AI by lowering inference costs. The MoE technique is central to Ant's approach: rather than running one monolithic network, the model is split into smaller, specialized "expert" subnetworks, and a router activates only a few of them for each input, so only a fraction of the parameters is computed per token. MoE training typically relies on high-performance GPUs such as Nvidia's, but Ant has focused on training LLMs more efficiently, aiming to scale models "without premium GPUs."
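To make the idea concrete, here is a minimal, hypothetical sketch of a top-k MoE feed-forward layer in PyTorch. The dimensions, expert count and routing scheme are illustrative assumptions, not details from Ant's paper; the point is simply that each token only activates a small subset of the experts.

```python
# Illustrative top-k Mixture-of-Experts layer (PyTorch).
# Sizes and top_k are assumptions for the sketch, not Ant's configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=512, d_hidden=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Each expert is a small independent feed-forward network.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        ])
        # The router scores every expert for every token.
        self.router = nn.Linear(d_model, n_experts)

    def forward(self, x):                        # x: (batch, seq, d_model)
        scores = self.router(x)                  # (batch, seq, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # normalize over chosen experts
        out = torch.zeros_like(x)
        # Only the top_k experts run for each token; the rest stay idle,
        # which is where the compute (and cost) savings come from.
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = (idx[..., k] == e)        # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out

layer = MoELayer()
y = layer(torch.randn(2, 16, 512))               # output shape: (2, 16, 512)
```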

This approach contrasts with Nvidia's focus on developing increasingly powerful GPUs. Ant claims its optimized method reduced the cost of training on one trillion tokens from approximately 6.35 million yuan to 5.1 million yuan, a saving of roughly 20%, consistent with the figure above.

Ant plans to deploy its Ling-Plus and Ling-Lite language models in industrial AI applications, including healthcare and finance, following its acquisition of Haodf.com. The company also offers AI-powered services like Zhixiaobao and Maxiaocai.

In English-language benchmarks, Ant reported Ling-Lite outperformed a Meta Llama model, while both Ling-Lite and Ling-Plus surpassed DeepSeek’s models in Chinese-language benchmarks. Ling-Lite features 16.8 billion parameters, while Ling-Plus has 290 billion. For reference, GPT-4.5 is estimated to have 1.8 trillion parameters, and DeepSeek-R1 has 671 billion.

Ant acknowledged challenges during training, noting that even small changes to hardware or model structure could cause instability, including spikes in error rates. The company has made the Ling models open source.