A chatbot developed by Chinese AI startup DeepSeek has ascended to the top of Apple’s App Store charts in the US, dethroning OpenAI’s ChatGPT. This surge in popularity follows the release of DeepSeek’s R1 reasoning model, which the company claims rivals OpenAI’s o1 in performance while requiring significantly fewer resources.
DeepSeek’s V3 LLM, the foundation for R1, is reportedly on par with GPT-4o and Anthropic’s Claude 3.5 Sonnet, yet the company says it was developed for a fraction of the cost: under $6 million, compared with an estimated $100 million for GPT-4. Furthermore, DeepSeek claims to have trained V3 using only 2,000 specialized chips, a stark contrast to the 16,000 or more chips typically required for leading models.
These claims, while unverified, have sent shockwaves through the tech industry. Nvidia’s stock plummeted over 12% in pre-market trading, mirroring declines across the tech sector, including Microsoft, Google, and other companies heavily invested in AI infrastructure.
DeepSeek’s success raises questions about the massive investments made by leading AI companies in supercomputing. The Stargate Project, a joint venture of OpenAI, SoftBank, Oracle, and MGX that counts Nvidia among its technology partners, is planned to cost up to $500 billion, with an initial $100 billion to be deployed first.
DeepSeek, founded by Liang Wenfeng, has reportedly circumvented US export restrictions by stockpiling Nvidia A100 chips and combining them with less powerful, readily available chips. This constrained hardware environment has seemingly forced the company to innovate in model efficiency, leading to lower training costs and potentially greater scalability.
While DeepSeek’s claims are subject to scrutiny, the company’s rapid ascent highlights the potential for alternative approaches to AI development. However, the intense interest in DeepSeek has also attracted malicious actors, prompting the company to temporarily limit new registrations.