DeepSeek, a Chinese artificial intelligence (AI) startup, is the new revolutionary force in the global AI industry. Founded in May 2023 by Liang Wenfeng, a prominent figure in both the hedge fund and AI sectors, the company’s innovative approach to AI development has captured the world’s attention.
The startup’s cutting-edge large language models (LLMs), combined with its disruptive pricing strategy, have reshaped the AI landscape and drawn comparisons to leading players like OpenAI.
DeepSeek entered the competitive AI market in November 2023 with the release of DeepSeek Coder, only months after the company was founded.
DeepSeek Coder is an open-source model specifically designed for coding tasks. This release marked an important step in the company’s development, alongside the introduction of DeepSeek LLM later that month, which features 67 billion parameters.
In simple terms, parameters are the “brain cells” of a language model. The more parameters, the more complex and potentially powerful the model is.
To put this into perspective:
At 67 billion parameters, DeepSeek LLM is a substantial and advanced language model.
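One rough way to picture that scale is the memory needed just to hold the model's weights. The sketch below is a back-of-the-envelope estimate, not a figure from DeepSeek: the helper name `weight_memory_gb` is hypothetical, and the 2-byte and 4-byte storage sizes are assumptions about common 16-bit and 32-bit number formats.

```python
def weight_memory_gb(num_parameters, bytes_per_parameter):
    """Approximate memory (in gigabytes) needed to store the weights alone.

    Ignores everything else a running model needs (activations, caches, etc.).
    """
    return num_parameters * bytes_per_parameter / 1e9


DEEPSEEK_LLM_PARAMETERS = 67_000_000_000  # 67 billion, as stated in the article

# Assumed storage formats: 2 bytes (16-bit) and 4 bytes (32-bit) per parameter.
for label, size in [("16-bit", 2), ("32-bit", 4)]:
    gb = weight_memory_gb(DEEPSEEK_LLM_PARAMETERS, size)
    print(f"{label} weights: about {gb:.0f} GB")
```

Under these assumptions the weights alone run to well over a hundred gigabytes, far more than the memory of a typical laptop or phone.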
DeepSeek-V2 was released in May 2024 and quickly disrupted the Chinese AI market due to its aggressively low pricing. As a result, major Chinese tech companies such as ByteDance, Tencent, Baidu, and Alibaba have been compelled to lower their pricing structures in response to DeepSeek’s strategy.
The pricing disruption initiated by DeepSeek has not only affected domestic players but also resonated internationally.
However, the real breakthrough came with the launch of DeepSeek-V3 in late December 2024 and DeepSeek-R1 in January 2025, models that rival ChatGPT in performance but operate at a fraction of the cost. DeepSeek reports that V3 was trained on Nvidia’s lower-powered H800 chips for under $6 million. DeepSeek-R1, released as open source on January 20, is based on DeepSeek-V3.
These models stand out for their ability to deliver high-quality results while using significantly less computational power, making them both cost-effective and efficient.
After that release, OpenAI CEO Sam Altman acknowledged that “DeepSeek’s R1 is an impressive model, particularly around what they’re able to deliver for the price.” Altman nonetheless asserted that his own team would keep pushing the boundaries of innovation, driven by its research roadmap and its commitment to harnessing computing power.
Altman predicts that “the world is going to want to use a LOT of AI, and really be quite amazed by the next gen models coming.”
Altman looks forward to “bringing you all AGI and beyond.”
Microsoft CEO Satya Nadella wrote on X that the DeepSeek phenomenon was just an example of the Jevons paradox, writing, “As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can’t get enough of.”
Yann LeCun, Meta’s chief AI scientist, wrote on LinkedIn that DeepSeek’s success is indicative of changing tides in the AI sector to favor open-source technology.
LeCun wrote that DeepSeek has profited from some of Meta’s technology, namely its Llama models, and that the startup “came up with new ideas and built them on top of other people’s work. Because their work is published and open source, everyone can profit from it. That is the power of open research and open source.”
DeepSeek’s training methodology sets it apart from traditional approaches. The company relies on a trial-and-error process known as reinforcement learning to enhance its models, mimicking the way humans learn through feedback.
Its use of Mixture-of-Experts (MoE) architecture enables the models to activate only a fraction of their parameters at any given time. This reduces computational costs without compromising on performance, allowing DeepSeek’s models to operate efficiently even on less powerful hardware.
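The idea of activating only some parameters can be sketched in a few lines. The example below is a toy illustration of top-k expert routing in general, not DeepSeek’s actual architecture: the function names, the eight-expert setup, and the trivial "experts" that just scale their input are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical expert: a stand-in for a feed-forward block."""
    return lambda x: [scale * v for v in x]


def moe_forward(x, router_weights, experts, top_k=2):
    """Run only the top_k experts chosen by the router for input x."""
    # One router score per expert: dot product of x with that expert's router vector.
    scores = [sum(w * v for w, v in zip(row, x)) for row in router_weights]
    probs = softmax(scores)
    # Keep the top_k most probable experts; the rest are never executed.
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    output = [0.0] * len(x)
    for i in chosen:
        gate = probs[i] / norm  # renormalised weight of this expert
        expert_out = experts[i](x)
        output = [o + gate * v for o, v in zip(output, expert_out)]
    return output, chosen


random.seed(0)
NUM_EXPERTS, DIM = 8, 4  # toy sizes, chosen for illustration
experts = [make_expert(scale=i + 1) for i in range(NUM_EXPERTS)]
router_weights = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

x = [0.5, -1.0, 0.25, 2.0]
output, chosen = moe_forward(x, router_weights, experts, top_k=2)
print(f"experts run: {sorted(chosen)} ({len(chosen)} of {NUM_EXPERTS})")
```

In this toy setup only two of the eight experts do any work for a given input, which is where the compute savings come from; a real model applies the same selection to much larger expert networks.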
As a result, DeepSeek has been able to challenge the notion that advanced AI requires massive investments in computing power and data infrastructure.
The versatility of DeepSeek’s models has contributed to their popularity:
DeepSeek-V3 is designed for general-purpose applications, meaning it can be applied to a wide range of tasks and industries, such as chatbots, language translation, text summarization, and more. Unlike models trained for specific tasks, like medical diagnosis or financial analysis, DeepSeek-V3 is designed to handle various tasks and domains.
DeepSeek-R1 is a specialized AI model that excels in tasks requiring advanced reasoning, which means it’s designed to:
Tasks that might require advanced reasoning include:
Both models are accessible through the company’s free chatbot, available on web and mobile platforms. Users can switch between the models depending on their needs, making the platform adaptable to a wide range of use cases.
However, the chatbot lacks some advanced features offered by competitors, such as AI-generated images and videos, as well as tools like Canvas and customized GPTs.
DeepSeek’s pricing strategy has been a game-changer. While its chatbot is free to use, the company charges just $0.55 per million input tokens and $2.19 per million output tokens for its API services.
In contrast, OpenAI charges $15 per million input tokens and $60 per million output tokens for comparable capabilities. This significant cost advantage has made DeepSeek an attractive option for developers and businesses looking to integrate AI into their operations without incurring exorbitant costs.
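To see what those rates mean in practice, the sketch below prices a single workload under both sets of rates. The per-million-token prices are the ones quoted above; the workload of 10 million input tokens and 2 million output tokens is a hypothetical example, and the function name `api_cost` is an assumption made for illustration.

```python
# Prices in US dollars per million tokens, as quoted in the article.
PRICES = {
    "DeepSeek": {"input": 0.55, "output": 2.19},
    "OpenAI": {"input": 15.00, "output": 60.00},
}


def api_cost(provider, input_tokens, output_tokens):
    """Return the API bill in dollars for the given token counts."""
    rates = PRICES[provider]
    return (
        input_tokens / 1_000_000 * rates["input"]
        + output_tokens / 1_000_000 * rates["output"]
    )


# Hypothetical monthly workload: 10 million input tokens, 2 million output tokens.
INPUT_TOKENS, OUTPUT_TOKENS = 10_000_000, 2_000_000

for provider in PRICES:
    cost = api_cost(provider, INPUT_TOKENS, OUTPUT_TOKENS)
    print(f"{provider}: ${cost:,.2f}")
```

For this example workload the bill comes to roughly $9.88 at DeepSeek’s rates against $270 at the OpenAI rates, a gap of about 27 times.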
The affordability and efficiency of DeepSeek’s models have sparked a reevaluation of the resources needed for AI development. The company’s use of Nvidia’s less advanced H800 chips for training has challenged the prevailing narrative that AI requires substantial financial and computational investments. This approach has led industry observers to question whether the massive expenditures by U.S. tech giants are justified or sustainable in the long term.
The release of DeepSeek’s R1 model has been described as AI’s “Sputnik moment” by Silicon Valley venture capitalist Marc Andreessen – a reference to the Soviet Union’s launch of the first satellite in 1957, which shocked the world and ignited the space race.
This analogy underscores the groundbreaking nature of the model and its potential to disrupt the global AI landscape.
In a separate post, Andreessen praised DeepSeek-R1 as “one of the most amazing and impressive breakthroughs I’ve ever seen – and as open source, a profound gift to the world.”
The model’s affordability and efficiency have also raised questions about the effectiveness of U.S. export controls aimed at limiting China’s access to advanced technology.
While DeepSeek’s rise has generated excitement, it has also raised concerns about data privacy and security.
Because DeepSeek is a Chinese company, its storage of user data on servers in China has drawn scrutiny, particularly in light of geopolitical tensions between the U.S. and China. These concerns have been further amplified by recent cyberattacks and service outages, which temporarily limited user access to DeepSeek’s platform. Despite these challenges, the company’s popularity continues to grow, with its app becoming the top-rated free application on Apple’s App Store in the United States shortly after its release.
The launch of its free AI assistant triggered a significant sell-off in technology stocks, as investors worried about the implications for industry leaders like Nvidia, Microsoft, and Alphabet. As of January 27:
Nvidia, in particular, experienced a historic one-day loss of over 17%, reflecting concerns that DeepSeek’s cost-effective models could reduce the demand for high-performance chips and large-scale data centres.
Nvidia was on track to lose more than $600 billion in stock market value, the deepest-ever one-day loss for a company on Wall Street, according to LSEG data, and more than double the previous one-day record loss, set by Nvidia last September.
The Nasdaq’s next-biggest drag was chipmaker Broadcom Inc., down more than 18%, followed by ChatGPT backer Microsoft, off 2.3%. Google parent Alphabet fell 3.4%.
The Philadelphia semiconductor index tumbled more than 10%, eyeing its biggest percentage drop since March 2020. The U.S. equity declines followed a selloff that started in Asia, where Japan’s SoftBank Group finished down 8.3%, and moved through Europe, where ASML fell 7%.
The disruptive pricing of DeepSeek’s models has also sparked discussions about a potential price war in the AI market. Analysts suggest that competitors like OpenAI may need to lower their prices to remain competitive, though this could prove challenging given their higher operational costs.
Some experts believe that U.S. companies will focus on trust and safety features to differentiate themselves, especially in enterprise markets. However, DeepSeek’s efficiency and affordability may still pose a significant challenge to established players.
For African audiences watching this post-OpenAI revolution, DeepSeek’s rise presents unique opportunities.
The continent has long faced barriers to technology adoption, including high costs and limited access to advanced infrastructure. DeepSeek’s affordable and efficient models could democratize AI access, enabling businesses, governments, and individuals to leverage AI for various applications.
From improving healthcare and education to enhancing agricultural productivity, the potential for transformative impact is immense. Moreover, DeepSeek’s approach challenges the perception that cutting-edge technology is the exclusive domain of wealthier nations, offering a model for how innovation can be both inclusive and impactful.
The Chinese company’s story also highlights the importance of fostering local innovation ecosystems. Africa’s growing pool of tech talent and startups could benefit from adopting similar approaches to AI development. By prioritizing efficiency and cost-effectiveness, African innovators could address local challenges while competing on a global scale.
The company’s achievements have also sparked debates about the future of AI and the role of smaller players in shaping the industry. Some analysts argue that DeepSeek’s success could lead to a more decentralized AI landscape, where power and influence are distributed across a broader range of stakeholders. This shift could have profound implications for innovation, competition, and the ethical use of AI, particularly in regions like Africa where the need for equitable technology solutions is critical.