DeepSeek Claims Its Reasoning-Focused AI Model Can Outperform OpenAI’s o1


DeepSeek-R1, a reasoning-focused artificial intelligence (AI) model by the Chinese firm DeepSeek, was released on Monday. It is the full version of the open-source AI model, arriving two months after the preview version. The model is available to download and can also be accessed as a plug-and-play application programming interface (API). The Chinese AI firm claimed that DeepSeek-R1 outperforms OpenAI's o1 model on several benchmarks for mathematics, coding, and reasoning-based tasks.

DeepSeek-R1 AI Models Cost Up to 95 Percent Less than OpenAI’s o1

There are two variants in the latest series, DeepSeek-R1 and DeepSeek-R1-Zero. Both are built on top of another large language model (LLM) developed by the AI firm, dubbed DeepSeek-V3. The new AI models use a mixture-of-experts (MoE) architecture, in which several smaller expert networks sit inside the larger model and only a subset of them is activated for any given query, improving efficiency without sacrificing capability.
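
The announcement does not detail how DeepSeek's routing works, but the general idea of an MoE layer can be pictured with the short sketch below: a gating network scores a pool of expert sub-networks and only the top-scoring few actually run for each token. The number of experts and the random weights here are generic stand-ins, not DeepSeek's implementation.

```python
# Generic top-k mixture-of-experts routing sketch (illustrative, not DeepSeek's code).
import numpy as np

rng = np.random.default_rng(0)
NUM_EXPERTS, TOP_K, HIDDEN = 8, 2, 16

# Stand-in experts: each is just a random linear map here.
experts = [rng.standard_normal((HIDDEN, HIDDEN)) for _ in range(NUM_EXPERTS)]
gate = rng.standard_normal((HIDDEN, NUM_EXPERTS))

def moe_layer(token: np.ndarray) -> np.ndarray:
    """Route one token through only the top-k experts chosen by the gating network."""
    scores = token @ gate                                    # one routing score per expert
    chosen = np.argsort(scores)[-TOP_K:]                     # keep the best-scoring experts
    weights = np.exp(scores[chosen]) / np.exp(scores[chosen]).sum()  # softmax over chosen experts
    # Only TOP_K of the NUM_EXPERTS experts actually run, which is where the efficiency comes from.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, chosen))

print(moe_layer(rng.standard_normal(HIDDEN)).shape)          # (16,)
```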

The DeepSeek-R1 AI models are currently available to download via the company's Hugging Face listing. The model comes with an MIT licence that allows both academic and commercial usage. Those who do not intend to run the LLM locally can opt for the model API instead. The company announced the inference pricing of the model, highlighting that it costs 90-95 percent less than OpenAI's o1.
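
For those who take the API route, DeepSeek's endpoint follows the OpenAI-compatible convention, so it can be called from the standard openai Python client as sketched below. The base URL and model identifier used here are assumptions for illustration and should be checked against DeepSeek's API documentation.

```python
# Minimal sketch: calling an OpenAI-compatible endpoint with the openai Python client.
# The base_url and model name below are assumed values, not taken from the announcement.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",       # placeholder key
    base_url="https://api.deepseek.com",   # assumed endpoint
)

response = client.chat.completions.create(
    model="deepseek-reasoner",             # assumed identifier for DeepSeek-R1
    messages=[{"role": "user", "content": "How many prime numbers are there below 50?"}],
)

print(response.choices[0].message.content)
```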

Currently, the DeepSeek-R1 API comes with an input price of $0.14 (roughly Rs. 12.10) per million tokens and the output price is set at $2.19 (roughly Rs. 189.50) per million tokens. In comparison, OpenAI's o1 API costs $7.50 (roughly Rs. 649) per million input tokens and $60 (roughly Rs. 5,190) per million output tokens.
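
To put those per-token rates in perspective, the snippet below works out the bill for a hypothetical workload of one million input tokens and one million output tokens at each provider's quoted prices. The workload is made up purely for illustration, and the exact saving depends on how a real workload splits between input and output tokens.

```python
# Back-of-the-envelope cost comparison at the per-million-token rates quoted above (USD).
PRICES = {
    "DeepSeek-R1": {"input": 0.14, "output": 2.19},
    "OpenAI o1":   {"input": 7.50, "output": 60.00},
}

def workload_cost(rates: dict, input_millions: float, output_millions: float) -> float:
    """Total cost in dollars for a workload measured in millions of tokens."""
    return rates["input"] * input_millions + rates["output"] * output_millions

# Hypothetical workload: 1 million input tokens and 1 million output tokens.
for name, rates in PRICES.items():
    print(f"{name}: ${workload_cost(rates, 1, 1):.2f}")
# DeepSeek-R1: $2.33
# OpenAI o1: $67.50
```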

Not only does DeepSeek-R1 cost less, but the company also claims that it offers higher performance than its OpenAI counterpart. Based on internal testing, the AI firm stated that DeepSeek-R1 outperformed o1 on the American Invitational Mathematics Examination (AIME), MATH-500, and SWE-bench benchmarks. However, the difference between the models is marginal.

Coming to post-training, the company said it applied reinforcement learning (RL) directly to the base model without any supervised fine-tuning (SFT). This approach, also known as pure RL, gives the model more freedom when solving complex problems using the chain-of-thought (CoT) mechanism. DeepSeek claimed that this is the first open-source AI project to use pure RL to improve reasoning capabilities.
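
DeepSeek has not published its training loop in this announcement, but the underlying idea of rewarding correct answers directly, with no supervised labels, can be pictured with a toy policy-gradient loop like the one below. The question, the candidate answers, and the REINFORCE-style update are schematic stand-ins, not the company's actual recipe, which operates on a full LLM's chain-of-thought outputs.

```python
# Toy sketch of reward-driven post-training with no supervised fine-tuning:
# a REINFORCE-style update on a trivially small "policy". Conceptual only.
import numpy as np

rng = np.random.default_rng(0)
ANSWERS = ["2", "3", "4", "5"]        # candidate final answers for a toy question
CORRECT = "4"                         # e.g. the prompt asks "what is 2 + 2?"
logits = np.zeros(len(ANSWERS))       # the "policy" is a softmax over these logits
lr = 0.5

for step in range(200):
    probs = np.exp(logits) / np.exp(logits).sum()
    sampled = rng.choice(len(ANSWERS), p=probs)             # roll out an answer
    reward = 1.0 if ANSWERS[sampled] == CORRECT else 0.0    # rule-based reward, no SFT labels
    grad = -probs                                           # d(log prob)/d(logits) = one-hot - probs
    grad[sampled] += 1.0
    logits += lr * reward * grad                            # push up rewarded answers

final = np.exp(logits) / np.exp(logits).sum()
print({a: round(p, 2) for a, p in zip(ANSWERS, final)})     # probability mass shifts to "4"
```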
