DeepSeek unveils new AI reasoning method as anticipation for its next-gen model rises

Chinese AI start-up DeepSeek has unveiled an innovative method to enhance the reasoning abilities of large language models (LLMs), as anticipation builds for the release of its next-generation model.


In collaboration with researchers from Tsinghua University, DeepSeek introduced a dual technique that merges generative reward modelling (GRM) with self-principled critique tuning, according to a paper published Friday. This combined approach is designed to improve the speed and accuracy of LLMs when responding to general queries.


The resulting DeepSeek-GRM models reportedly delivered strong results, performing on par with top publicly available reward models. Reward modelling guides LLMs to align more closely with human preferences.
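The paper's exact recipe is beyond the scope of this article, but the core idea behind generative reward modelling can be sketched briefly. The Python below is a hypothetical illustration only, not DeepSeek's implementation: the prompt template, the "SCORE:" output convention, and the StubLLM stand-in are all assumptions.

```python
# Hypothetical sketch of the generative-reward-model idea; NOT DeepSeek's code.
# A conventional scalar reward model maps (query, response) to one opaque
# number. A generative reward model instead writes out principles and a
# critique as text, then a score is parsed from that text, so the judgement
# itself is readable and can be refined.

CRITIQUE_PROMPT = """\
State the principles a good answer to this query should satisfy,
critique the candidate response against them, then end with 'SCORE: <0-10>'.

Query: {query}
Response: {response}
"""


class StubLLM:
    """Stand-in for a real LLM client; returns a canned critique."""

    def generate(self, prompt: str) -> str:
        return ("Principle: be factually accurate and complete.\n"
                "Critique: the response is correct and concise.\n"
                "SCORE: 8")


def generative_reward(llm: StubLLM, query: str, response: str) -> tuple[str, float]:
    """Ask the model for principles plus a critique, then parse a numeric score."""
    critique = llm.generate(CRITIQUE_PROMPT.format(query=query, response=response))
    score_line = next(line for line in critique.splitlines()
                      if line.startswith("SCORE:"))
    return critique, float(score_line.split(":", 1)[1])


if __name__ == "__main__":
    text, score = generative_reward(StubLLM(), "What is 2 + 2?", "4")
    print(score)  # 8.0
```

Because the judgement is produced as text rather than a bare number, it can be inspected and, in principle, improved through critique-based tuning of the kind the paper describes.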


While the researchers noted that DeepSeek plans to open-source the GRM models, they did not provide a specific timeline.


The academic paper, available on arXiv, comes amid speculation over DeepSeek’s next steps following the widespread attention received by its V3 foundation model and R1 reasoning model.


According to Reuters, the company’s next-generation model, DeepSeek-R2, could launch as early as this month, as the start-up seeks to build on its growing momentum. The debut of DeepSeek-R1 previously made waves in the global tech community for offering high performance at a relatively low cost.


However, DeepSeek has remained silent on the rumoured R2 release. Although the company has made no official statement, a customer service representative reportedly denied the suggested launch timing in a group chat with business clients, according to Chinese media reports.


DeepSeek did not immediately respond to a request for comment on Friday.



Hangzhou-based DeepSeek, founded in 2023 by entrepreneur Liang Wenfeng, has been in the global spotlight in recent months but has largely maintained a low public profile, choosing instead to concentrate on research and development.


Just last month, the company released an upgraded version of its V3 model, named DeepSeek-V3-0324, which it claims features “enhanced reasoning capabilities, improved front-end web development, and better Chinese writing proficiency.”


In February, DeepSeek open-sourced five of its code repositories, inviting developers to explore and contribute to its software. The company pledged to make “sincere progress with full transparency.”


Also in February, founder Liang published a technical study on “native sparse attention,” a technique aimed at increasing the efficiency of LLMs when handling large volumes of data.
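The article does not detail the mechanism, but the general motivation for sparse attention is straightforward: if each query attends to only a small subset of keys, the cost of long sequences drops sharply. The sketch below uses a generic top-k masking scheme; it is an assumption for illustration and is not the "native sparse attention" design from Liang's paper.

```python
# Generic top-k sparse attention sketch, for illustration only.
import numpy as np


def topk_sparse_attention(Q, K, V, k=4):
    """Each query attends only to its k highest-scoring keys, cutting the
    per-query work from O(seq_len) to O(k) in a real kernel. For clarity,
    this sketch computes the full score matrix and masks it afterwards;
    a production kernel would never compute the masked entries at all."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])              # (n_q, n_k)
    # Keep each row's top-k scores; set the rest to -inf before the softmax.
    kth = np.partition(scores, -k, axis=-1)[:, -k:].min(axis=-1, keepdims=True)
    scores = np.where(scores >= kth, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V


rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((8, 16)) for _ in range(3))
print(topk_sparse_attention(Q, K, V, k=4).shape)  # (8, 16)
```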


Liang, 40, is also the founder of High-Flyer Quant, the hedge fund that provides substantial financial backing for DeepSeek's research.


In late February, Liang participated in a symposium for tech entrepreneurs hosted by Chinese President Xi Jinping in Beijing. DeepSeek was praised during the event as a symbol of China's resilience in the face of US efforts to curtail the nation's AI development.

