Forget ChatGPT? China’s DeepSeek is working on smarter, self-improving AI models
After shaking up Silicon Valley with its AI models earlier this year, Chinese startup DeepSeek is working on another innovation to help reduce operational costs. The company, led by Liang Wenfeng, has been working with researchers at Tsinghua University to develop a new approach called generative reward modelling (GRM), which rewards an AI model for responses that align with human preferences.
The new approach, first revealed in a pre-print paper (via Bloomberg), uses a technique called self-principled critique tuning (SPCT) to make AI models smarter and more efficient in a self-improving way.