‘A sports car at half price’
TU Professor Kersting assesses the significance of the DeepSeek-R1 AI language model
2025/09/22
Training large language models is costly and resource-intensive. At the beginning of the year, Chinese provider DeepSeek introduced a so-called reasoning language model that achieved results similar to those of established models but required fewer resources for training and operation. TU Professor Kristian Kersting from the Department of Computer Science has now commented in a detailed statement for the Science Media Centre Germany (SMC) on the effects of training without human feedback and the advantages and disadvantages of the model.

In his article, Kersting sees the release of the DeepSeek-R1 AI model in early 2025 as a turning point in the development of large language models and draws parallels with the automotive industry: DeepSeek is reminiscent of a sports car with Ferrari performance – but at half the price. The decisive factor here is not only performance, but above all the efficiency of the training process: the model learns largely without time-consuming human feedback, through trial and reward.
‘In 2024, the development of large language models still seemed to be stagnating: more data and computing power brought hardly any noticeable progress,’ says Kersting. DeepSeek-R1 has brought movement to the field of research. Since then, the focus has shifted away from data volumes and towards thought processes, training time and methodological finesse. Even major players such as OpenAI and Google have adapted their strategies as a result. The fact that the renowned journal Nature subsequently published the underlying study is an indication of the scientific relevance of the model beyond mere product announcements.
Kersting emphasises that although US models continue to dominate, DeepSeek-R1 has established itself as a highly efficient alternative. It proves that clever training methods can be more important than pure computing power. In addition, the model has proven itself as a reference for research and the open source scene. Its potential is particularly evident in programming support: AI assistants based on DeepSeek variants could not only improve code quality, but also lower the barriers to entry into software development.
‘Humans remain important’
Kersting also comments on the training approach. DeepSeek-R1 has triggered a new trend: AI systems are increasingly learning from each other – machine feedback is increasingly replacing human evaluation. ‘However, humans remain important for ensuring quality, copyright, security and style.’
‘Reinforcement learning is probably the key tool at present for enabling machines to “think” better,’ says Kersting. ‘But research is going much further. For example, the Cluster of Excellence “Reasonable Artificial Intelligence” at TU Darmstadt is working on extensions to the basic idea of DeepSeek-R1: the researchers are developing a new type of AI that combines knowledge with logical thinking and continuous learning – an AI that adapts to a constantly changing world, similar to biological systems. This is creating a new generation of adaptive and flexible AI.’
Kersting does not rule out risks. So-called ‘reward hacking’ is a particular challenge: AI models could learn to trick reward systems without actually solving the task at hand. Just as taxi drivers might take deliberate detours to charge higher fares, even though good service should be rewarded, language models could give seemingly helpful but incorrect answers. In DeepSeek-R1-Zero, this has manifested itself in language mixtures and repetition loops, among other things. To counteract such effects, multi-stage tests and human reviews are now necessary. ‘There is still a lot to do,’ Kersting concludes.
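The reward-hacking failure mode described above can be illustrated with a toy example (a hypothetical sketch, not DeepSeek's actual training setup): if the reward function scores answers by a proxy such as verbosity rather than correctness, a learner that maximises reward will exploit the proxy instead of solving the task – just like the taxi driver taking detours.

```python
# Toy illustration of reward hacking (hypothetical example, not DeepSeek's
# real reward model): the intended task is to answer "2+2" correctly, but
# the flawed reward scores answer length as a proxy for helpfulness.

CORRECT = "4"

CANDIDATES = [
    "4",                                                  # correct, terse
    "The answer is 4",                                    # correct, polite
    "Let me think step by step... " * 5 + "so maybe 5",   # verbose, wrong
]

def flawed_reward(answer: str) -> float:
    """Proxy reward: longer answers look more 'helpful'."""
    return float(len(answer))

def true_quality(answer: str) -> float:
    """What we actually wanted: a correct answer."""
    return 1.0 if CORRECT in answer and "5" not in answer else 0.0

# A trivial 'learner' that simply picks the candidate maximising reward.
best = max(CANDIDATES, key=flawed_reward)

print("chosen answer:", best)
print("proxy reward:", flawed_reward(best))
print("true quality:", true_quality(best))
# The learner 'hacks' the reward: highest proxy score, wrong answer.
```

In real systems the exploit is subtler, which is why the multi-stage tests and human reviews mentioned above are needed to catch answers that merely look helpful.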
Kersting heads the Machine Learning Lab at TU Darmstadt. He is also a founding member and co-director of the Hessian Centre for Artificial Intelligence (hessian.AI), and a member of the DFKI Lab at TU Darmstadt.
About the excellence strategy of the federal and state governments
The Excellence Strategy is a funding programme run by the federal and state governments to strengthen cutting-edge research in Germany. In order to receive funding, applicants must undergo a highly competitive, multi-stage selection process.
The Excellence Strategy comprises two funding lines that build on each other. In the ‘Clusters of Excellence’ funding line, coordinated by the German Research Foundation (DFG), internationally competitive research areas at German universities receive project-based funding.
Two research projects at TU Darmstadt are being funded as clusters of excellence: The Adaptive Mind (TAM) and Reasonable Artificial Intelligence (RAI), a joint application with Justus Liebig University Giessen and Philipps University Marburg.
The aim of the ‘Universities of Excellence’ funding line is to support German universities in expanding their international leadership in research, either as individual institutions or as consortia.
The Technical University of Darmstadt, together with Goethe University Frankfurt and Johannes Gutenberg University Mainz, has applied for the title of ‘University of Excellence’ as the Rhine-Main Universities (RMU) alliance.
SMC/cst