Julia Vinogradska, research scientist at the Bosch Center for Artificial Intelligence, stands together with her doctoral supervisor Jan Peters, professor of Intelligent Autonomous Systems at TU Darmstadt, in front of a robot arm suspended on a frame. The robot has a table tennis racket tightly attached to its end effector as though it were ready for a game. Jan Peters taught the robot how to play table tennis some time back using so-called reinforcement learning. The learning algorithm performed well in simulation on the professor’s computer. But when Peters transferred it to the robot arm, it played as expected initially but then suddenly drew back powerfully to make a shot, with the result that the robot arm smashed into its joint limits and was badly damaged.
Vinogradska looks both at the robot and at her dissertation advisor. She has a lot of experience with learning algorithms that do not always acquire the behaviours they should – despite many advances in artificial intelligence (AI). One characteristic she has in common with her doctoral supervisor: “If something does not work, I get stubborn,” she said. “I simply have to make it work.” Peters, whose graduates are today pushing ahead with AI projects worldwide, has high hopes for the Ukrainian-born student. Justifiably so: In her highly regarded dissertation that Peters supervised, Vinogradska developed novel reinforcement learning methods with strong guarantees on performance as well as improved efficiency. Their joint research resulted in three patents.
Reinforcement learning algorithms do their name justice: They reinforce good behaviour. Most artificial intelligence systems employ just supervised learning, which only reproduces the examples shown by a teacher. In contrast, a reinforcement learning system learns from its own mistakes – based on punishment and reward. This is not unlike human practice in sports. For example, if a human tries to learn to hit a target using a bow and arrow, they are frustrated if the arrow misses the target and happy if it gets closer. Such feelings are deeply rooted in the brain’s chemistry, where neurotransmitters such as dopamine reinforce behaviour.
Robots use reinforcement learning in this way, for example, to learn to walk. If the robot falls over, the algorithm receives negative points as punishment. If the robot manages to move without falling over, it wins positive points. The quest for a high score pushes the robot gradually towards an optimal solution – provided it is given sufficient time. The famous artificial intelligence program AlphaGo used such an approach to learn the board game Go at world-class level, forcing the world’s best human Go player into early retirement.
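The scoring idea described above can be sketched in a few lines of Python. This is only a toy illustration, not the method used on any real robot: the three gaits, their fall probabilities and the distances covered are invented numbers, and the learner is a simple epsilon-greedy scheme that mostly repeats the gait with the best average score so far and occasionally tries another one.

```python
import random

# Hypothetical toy setup: each gait has a made-up probability of falling
# and a made-up distance covered when the robot stays upright.
FALL_PROBABILITY = {"cautious": 0.05, "normal": 0.2, "reckless": 0.7}
DISTANCE = {"cautious": 0.3, "normal": 1.0, "reckless": 1.5}


def attempt(gait, rng):
    """Score one attempt: -1 point for a fall, otherwise the distance covered."""
    if rng.random() < FALL_PROBABILITY[gait]:
        return -1.0
    return DISTANCE[gait]


def learn(episodes=5000, epsilon=0.1, seed=0):
    """Estimate the average score of each gait by trial and error."""
    rng = random.Random(seed)
    value = {gait: 0.0 for gait in FALL_PROBABILITY}
    count = {gait: 0 for gait in FALL_PROBABILITY}
    for _ in range(episodes):
        if rng.random() < epsilon:
            gait = rng.choice(list(value))    # explore: try a random gait
        else:
            gait = max(value, key=value.get)  # exploit: best average so far
        reward = attempt(gait, rng)
        count[gait] += 1
        # Running average of the scores observed for this gait.
        value[gait] += (reward - value[gait]) / count[gait]
    return value


values = learn()
best_gait = max(values, key=values.get)
print(best_gait, values)
```

With these invented numbers, the reckless gait falls so often that its average score turns negative, while the cautious gait is safe but slow. The learner needs thousands of attempts to find this out, which hints at why learning on real hardware is costly.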
Reinforcement learning can also be used for industrial applications. Here, however, a problem with such machine learning methods has become apparent: if the learning is unsuccessful, and the machine reacts as extremely as in the case of table tennis, people working in the vicinity risk serious injury. To prevent this behaviour from happening, such systems cannot learn only with data from simulations but require practical experience. This further disadvantage can make reinforcement learning applications expensive.
Learning process is conducted on a real test bed
At Bosch, tests are currently being conducted on the control of a so-called throttle valve. The valve regulates the input of the petrol/air mixture into a combustion engine. Optimisation can save a lot of energy – but if something goes wrong, the engine may be damaged. That is why the throttle valve is fitted with a sensor unit. The sensor data feeds into the learning process, and the AI system gradually figures out how to operate the valve more efficiently. In the process, however, millions of data points need to be generated through the interaction between the system and the sensor unit. “In contrast to a Go game that can be simulated, the learning process is conducted on a real test bed, and is, therefore, very expensive,” according to Vinogradska.
Her doctoral work and patents focus on data efficiency and the reliability of such AI methods. Her reinforcement learning algorithms are based on numerical quadrature, an approximation of integrals. Her approach ensures that the system learns with as few interactions as possible – while, nevertheless, remaining as reliable as possible.
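To give a rough idea of what numerical quadrature means, the following Python sketch approximates an integral by a weighted sum of a few function evaluations. It is not Vinogradska’s algorithm: it uses the standard three-point Gauss-Hermite rule, and the quadratic cost function and the numbers for the uncertain state are invented for illustration.

```python
import math

# Standard three-point Gauss-Hermite rule (physicists' convention):
# nodes and weights for integrals of the form  f(x) * exp(-x^2) dx.
NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
WEIGHTS = (
    math.sqrt(math.pi) / 6,
    2 * math.sqrt(math.pi) / 3,
    math.sqrt(math.pi) / 6,
)


def gaussian_expectation(f, mean, std):
    """Approximate E[f(X)] for X ~ Normal(mean, std^2) with three evaluations of f."""
    total = 0.0
    for x, w in zip(NODES, WEIGHTS):
        # Change of variables maps the rule onto the given Gaussian.
        total += w * f(mean + math.sqrt(2.0) * std * x)
    return total / math.sqrt(math.pi)


def cost(state):
    """Hypothetical cost: squared distance of the state from a target of 1.0."""
    return (state - 1.0) ** 2


# Expected cost when the state is only known up to Gaussian uncertainty.
expected_cost = gaussian_expectation(cost, mean=0.5, std=0.2)
print(expected_cost)
```

For a quadratic cost the three-point rule is exact up to rounding: the result matches the closed form (0.5 - 1.0)² + 0.2² = 0.29. A sampling-based estimate would need many random draws to reach comparable accuracy, which illustrates why a quadrature rule can be attractive when every evaluation is expensive.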
Assessing the stability and reliability of any working technical system is difficult, particularly if the solution has been obtained by machine learning. “Such complex systems have an infinite number of states, and it is impossible for us to test them all,” Vinogradska said. Uncertainties remain. Engineers take these into account manually – but typically only based on one specific state. This was too little for Vinogradska.
“What is special about Julia’s policy search approaches is that they allow accurately estimating the system uncertainty about its knowledge with regard to the physical world,” according to Peters. With this knowledge, the system no longer reacts extremely to changes in the input data. “All methods of machine learning have so far been unable to cope well with major jumps in the input data,” Julia Vinogradska pointed out. And Peters added: “It is very difficult to develop good algorithms for this problem, and Julia’s are outstanding – there is currently no method that is anywhere near as good.”
Resolute in a men’s domain
Julia Vinogradska has made AI methods safer – and more efficient. Her doctoral work resulted in three patents and numerous publications in distinguished journals such as IEEE Transactions on Pattern Analysis and Machine Intelligence. Moreover, she was awarded the Young Scientists Medal by the Werner von Siemens-Ring foundation. She is currently a research scientist at the Bosch Center for Artificial Intelligence (BCAI) in Renningen near Stuttgart, one of the few centres for basic research worldwide run by a company. BCAI was founded in early 2017 and now encompasses seven international centres with more than 180 AI experts, all working on making artificial intelligence more useful, more robust and more explainable.
Julia Vinogradska was born in Ukraine. She came to Germany with her parents at the age of nine and later studied mathematics with a minor in computer science at the University of Stuttgart. “My choice of minor subject was not exactly easy for me,” she explained. “I stood outside a lecture room full of exclusively male students, and that put me off – but my father was a software developer and encouraged me.” She actually ended up being the only woman in all the computer science lectures that she subsequently attended. “But I still enjoyed my studies tremendously,” she says with a smile.
During her undergraduate studies, she focused on algebra and theoretical computer science. She wanted her doctoral work to be more application-oriented. Thus, she applied for an industrial doctorate offered by Bosch in cooperation with TU Darmstadt. She started in a small research group that subsequently became the large research unit of the Bosch Center for Artificial Intelligence. Vinogradska’s research in reinforcement learning is now one of ten separate areas of research at the Bosch Center for Artificial Intelligence.
Her current aim is to encourage young women not to be deterred by the low share of women in the field. “My experience was thoroughly positive at all times. I have never had the feeling of being at a disadvantage in any way. I am pleased to say I was never unfairly promoted over others either. I would not have liked that.” Computer science is a multi-faceted subject with an immense range of possibilities – particularly at TU Darmstadt. Julia Vinogradska can recommend both wholeheartedly.