The goal of the project is to develop methods that make artificial neural networks in computer vision, particularly so-called deep networks, more robust and more explainable. One particular aim is to increase user trust in machine learning approaches to computer vision, for example in the context of autonomous vehicles.
Robust image understanding for autonomous driving
Autonomous vehicles need to assess and understand their surroundings based on camera and sensor data within fractions of a second in order to be able to react correctly. In this context, not only is the tolerance for errors very small, but computer vision methods also need to yield reliable results even in adverse viewing conditions such as bad weather. To reach such levels of robustness, enormous amounts of training data are currently needed, all of which have to be painstakingly annotated by hand. Furthermore, autonomous vehicles will also have to be able to cope with rare situations, which may not have been foreseen during their development.
“Current machine learning approaches to computer vision are optimized to yield fast and accurate results in relatively constrained settings. In practice, it is often necessary to obtain reliable results even when the approaches are taken outside of the originally envisioned settings. Moreover, they need to be applicable when only small amounts of training data are available,” explains Stefan Roth, Professor of Computer Science and head of the Visual Inference Lab at TU Darmstadt. In addition, current deep learning approaches rarely quantify how reliable their predictions are. Yet this is an important prerequisite for gaining the trust of future users.
Using artificial neural networks
With his research in the RED project, Stefan Roth aims to significantly improve the use of artificial neural networks in computer vision. He and his team want to increase the robustness of these methods, thereby broadening their applicability. At the same time, they will investigate which parts of a deployed network play which role in producing its final output. The goal is to improve the understanding of such deep networks and to enable reliable quantification of the uncertainty of their predictions.
The project’s work program is based on a number of concrete problems from various areas of computer vision, with a focus on 3D scene analysis from images and videos, including tasks such as semantic segmentation, 3D reconstruction, and motion estimation.
Perspectives for machine learning and AI
The project ultimately aims to create a toolbox with architectures, algorithms, and best practices for deep neural networks that enable their use in computer vision applications in which robustness is key, data is limited, and user trust is paramount.
“We research foundational aspects of neural architectures in computer vision. It is quite possible that our results can be transferred to other application areas of machine learning and AI,” says Roth.