Reinforcement Learning, Control, and Optimization

What Motivates Us

Most machine learning, including the majority of deep learning and probabilistic modeling systems, is concerned with making predictions. But in many settings, we want to do more than predict: we want to actually control a car or a manufacturing process, rather than merely predict how it will behave. These decision-making tasks are the domain of reinforcement learning, optimization, and learning-based control.

Our Approach

We are developing new methods in reinforcement learning and optimization that focus on data efficiency. That means techniques that learn to solve a decision-making problem from data while requiring as few runs of the real system as possible. For non-sequential tasks, this is accomplished by developing new techniques in Bayesian optimization. For sequential tasks, we are developing new batch reinforcement learning methods, which learn by “reusing” the data from older runs of the system. In robotic applications, this often involves integrating vision and manipulation using techniques that blend deep learning for perception with traditional control.
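As a rough illustration of the “reuse” idea in batch reinforcement learning, the sketch below logs transitions once from a random behaviour policy and then replays that fixed batch many times with tabular Q-learning. The chain environment, the function names, and all constants are hypothetical and chosen only for illustration; they are not the methods described on this page.

```python
import random

# Hypothetical 5-state chain: action 1 moves right, action 0 moves left.
# Reaching the last state yields reward 1 and ends the episode.
N_STATES, N_ACTIONS, GAMMA = 5, 2, 0.9


def step(state, action):
    """Toy stand-in for the real system (an assumption for illustration)."""
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def collect_batch(n_episodes, rng):
    """Log transitions from a random behaviour policy, once, up front."""
    batch = []
    for _ in range(n_episodes):
        state = 0
        for _ in range(20):
            action = rng.randrange(N_ACTIONS)
            nxt, reward, done = step(state, action)
            batch.append((state, action, reward, nxt, done))
            if done:
                break
            state = nxt
    return batch


def batch_q_learning(batch, sweeps=200, lr=0.5):
    """Replay the fixed batch repeatedly; no new system runs are needed."""
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(sweeps):
        for state, action, reward, nxt, done in batch:
            target = reward if done else reward + GAMMA * max(q[nxt])
            q[state][action] += lr * (target - q[state][action])
    return q


rng = random.Random(0)
batch = collect_batch(n_episodes=50, rng=rng)
q_table = batch_q_learning(batch)

# Greedy action in every non-terminal state, derived purely from logged data.
greedy_policy = [
    max(range(N_ACTIONS), key=lambda a: q_table[s][a])
    for s in range(N_STATES - 1)
]
```

The system is only queried inside `collect_batch`; all learning afterwards works on the stored transitions, which is what keeps the number of real runs low.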

Application

These methods are relevant to any setting where decisions need to be made based upon data, or when a system needs to be controlled based upon closed-loop feedback, including many robotics domains.

Reinforcement Learning & Optimization

While large parts of machine learning focus on understanding, this is often only the first step toward a successful system. At Bosch, we are building many systems that not only require perception but also need to be controlled in an optimal manner.

Reinforcement learning uses machine learning to learn optimal control schemes from data. The resulting solutions offer automated and fast adaptation to changing conditions, automated system application, and high-quality control even with limited compute resources.

This technology covers a wide array of use cases, including the daily optimization of parameters for manufacturing machines and the tuning of complex systems such as anti-lock braking systems.

Use Case

AI for Controller Learning

Figure 1

Introduction

Imagine you could automate the tuning process for all your controllers! Calibration and tuning of controllers is a recurring and costly process. Reinforcement learning (RL) has the potential to fully automate such processes, as has been impressively demonstrated in academic research on several highly complex problems. However, our real-world applications put additional requirements on RL, e.g., safety and data efficiency, that have hampered its application to real systems to date. For many complex dynamical systems, Bosch has large amounts of prior knowledge, such as approximate models. We are working on solutions to combine this expert knowledge with RL. In particular, we develop methods to reliably learn control of dynamical systems with very low manual effort, few system interactions, and guaranteed safety.

Our Research

Maximizing Efficiency

We aim to make the best use of existing expert knowledge. By combining domain knowledge such as approximate models with elaborate model error correction techniques, we can learn to control a dynamical system with only very few interactions. This way, we can typically outperform hand-tuned controllers at a fraction of the involved effort.
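One simple way to picture the combination of an approximate model with a learned error correction is to fit only the residual between the model and a handful of observed transitions. The sketch below assumes a toy scalar system, a nominal model with a wrong input gain, and a residual that is linear in the input; all of these are illustrative assumptions, not the error correction techniques referred to above.

```python
def true_system(x, u):
    """Stand-in for the real plant; unknown to the learner (toy assumption)."""
    return 0.9 * x + 0.5 * u + 0.05


def nominal_model(x, u):
    """Approximate expert model with a wrong input gain and no offset."""
    return 0.9 * x + 0.4 * u


def fit_residual(data):
    """Least-squares fit of residual ~ a * u + b from a few transitions."""
    us = [u for _, u, _ in data]
    rs = [x_next - nominal_model(x, u) for x, u, x_next in data]
    n = len(data)
    u_mean, r_mean = sum(us) / n, sum(rs) / n
    cov = sum((u - u_mean) * (r - r_mean) for u, r in zip(us, rs))
    var = sum((u - u_mean) ** 2 for u in us)
    a = cov / var
    b = r_mean - a * u_mean
    return a, b


# Only four interactions with the "real" system.
inputs = [-1.0, -0.5, 0.5, 1.0]
data, x = [], 0.0
for u in inputs:
    x_next = true_system(x, u)
    data.append((x, u, x_next))
    x = x_next

a, b = fit_residual(data)


def corrected_model(x, u):
    """Nominal prediction plus the learned residual correction."""
    return nominal_model(x, u) + a * u + b


# Compare one-step prediction errors on an unseen state-input pair.
x_test, u_test = 0.3, 0.8
nominal_error = abs(true_system(x_test, u_test) - nominal_model(x_test, u_test))
corrected_error = abs(true_system(x_test, u_test) - corrected_model(x_test, u_test))
```

Because the nominal model already captures most of the dynamics, only two residual parameters have to be estimated, which is why so few interactions suffice in this toy setting.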

Safety First

When learning control, safety constraints are a major real-world challenge. Because learning about the dynamics relies on system interactions, we work on efficient and scalable methods for selecting actions during learning that are both safe and informative.
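A common way to trade off safety and information gain, sketched here under simplifying assumptions, is to keep only actions whose pessimistic prediction still satisfies the constraint and then pick the most uncertain of them. The function name, the `beta` confidence multiplier, and the candidate numbers below are hypothetical; in practice the means and standard deviations would come from a learned probabilistic model.

```python
def select_safe_informative(candidates, limit, beta=2.0):
    """Pick the most uncertain action whose pessimistic bound is still safe.

    candidates: list of (action, predicted_mean, predicted_std) for a
    safety-critical quantity (e.g. a temperature) that must stay <= limit.
    Returns None if no candidate can be certified as safe.
    """
    safe = [(a, m, s) for a, m, s in candidates if m + beta * s <= limit]
    if not safe:
        return None
    # Among safe actions, the largest predictive std is the most informative.
    return max(safe, key=lambda c: c[2])[0]


# Hypothetical model predictions for four candidate actions.
candidates = [
    (0.0, 20.0, 0.5),  # upper bound 21.0: safe, but little to learn
    (0.5, 35.0, 2.0),  # upper bound 39.0: safe and more uncertain
    (1.0, 50.0, 6.0),  # upper bound 62.0: exceeds the limit
    (1.5, 58.0, 8.0),  # upper bound 74.0: exceeds the limit
]
chosen = select_safe_informative(candidates, limit=60.0)
```

With these numbers the rule selects action 0.5: it is the most uncertain candidate whose upper confidence bound stays below the limit of 60.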

Understanding is Key

While modern deep RL methods are sometimes described as notoriously brittle, we believe there’s more to it. We work on how to properly represent uncertainty during learning and on systematically understanding why different methods perform the way they do.

Complex Real-World Problems

We apply our research to various complex real-world systems at Bosch. There’s nothing quite like the magic of seeing your research efforts come to life on a real system, potentially improving the lives of thousands of people.

Figure 2

References

Curi, S., Berkenkamp, F., & Krause, A. (2020). Efficient Model-Based Reinforcement Learning through Optimistic Policy Search and Planning. NeurIPS. [PDF]

Dörr, A., Volpp, M., Toussaint, M., Trimpe, S., & Daniel, C. (2019). Trajectory-Based Off-Policy Deep Reinforcement Learning. ICML. [PDF]

Fröhlich, L.P., Lefarov, M., Zeilinger, M.N., & Berkenkamp, F. (2022). On-Policy Model Errors in Reinforcement Learning. ICLR. [PDF]

Vinogradska, J., Bischoff, B., Nguyen-Tuong, D., Schmidt, H., Romer, A., & Peters, J. (2017). Stability of Controllers for Gaussian Process Dynamics. Journal of Machine Learning Research Vol. 18. [PDF]

Perception and Manipulation for Robotics

AI technologies are currently changing the field of robotics fundamentally. AI enables a new generation of robots that can see, understand their environment, act skillfully, and be easily taught by humans. Such robotic solutions will significantly expand the possibilities for automation in logistics and manufacturing.

We provide AI solutions for different aspects of robotic automation. This includes 3D perception for detecting and predicting poses of a huge variety of industrial or everyday objects, as well as solutions for grasping and sorting them.

Use Case

Smart Item Picking

Introduction

We are working together with Rexroth on Smart Item Picking, a robotic system to support labor-intensive picking tasks in intralogistics. BCAI provides the latest AI technologies to make such robots smart and adaptive to ever-changing conditions. Our contributions include technologies such as detection of unknown objects and prediction of grasp poses. We are also working towards a continuously learning system that improves with experience and learns autonomously how best to grasp or manipulate objects.

Our Research

Detection & Segmentation of Unknown Objects

We explore novel methods for understanding a scene from a 3D vision sensor. The major research challenge is to come up with methods that work for more than 10,000 different objects without prior information.

Grasping and Manipulation Solutions

We develop easy-to-use methods for teaching robots where to grasp objects or how best to manipulate them before grasping. Our methods are built to make use of the robot’s experience and to improve continuously. Our advanced grasping solutions significantly improve the reliability of the Smart Item Picking system in long-term operation.

Highest Efficiency through Self-Supervised Training

We strive to overcome the pain of human data-labelling for using machine learning methods in robotic applications. We develop novel approaches to make robots learn in a self-supervised manner, requiring only minimal human intervention.

References

Adrian, D., Kupcsik, A., Spies, M., & Neumann, H. (2022). Efficient and Robust Training of Dense Object Nets for Multi-Object Robot Manipulation. ICRA.

Beik-Mohammadi, H., Hauberg, S., Arvanitidis, G., Neumann, G., & Rozo, L. (2021). Learning Riemannian Manifolds for Geodesic Motion Skills. RSS. [PDF]

Feldman, Z., Ziesche, H., Ngo, V.A., & Di Castro, D. (2022). A Hybrid Approach for Learning to Shift and Grasp with Elaborate Motion Primitives. ICRA. [PDF]

Gao, N. (2022). What Matters for Meta-Learning Vision Regression Tasks. CVPR. [PDF]

Guo, M., & Bürger, M. (2022). Interactive Human-in-the-loop Coordination of Manipulation Skills Learned from Demonstration. ICRA. [PDF]

Guo, M., & Bürger, M. (2021). Geometric Task Networks: Learning efficient and explainable skill coordination for object manipulation. IEEE TRO. [PDF]

Hoppe, S., Giftthaler, M., Krug, R., & Toussaint, M. (2020). Sample-Efficient Learning for Industrial Assembly using Qgraph-bounded DDPG. IROS. [PDF]

Jaquier, N., Rozo, L., Caldwell, D.G., & Calinon, S. (2020). Geometry-aware manipulability learning, tracking, and transfer. IJRR. [PDF]

Jaquier, N., & Rozo, L. (2020). High-Dimensional Bayesian Optimization via Nested Riemannian Manifolds. NeurIPS. [PDF]

Kupcsik, A., Spies, M., Klein, A., Todescato, M., Waniek, N., Schillinger, P., & Bürger, M. (2021). Supervised Training of Dense Object Nets using Optimal Descriptors for Industrial Robotic Applications. AAAI. [PDF]

Le, A.T., Guo, M., van Duijkeren, N., Rozo, L., Krug, R., Kupcsik, A., & Bürger, M. (2021). Learning forceful manipulation skills from multi-modal human demonstrations. IROS. [PDF]

Otto, F., Becker, P., Anh Vien, N., Ziesche, H. C., & Neumann, G. (2021). Differentiable Trust Region Layers for Deep Reinforcement Learning. ICLR. [PDF]

Rozo, L., & Dave, V. (2021). Orientation Probabilistic Movement Primitives on Riemannian Manifolds. CoRL. [PDF]

Rozo, L., Guo, M., Kupcsik, A., Todescato, M., Schillinger, P., Giftthaler, M., Ochs, M., Spies, M., Waniek, N., Kesper, P., & Bürger, M. (2020). Learning and sequencing of object-centric manipulation skills for industrial tasks. IROS. [PDF]

Shaj, V., & van Duijkeren, N. (2020). Action-Conditional Recurrent Kalman Networks For Forward and Inverse Dynamics Learning. CoRL. [PDF]

