Fine-Tuning Offline Reinforcement Learning with Model-Based Policy Optimization

Abstract: In offline reinforcement learning (RL), we attempt to learn a control policy from a fixed dataset of environment interactions. This setting has the potential benefit of allowing us to learn effective policies without needing to collect additional interactive data, which can be expensive or dangerous in real-world systems. However, traditional off-policy RL methods tend [...]

MSR Thesis Talk: Zhipeng Bao

Title: Introducing Generative Models to Facilitate Multi-Task Visual Learning

Abstract: Motivated by multi-task learning of shared feature representations, this talk considers a novel problem of learning a shared generative model that can facilitate multi-task learning. We present two systems to utilize generative modeling for other visual tasks. The first system focuses on learning a generative [...]