Learning Robust Representations for World Models without Reward Signals

Published in 18th European Workshop on Reinforcement Learning (Tübingen, Germany), 2025


Learning accurate and generalizable world models is a central challenge in model-based reinforcement learning (MBRL), particularly in reward-free settings where no task-specific supervision is available. In this paper, we investigate how different unsupervised objectives, including reconstruction, inverse dynamics, and contrastive learning, capture distinct components of the observation space, such as noise, background, controllable dynamics, and slow-changing factors. Building on this understanding, we introduce a hybrid representation learning approach that integrates the strengths of multiple objectives to better capture predictable and task-relevant structure. We design a controlled shape-based environment with disentangled latent factors to evaluate the robustness and utility of learned representations. Empirical results show that our method yields more informative and generalizable representations.

Recommended citation: Zhang, Zeqiang; Wurzberger, Fabian; Gottwald, Sebastian; Braun, Daniel. (2025). "Learning Robust Representations for World Models without Reward Signals." 18th European Workshop on Reinforcement Learning. Tübingen, Germany.
Download Paper


Zeqiang Zhang
