In the context of model-based reinforcement learning and control, a large number of methods for learning system dynamics have been proposed in recent years. The purpose of these learned models is to synthesize new control policies. An important open question is how robust current dynamics-learning methods are to shifts in the data distribution caused by changes in the control policy. We present a real-robot dataset that allows this question to be investigated systematically. The dataset contains trajectories of a 3-degree-of-freedom (DOF) robot controlled by a diverse set of policies. For comparison, we also provide a simulated version of the dataset. Finally, we benchmark a few widely used dynamics-learning methods on the proposed dataset. Our results show that the independent and identically distributed (iid) test error of a learned model is not necessarily a good indicator of its accuracy under control policies different from the one that generated the training data. This suggests that it may be important to evaluate dynamics-learning methods in terms of their transfer performance, rather than only their iid error.
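The distinction between iid test error and transfer error can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the dataset or the benchmark: the scalar toy dynamics in `step`, the two hypothetical policies `policy_a` and `policy_b`, and the linear least-squares model. A one-step model is fitted on transitions generated by one policy, then evaluated on held-out transitions from the same policy (iid) and on transitions from a different policy (transfer).

```python
import numpy as np

DT = 0.05  # integration step of the toy system (assumed value)

def step(x, u):
    # Hypothetical scalar dynamics, nonlinear in the state; not the robot.
    return x + DT * (-np.sin(x) + u)

def policy_a(x, rng):
    # Training policy: regulates the state to 0 with exploration noise.
    return -0.5 * x + 0.2 * rng.standard_normal()

def policy_b(x, rng):
    # Different policy: drives the state toward 2.0, far from the
    # region visited under policy A.
    return 3.0 * (2.0 - x) + 0.2 * rng.standard_normal()

def collect(policy, n_steps, rng):
    # Roll out a policy; return features [x, u, 1] and next-state targets.
    feats, targets = [], []
    x = 0.0
    for _ in range(n_steps):
        u = policy(x, rng)
        x_next = step(x, u)
        feats.append([x, u, 1.0])
        targets.append(x_next)
        x = x_next
    return np.array(feats), np.array(targets)

def mse(w, feats, targets):
    # Mean squared one-step prediction error of the linear model w.
    return float(np.mean((feats @ w - targets) ** 2))

rng = np.random.default_rng(0)
feats_a, targets_a = collect(policy_a, 2000, rng)
feats_b, targets_b = collect(policy_b, 2000, rng)

# Random split of policy-A transitions into train and iid test sets.
perm = rng.permutation(len(targets_a))
train, test = perm[:1600], perm[1600:]

# Linear one-step model fitted by least squares on policy-A data only.
w, *_ = np.linalg.lstsq(feats_a[train], targets_a[train], rcond=None)

iid_mse = mse(w, feats_a[test], targets_a[test])
transfer_mse = mse(w, feats_b, targets_b)
print(f"iid MSE: {iid_mse:.3e}, transfer MSE: {transfer_mse:.3e}")
```

Under these assumptions the transfer error comes out larger than the iid error, because the linear model is accurate only near the states visited during training. The toy setup says nothing about how large the gap is for the methods or the robot considered here.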