1 FLAIR, University of Oxford 2 WhiRL, University of Oxford 3 DARK, University College London
Existing benchmarks for open-ended learning are either too slow or too simple. Craftax is both fast and complicated. We hope that this will allow researchers without access to industrial compute to investigate learning in an open-ended environment with an ease that was not previously possible.
Progress in reinforcement learning (RL) algorithms is driven in large part by the development and adoption of suitable benchmarks. In the effort towards increasingly general agents, there has arisen a community focused on benchmarks that exhibit more open-ended dynamics.
We present Craftax, a JAX-based benchmark that combines elements from Crafter and NetHack, while running orders of magnitude faster. We also present Craftax-Classic, a reimplementation of Crafter in JAX that is significantly simpler than the full Craftax environment, but will provide a starting point for those familiar with Crafter.
Craftax is significantly faster than comparable open-ended environments.
Craftax-Classic and Craftax run 257x and 169x faster than Crafter respectively when running the PureJaxRL
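As a rough illustration of why a JAX-based environment can reach this kind of throughput, the sketch below steps many environment copies in parallel inside one compiled function, using `jax.vmap` and `jax.lax.scan`. The environment here (`toy_step`, a clipped 1-D walk) is a hypothetical stand-in and not Craftax's actual API or dynamics; only the batching pattern is the point.

```python
import jax
import jax.numpy as jnp


def toy_step(state, action):
    """Hypothetical single-environment step: a 1-D walk clipped to [0, 10]."""
    new_state = jnp.clip(state + action, 0, 10)
    # Reward 1.0 on every step that ends at the right-hand boundary.
    reward = (new_state == 10).astype(jnp.float32)
    return new_state, reward


@jax.jit
def rollout(init_states, actions):
    """Advance a batch of environments through every timestep in `actions`.

    init_states: int32 array of shape (num_envs,)
    actions:     int32 array of shape (num_steps, num_envs)
    Returns the final states and the per-environment summed reward.
    """
    batched_step = jax.vmap(toy_step)  # one call advances all environments

    def body(states, actions_t):
        states, rewards = batched_step(states, actions_t)
        return states, rewards

    final_states, rewards = jax.lax.scan(body, init_states, actions)
    return final_states, rewards.sum(axis=0)


num_envs, num_steps = 1024, 64
key = jax.random.PRNGKey(0)
# Random actions drawn from {-1, 0, 1} for every step and environment.
actions = jax.random.randint(key, (num_steps, num_envs), -1, 2)
init_states = jnp.zeros(num_envs, dtype=jnp.int32)
final_states, returns = rollout(init_states, actions)
```

Because the whole rollout is one jitted function, the environment loop can run on the accelerator alongside the agent rather than being driven step by step from Python.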
Craftax contains 65 achievements split into 4 difficulties. Achievements with higher difficulty give more reward. Current RL methods failed to make significant progress when given a budget of 1 billion environment steps, with no instances of the two hardest classes of achievement being reached. For perspective, it took one of the authors (with extensive knowledge of the game mechanics) roughly 5 hours of gameplay to first achieve a 'perfect' run where every achievement was completed.
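A difficulty-weighted achievement reward of this kind could be computed roughly as follows. This is a minimal sketch under several assumptions: the tier assignment (`tier_of`), the per-tier reward values (`tier_reward`) and the rule that an achievement pays out only when it is newly unlocked are all placeholders chosen for illustration, not the values or rules used by Craftax.

```python
import jax.numpy as jnp

NUM_ACHIEVEMENTS = 65  # from the text
NUM_TIERS = 4          # from the text

# Hypothetical: spread achievements over the tiers in round-robin order.
tier_of = jnp.arange(NUM_ACHIEVEMENTS) % NUM_TIERS
# Hypothetical per-tier rewards; harder tiers pay more.
tier_reward = jnp.array([1.0, 2.0, 3.0, 4.0])


def achievement_reward(unlocked_before, unlocked_after):
    """Sum the tier reward over achievements unlocked during this step.

    Both arguments are boolean arrays of shape (NUM_ACHIEVEMENTS,).
    Assumes each achievement is rewarded only when it is newly unlocked.
    """
    newly_unlocked = jnp.logical_and(unlocked_after, jnp.logical_not(unlocked_before))
    return jnp.sum(jnp.where(newly_unlocked, tier_reward[tier_of], 0.0))


before = jnp.zeros(NUM_ACHIEVEMENTS, dtype=bool)
# Unlock achievement 0 (tier 0) and achievement 3 (tier 3) in the same step.
after = before.at[0].set(True).at[3].set(True)
step_reward = achievement_reward(before, after)
repeat_reward = achievement_reward(after, after)
```

With these placeholder values the first step yields the sum of one easiest-tier and one hardest-tier reward, and repeating an already unlocked achievement yields nothing.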
Craftax requires a diverse range of skills and tasks.
This makes it an excellent testbed for research on exploration, continual learning and unsupervised skill discovery.
Craftax can also be used for unsupervised environment design
Or through evolving levels with ACCEL