NVIDIA has demonstrated that large language models can speed up robot training. Quadruped robots trained with the DrEureka method outperform standard human-designed learning systems by 34% in real-world movement speed and 20% in distance covered.
Robots are typically trained in a virtual environment and then tested in real-world settings. Transferring skills from simulation to reality is one of the most laborious steps, because unforeseen “perturbations” in the real world, such as a tilted surface or a change in friction coefficient, can affect robot behavior unpredictably. This transfer usually requires extensive manual tuning of reward functions and simulation parameters.
The study introduces DrEureka (Domain Randomization Eureka), a technique that automatically generates reward functions and randomizes virtual environments by injecting random perturbations. DrEureka requires only a high-level task description, and the strategies it learns transfer from simulation to the real world faster and more efficiently than those trained with manually developed reward functions.
DrEureka builds on Eureka, a technique introduced in October 2023. Eureka uses a description of the robotic task and a language model to write code for a reward function that measures success at that task. These reward functions are then run in simulation, and the results are fed back to the language model, which analyzes the outcome and improves the reward function.
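The generate, simulate, and feed-back loop described above can be illustrated with a toy sketch. Nothing here is Eureka's actual code: `propose_candidates` is a hypothetical stand-in for the language model (it perturbs a single penalty weight rather than writing reward code), and `simulate` is a made-up one-parameter "simulator" rather than a physics engine. Only the shape of the loop is meant to carry over.

```python
import random


def propose_candidates(task, best_weight, rng, n=4):
    """Hypothetical stand-in for the language model.

    A real system would prompt an LLM with the task description plus
    feedback and get back reward-function source code. Here each
    "candidate" is just an energy-penalty weight near the best one so far.
    """
    center = 1.0 if best_weight is None else best_weight
    return [max(0.0, center + rng.uniform(-0.5, 0.5)) for _ in range(n)]


def make_reward(weight):
    """Build a reward function: go fast, but pay for energy use."""
    def reward(forward_velocity, energy_used):
        return forward_velocity - weight * energy_used
    return reward


def simulate(reward_fn):
    """Toy stand-in for training a policy in simulation.

    The "policy" is a single gait speed chosen to maximize the reward.
    The returned score is the true task metric: speed achieved, or 0.0
    if the robot goes faster than 2.0 and falls over (invented rule).
    """
    speeds = [i / 10 for i in range(31)]  # 0.0 .. 3.0
    best_speed = max(speeds, key=lambda s: reward_fn(s, 0.5 * s * s))
    return best_speed if best_speed <= 2.0 else 0.0


def eureka_style_search(task, iterations=5, seed=0):
    """Propose rewards, score them in simulation, feed the best one back."""
    rng = random.Random(seed)
    best_weight, best_score = None, float("-inf")
    for _ in range(iterations):
        for weight in propose_candidates(task, best_weight, rng):
            score = simulate(make_reward(weight))
            if score > best_score:
                best_weight, best_score = weight, score
    return best_weight, best_score


if __name__ == "__main__":
    weight, score = eureka_style_search("walk forward as fast as possible")
    print(f"best penalty weight: {weight:.2f}, task score: {score:.1f}")
```

The point of the sketch is that the reward function is scored by an external task metric, and the best result steers the next round of proposals, so a poorly shaped reward (here, one that makes the robot overspeed and fall) is filtered out automatically.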
To account for potential environmental perturbations, DrEureka automatically tunes the domain randomization parameters. It uses a multi-stage process that allows reward functions and domain randomization parameters to be optimized together.
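As a rough illustration of what domain randomization parameters are, the sketch below draws a fresh set of physics values for each training episode. The parameter names follow the article's examples of surface tilt and friction, but the class, the function, and the numeric ranges are hypothetical. In DrEureka these ranges are the quantities tuned automatically, whereas here they are fixed by hand.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicsParams:
    friction: float         # ground friction coefficient
    ground_tilt_deg: float  # surface tilt relative to the horizon


# Hypothetical ranges, chosen only for illustration.
RANDOMIZATION_RANGES = {
    "friction": (0.3, 1.2),
    "ground_tilt_deg": (-5.0, 5.0),
}


def sample_physics(rng, ranges=RANDOMIZATION_RANGES):
    """Draw one set of simulator parameters uniformly from the ranges."""
    return PhysicsParams(
        **{name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}
    )


if __name__ == "__main__":
    rng = random.Random(0)
    for episode in range(3):
        # Each episode sees slightly different physics.
        print(episode, sample_physics(rng))
```

Training across many such samples pushes the policy to cope with a spread of conditions instead of overfitting to one idealized simulator. Ranges that are too narrow leave the robot unprepared, while ranges that are too wide can make the task hard to learn.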
The researchers evaluated DrEureka on quadruped robots, although the approach is general and applies to a variety of robots and tasks. Robots trained with DrEureka outperformed standard human-designed learning systems by 34% in movement speed and 20% in distance covered under real-world conditions. The researchers also tested DrEureka on dexterous manipulation with robotic hands. Over a fixed period, the best-performing robot trained with DrEureka executed 300% more cube rotations than a robot trained with a human-designed setup.