3D PerceptionLab has released RobotriX: An eXtremely Photorealistic and Very-Large-Scale Indoor Dataset of Sequences with Robot Trajectories and Interactions. The new dataset is designed to enable the application of deep learning techniques to a wide variety of robotic vision problems.
The dataset consists of photorealistic scenes and robots rendered by Unreal Engine into a virtual reality headset, which allows a human operator to move the robot and manipulate the robotic hands with controllers. As the operator moves, scene information is dumped on a per-frame basis so that it can be replayed offline to generate raw data and ground truth labels. For each frame, RGB-D and 3D information is provided with full annotations in both spaces.
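To make the "annotations in both spaces" idea concrete, here is a minimal sketch of how RGB-D frames can be lifted into 3D: a depth map is back-projected into a point cloud with a pinhole camera model. The file name and intrinsics below are placeholders for illustration, not the actual RobotriX layout or calibration.

```python
# Minimal sketch: back-projecting a depth map into a 3D point cloud with a
# pinhole camera model. File name and intrinsics are placeholders, not the
# actual RobotriX data layout or camera parameters.
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Convert an HxW depth map (metres) into an Nx3 array of 3D points."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # drop pixels with no depth reading

# Hypothetical usage with made-up intrinsics for a 1920x1080 frame.
depth = np.load("frame_000042_depth.npy")  # placeholder file name
cloud = depth_to_point_cloud(depth, fx=960.0, fy=960.0, cx=960.0, cy=540.0)
print(cloud.shape)
```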
According to its authors, RobotriX is the largest and most realistic synthetic dataset of its kind to date. Its hyperrealistic indoor scenes are explored by robot agents that also interact with objects in a visually realistic manner within the simulated world.
RobotriX contains 512 sequences recorded in 16 room layouts at over 60 frames per second, each lasting from one to five minutes, for a total of about 8 million individual frames. Each frame provides the 3D poses of cameras, objects, and joints, along with an RGB image, depth maps, and 2D object masks.
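As a rough illustration of how that ground truth might be grouped per frame, the sketch below defines one possible record type and checks that the stated sequence counts and frame rate are consistent with the roughly 8 million frames. Field names and types are assumptions for illustration, not the official RobotriX format or API.

```python
# Illustrative sketch only: one way to group the per-frame ground truth listed
# above (camera/object/joint poses, RGB image, depth map, 2D object masks).
# Field names and types are assumptions, not the official RobotriX format.
from dataclasses import dataclass
import numpy as np

@dataclass
class FrameRecord:
    rgb: np.ndarray             # HxWx3 colour image
    depth: np.ndarray           # HxW depth map
    instance_masks: np.ndarray  # HxW integer mask of 2D object instances
    camera_pose: np.ndarray     # 4x4 camera-to-world transform
    object_poses: dict          # object id -> 4x4 pose
    joint_poses: dict           # joint name -> 4x4 pose

def frames_per_sequence(fps: int = 60, minutes: float = 5.0) -> int:
    """Rough upper bound on frames in one sequence at the stated frame rate."""
    return int(fps * minutes * 60)

# 512 sequences of one to five minutes at 60 fps lands in the millions of
# frames, consistent with the roughly 8 million frames cited above.
print(512 * frames_per_sequence(minutes=5.0))
```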
The dataset is available here, and the full paper describing the RobotriX project and the dataset is published on arXiv.
As future work, the authors mention adding non-rigid objects that can be simulated with Unreal Engine 4 physics, such as elastic bodies, fluids, or cloth, for the robots to interact with. They also plan to automatically generate semantic descriptions for each frame to provide ground truth for captioning and question answering.