FAIR presented Droidlet, an open-source platform that aims to simplify the integration of a wide range of modern machine learning algorithms into robotics. Droidlet lets researchers test various computer vision algorithms on robots or swap one natural language processing model for another.
Droidlet is a modular, heterogeneous architecture for intelligent agents and a platform for building them, sitting at the intersection of natural language processing, computer vision, and robotics. It allows researchers to create agents that perform complex tasks both in the real world and in simulated environments such as Minecraft or Habitat. Agents can be trained on static or dynamic data as needed.
The high-level Droidlet agent consists of four interfaces:
- a memory system that acts as an information hub for the remaining agent modules;
- a set of perception modules (for example, object detection or pose estimation) that process information from the outside world and store it in memory;
- a set of low-level tasks, such as “move three meters forward” or “place the object in your hands at the specified coordinates”;
- a controller that, depending on the state of the memory system, decides which tasks to perform.
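The interplay of these four components can be sketched as a simple agent loop. The class and method names below are illustrative assumptions for this sketch, not the actual Droidlet API: perception modules write to a shared memory, and a controller reads that memory to pick the next task.

```python
# Hypothetical sketch of a Droidlet-style four-part agent loop.
# All names (Memory, PerceptionModule, Controller, Task, Agent) are
# illustrative assumptions, not the real Droidlet classes.

class Memory:
    """Information hub shared by all of the agent's modules."""
    def __init__(self):
        self.facts = []

    def store(self, fact):
        self.facts.append(fact)


class PerceptionModule:
    """Processes raw observations (e.g. object detection) into memory facts."""
    def perceive(self, world):
        return {"objects": world.get("objects", [])}


class Task:
    """A low-level task such as 'move three meters forward'."""
    def __init__(self, name):
        self.name = name

    def step(self):
        return f"executing {self.name}"


class Controller:
    """Decides which task to run based on the current memory state."""
    def decide(self, memory):
        if memory.facts and memory.facts[-1]["objects"]:
            return Task("move_three_meters_forward")
        return Task("idle")


class Agent:
    """Wires perception, memory, and control into one loop step."""
    def __init__(self, perception_modules):
        self.memory = Memory()
        self.perception = perception_modules
        self.controller = Controller()

    def step(self, world):
        # 1. Perception modules process the world and write to memory.
        for module in self.perception:
            self.memory.store(module.perceive(world))
        # 2. The controller inspects memory and chooses a task.
        task = self.controller.decide(self.memory)
        # 3. The chosen low-level task is executed.
        return task.step()


agent = Agent([PerceptionModule()])
print(agent.step({"objects": ["cup"]}))  # executing move_three_meters_forward
```

Because the perception modules are passed in as a list, the same `Agent` can be assembled with a different set of modules for a different robot, which is the modularity the platform is built around.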
Droidlet also allows researchers to reuse the same intelligent agent across different robots, swapping tasks and perception modules to match each robot's architecture and the presence or absence of particular sensors.
Droidlet is complemented by an interactive dashboard that researchers can use as a working interface when building agents. It includes debugging and visualization tools, as well as an interface for correcting agent errors in real time. The dashboard also lets researchers add new widgets and tools.