
Meta's OK-Robot can tidy up a room without any help

OK-Robot's AI system only manages to pick up 58.5 % of objects in particularly untidy homes (symbolic image: DALL-E / AI)
OK-Robot, a new AI system, enables robots to tidy up homes that are new to them. The system recognises objects and puts them in the right place. OK-Robot still struggles in very cluttered rooms, but Meta's project is much more successful in rooms that are less messy.

The new OK-Robot AI system is designed to enable a wide variety of robots to tidy up rooms that are completely new to them. For example, they can pick up laundry or toys from the floor and place them elsewhere. Other robotic systems are usually designed to operate in a familiar environment.

OK-Robot is built on VLMs (Vision-Language Models), a class of AI models that can process and relate text (or transcribed speech) and images at the same time. Notably, OK-Robot relies entirely on open-source AI models that were pre-trained on large, publicly available data sets.
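The core idea behind such VLMs can be illustrated with a toy sketch: CLIP-style models embed a text query and candidate image regions into a shared vector space and rank the regions by cosine similarity. The embedding vectors below are hypothetical stand-ins for real encoder outputs, chosen only to show the ranking step.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings standing in for a VLM's text/image encoder outputs.
text_query = np.array([0.9, 0.1, 0.2])  # e.g. the instruction "pick up the shoe"
image_regions = {
    "shoe": np.array([0.8, 0.2, 0.1]),
    "lamp": np.array([0.1, 0.9, 0.3]),
    "sofa": np.array([0.2, 0.3, 0.9]),
}

# A VLM-based system ranks candidate regions by similarity to the query
# and acts on the best match.
best = max(image_regions, key=lambda k: cosine_similarity(text_query, image_regions[k]))
print(best)  # → shoe
```

In a real system the vectors would come from the pre-trained encoders, but the matching logic, comparing one text embedding against many image embeddings, is essentially this simple.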

On the positive side, you don’t have to give the robot any additional training data in the environment, it just works. On the con side, it can only pick an object up and drop it somewhere else. You can’t ask it to open a drawer, because it only knows how to do those two things.

- Lerrel Pinto, Assistant Professor of Computer Science at New York University, who co-led the project

OK-Robot does not only work in the laboratory and has been tested in 10 different rooms (image: arXiv)

The system was tested by researchers from New York University and Meta using Stretch, a commercial robot from Hello Robot, in 171 pick-and-drop experiments across different homes. During the experiments, the robot first scanned the environment using the Record3D iPhone app to create a 3D video; OK-Robot then ran an AI object-detection model over each frame of the video.

This enabled the robot to identify all the objects in its environment, such as a table, a sofa, a pair of glasses, a shoe and a lamp. It was then instructed to pick up certain objects, which it did in 82.2 % of the cases, provided that the room was not too cluttered. In rooms that were more chaotic, however, the success rate was significantly lower.
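The per-frame detection step described above can be sketched as follows. The detector output format and the helper function are assumptions for illustration, not OK-Robot's actual interfaces: an open-vocabulary detector emits labeled, scored detections for every frame of the scan, and the system then selects the most confident detection matching the user's request.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str    # object class predicted by the detector
    score: float  # detector confidence in [0, 1]
    frame: int    # which video frame the detection came from

def best_detection(detections, query):
    """Pick the highest-confidence detection whose label matches the query."""
    matches = [d for d in detections if d.label == query]
    return max(matches, key=lambda d: d.score) if matches else None

# Hypothetical per-frame detector output from the scanned 3D video.
detections = [
    Detection("sofa", 0.91, frame=3),
    Detection("shoe", 0.62, frame=7),
    Detection("shoe", 0.88, frame=12),
    Detection("lamp", 0.79, frame=20),
]

target = best_detection(detections, "shoe")
print(target.frame)  # → 12 (the most confident "shoe" detection)
```

Because each frame of the 3D scan carries camera pose information, localizing the best detection in a frame is enough to derive a navigation target for the robot.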

I would say it’s quite unusual to be completely reliant on off-the-shelf models, and that it’s quite impressive to make them work. We’ve seen a revolution in machine learning that has made it possible to create models that work not just in laboratories, but in the open world. Seeing that this actually works in a real physical environment is very useful information.

- Matthias Minderer, a senior computer vision research scientist at Google DeepMind, who was not involved in the project

OK-Robot uses Open Knowledge models such as CLIP, Lang-SAM, AnyGrasp and OWL-ViT (image: arXiv)

The system is still a long way from perfection; for example, it sometimes has difficulty understanding speech input, and its grasping model also has problems with some objects. Nevertheless, the project shows that the current models are able to cope relatively well with an open vocabulary and, at the same time, are able to navigate directly to the right objects in unfamiliar spaces.

Sources

MIT Technology Review | VentureBeat | teaser image: symbolic image by DALL-E / AI | images 2, 3: arXiv

Nicole Dominikowski, 2024-02-12 (Update: 2024-02-12)