Robot overcomes uncertainty to retrieve buried objects

For humans, finding a lost wallet buried under a pile of items is pretty straightforward — we simply remove things from the pile until we find the wallet. But for a robot, this task involves complex reasoning about the pile and objects in it, which presents a steep challenge.

MIT researchers previously demonstrated a robotic arm that combines visual information and radio frequency (RF) signals to find hidden objects that were tagged with RFID tags (which reflect signals sent by an antenna). Building on that work, they have now developed a new system that can efficiently retrieve any object buried in a pile. As long as some items in the pile have RFID tags, the target item does not need to be tagged for the system to recover it.

The algorithms behind the system, known as FuseBot, reason about the probable location and orientation of objects under the pile. Then FuseBot finds the most efficient way to remove obstructing objects and extract the target item. This reasoning enabled FuseBot to find more hidden items than a state-of-the-art robotics system, in half the time.

This speed could be especially useful in an e-commerce warehouse. A robot tasked with processing returns could find items in an unsorted pile more efficiently with the FuseBot system, says senior author Fadel Adib, associate professor in the Department of Electrical Engineering and Computer Science and director of the Signal Kinetics group in the Media Lab.

“What this paper shows, for the first time, is that the mere presence of an RFID-tagged item in the environment makes it much easier for you to achieve other tasks in a more efficient manner. We were able to do this because we added multimodal reasoning to the system — FuseBot can reason about both vision and RF to understand a pile of items,” adds Adib.

Joining Adib on the paper are research assistants Tara Boroushaki, who is the lead author; Laura Dodds; and Nazish Naeem. The research will be presented at the Robotics: Science and Systems conference.

Targeting tags

A recent market report indicates that more than 90 percent of U.S. retailers now use RFID tags, but the technology is not universal, leading to situations in which only some objects within piles are tagged.

This problem inspired the group’s research.

With FuseBot, a robotic arm uses an attached video camera and RF antenna to retrieve an untagged target item from a mixed pile. The system scans the pile with its camera to create a 3D model of the environment. Simultaneously, it sends signals from its antenna to locate RFID tags. These radio waves can pass through most solid surfaces, so the robot can “see” deep into the pile. Since the target item is not tagged, FuseBot knows the item cannot be located at the exact same spot as an RFID tag.

Algorithms fuse this information to update the 3D model of the environment and highlight potential locations of the target item, whose size and shape the robot already knows. Then the system reasons about the objects in the pile and RFID tag locations to determine which item to remove, with the goal of finding the target item with the fewest moves.
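To make the fusion step concrete, here is a minimal sketch of how camera and RF information might be combined to narrow down where an untagged target could be. It is an illustration only, not FuseBot's actual algorithm: the grid of cells, the function name `candidate_cells`, and the `exclusion_radius` parameter are all assumptions made for this example.

```python
from math import dist

def candidate_cells(occupied, tag_positions, exclusion_radius):
    """Return pile cells where an untagged target could still be hiding.

    occupied: cells the camera model says are inside the pile (hypothetical grid).
    tag_positions: RFID tag locations estimated from the RF signals.
    exclusion_radius: assumed distance around a tag taken up by the tagged item.
    """
    return {
        cell
        for cell in occupied
        if all(dist(cell, tag) > exclusion_radius for tag in tag_positions)
    }

# Hypothetical flat pile covering a 4 x 4 grid of cells.
occupied = {(x, y, 0) for x in range(4) for y in range(4)}
# Two tagged items located by the antenna, at opposite corners.
tags = [(0, 0, 0), (3, 3, 0)]

candidates = candidate_cells(occupied, tags, exclusion_radius=1.0)
print(len(candidates), "of", len(occupied), "cells remain as candidates")
```

In this toy pile, each tag rules out its own cell and its two direct neighbors, so the search space shrinks before the robot has moved anything.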

It was challenging to incorporate this reasoning into the system, says Boroushaki.

The robot is unsure how objects are oriented under the pile, or how a squishy item might be deformed by heavier items pressing on it. It overcomes this challenge with probabilistic reasoning, using what it knows about the size and shape of an object and its RFID tag location to model the 3D space that object is likely to occupy.
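One way to picture this kind of probabilistic reasoning is a sampling sketch: if an item's size is known but its rotation is not, the robot can estimate how likely each point is to be covered by trying many rotations. The code below is a simplified 2D illustration under stated assumptions, not the researchers' model: it treats the item as a rigid rectangle, assumes the tag sits at the item's center, and assumes every rotation is equally likely. The function name and parameters are hypothetical.

```python
import math
import random

def occupancy_probability(point, tag, length, width, samples=2000, seed=0):
    """Estimate the chance that `point` lies inside a length x width rectangle
    centered on `tag` when the rectangle's rotation is unknown (assumed uniform)."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    dx, dy = point[0] - tag[0], point[1] - tag[1]
    hits = 0
    for _ in range(samples):
        theta = rng.uniform(0.0, math.pi)
        # Express the offset from the tag in the rectangle's own frame.
        u = dx * math.cos(theta) + dy * math.sin(theta)
        v = -dx * math.sin(theta) + dy * math.cos(theta)
        if abs(u) <= length / 2 and abs(v) <= width / 2:
            hits += 1
    return hits / samples

tag = (0.0, 0.0)
near = occupancy_probability((0.3, 0.0), tag, length=4.0, width=1.0)
middle = occupancy_probability((1.5, 0.0), tag, length=4.0, width=1.0)
far = occupancy_probability((3.0, 0.0), tag, length=4.0, width=1.0)
print(near, middle, far)
```

Points very close to the tag are covered in every rotation, points beyond the item's reach are never covered, and points in between get an intermediate probability. A deformable item would need a richer model than this rigid rectangle.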

As it removes items, the robot also reasons about which item would be “best” to remove next.

“If I give a human a pile of items to search, they will most likely remove the biggest item first to see what is underneath it. What the robot does is similar, but it also incorporates RFID information to make a more informed decision. It asks, ‘How much more will it understand about this pile if it removes this item from the surface?’” Boroushaki says.
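The question Boroushaki describes can be sketched as an expected information gain calculation. The example below is a hedged illustration rather than FuseBot's actual decision rule: it assumes the robot holds a probability for each possible target location (a "belief"), that removing a surface item fully reveals a known set of locations, and that the best item is simply the one that reduces uncertainty the most. All names and numbers are hypothetical.

```python
import math

def entropy(belief):
    """Shannon entropy (in bits) of a belief over possible target locations."""
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)

def expected_information_gain(belief, revealed):
    """Expected drop in uncertainty if the locations in `revealed` are uncovered."""
    p_found = sum(belief[cell] for cell in revealed)
    p_miss = 1.0 - p_found
    if p_miss <= 0:
        return entropy(belief)  # the target is certain to be revealed
    # If the target is not revealed, renormalize over the still-hidden locations.
    posterior = {c: p / p_miss for c, p in belief.items() if c not in revealed}
    return entropy(belief) - p_miss * entropy(posterior)

def best_item_to_remove(belief, reveals):
    """Pick the surface item whose removal is expected to teach the most."""
    return max(reveals, key=lambda item: expected_information_gain(belief, reveals[item]))

# RF data has already ruled out locations "a" and "b" (a tagged item sits there).
belief = {"a": 0.0, "b": 0.0, "c": 0.5, "d": 0.5}
# The big blanket covers two locations, the small book covers one.
reveals = {"blanket": {"a", "b"}, "book": {"c"}}

print(best_item_to_remove(belief, reveals))
```

Here the larger blanket would be a person's natural first pick, but it only uncovers locations the RF data has already excluded, so the sketch selects the book instead.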

After it removes an object, the robot scans the pile again and uses new information to optimize its strategy.

Retrieval results

This reasoning, along with the use of RF signals, gave FuseBot an edge over a state-of-the-art system that used only vision. The team ran more than 180 experimental trials using real robotic arms and piles of household items, like office supplies, stuffed animals, and clothing. They varied the sizes of the piles and the number of RFID-tagged items in each pile.

FuseBot extracted the target item successfully 95 percent of the time, compared to 84 percent for the other robotic system. It accomplished this using 40 percent fewer moves, and was able to locate and retrieve targeted items more than twice as fast.

“We see a big improvement in the success rate by incorporating this RF information. It was also exciting to see that we were able to match the performance of our previous system, and exceed it in scenarios where the target item didn’t have an RFID tag,” Dodds says.

FuseBot could be applied in a variety of settings because the software that performs its complex reasoning can be implemented on any computer — it just needs to communicate with a robotic arm that has a camera and antenna, Boroushaki adds.

In the near future, the researchers are planning to incorporate more complex models into FuseBot so it performs better on deformable objects. Beyond that, they are interested in exploring different manipulations, such as a robotic arm that pushes items out of the way. Future iterations of the system could also be used with a mobile robot that searches multiple piles for lost objects.

This work was funded, in part, by the National Science Foundation, a Sloan Research Fellowship, NTT DATA, Toppan, Toppan Forms, and the MIT Media Lab.

