Christopher McFadden of Interesting Engineering reported that researchers from Wuhan University have recently developed a new framework that could help robots manipulate objects more easily. Introduced in a new paper on arXiv, this approach should enable humanoid robots to grasp and handle a greater variety of objects than is currently possible.
At present, humanoid robots are great at tasks like using tools, grasping, and walking, but they suffer from inherent limitations. They often fail when an object's shape changes or the lighting shifts.
They can also struggle to complete tasks they haven't been specifically trained to do. This lack of generalization is widely seen as one of the technology's major limitations.
To help overcome this, the Wuhan team set out to develop what it calls the recurrent geometric-prior multimodal policy, or RGMP for short. The framework is designed to give humanoid robots a kind of built-in common sense about things like shapes and space.
It also gives robots a way to better select the skills a task requires, along with a more data-efficient way to learn movement patterns.
Ultimately, the goal is to help robots pick the right action and adapt to new environments with far less training data than before. According to the team, RGMP consists of two main parts.
The first is called the Geometric-Prior Skill Selector (GSS), which helps the robot decide which of its "tools" and skills is best suited to a task. Drawing on input from its cameras, the robot uses GSS to work out an object's shape, size, and orientation.
With this information in hand (so to speak), the robot can then work out what needs to be done to complete a given task (e.g., pick up, push, grip, or hold with two hands).
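To make the idea of geometry-driven skill selection more concrete, here is a minimal, purely illustrative sketch in Python. The class, skill names, and rule thresholds below are assumptions invented for illustration; the actual GSS presumably relies on learned geometric priors rather than hand-coded rules like these.

```python
# Illustrative sketch only -- names and thresholds are assumptions,
# not the GSS described in the paper.
from dataclasses import dataclass

@dataclass
class ObjectGeometry:
    width_m: float        # estimated width from camera perception
    height_m: float       # estimated height
    is_deformable: bool   # e.g. cloth or a bag vs. a rigid box
    orientation_deg: float

def select_skill(geom: ObjectGeometry) -> str:
    """Map perceived geometry to a discrete manipulation skill."""
    if geom.is_deformable:
        return "two_hand_hold"    # soft objects need support from both hands
    if geom.width_m > 0.30:
        return "bimanual_grasp"   # too wide for a single gripper
    if geom.height_m < 0.05:
        return "push"             # thin, flat objects are easier to slide
    return "single_hand_grasp"

# Example: a thin, rigid tray lying flat on the table
print(select_skill(ObjectGeometry(0.25, 0.03, False, 90.0)))  # -> "push"
```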
The second is called the Adaptive Recursive Gaussian Network (ARGN). Once the robot picks a skill, ARGN helps it actually perform the task. It achieves this by modeling the spatial relationships between the robot and the object.
It also predicts movements step by step, and is extremely data-efficient, needing far fewer training examples than typical deep learning methods.
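One way to read that description is as a model that recursively predicts each next movement as a Gaussian conditioned on the robot-object spatial relation. The toy sketch below illustrates that reading only; the linear "network", random placeholder weights, and function names are assumptions for illustration and are not the ARGN architecture from the paper.

```python
# Hypothetical sketch of recursive, step-by-step motion prediction.
# Structure and weights are placeholders, not the paper's ARGN.
import numpy as np

rng = np.random.default_rng(0)

# Toy "network": a linear map from the current relative pose (hand
# relative to object) to the mean and log-variance of the next step.
W_mu = rng.normal(scale=0.1, size=(3, 3))
W_logvar = rng.normal(scale=0.1, size=(3, 3))

def predict_step(rel_pose: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Predict a Gaussian over the next hand displacement."""
    mu = W_mu @ rel_pose
    var = np.exp(W_logvar @ rel_pose)
    return mu, var

def rollout(hand: np.ndarray, obj: np.ndarray, steps: int = 10) -> np.ndarray:
    """Recursively apply the one-step model to build a full trajectory."""
    trajectory = [hand.copy()]
    for _ in range(steps):
        mu, _ = predict_step(obj - hand)  # spatial relation drives the step
        hand = hand + mu                  # feed the prediction back in
        trajectory.append(hand.copy())
    return np.stack(trajectory)

print(rollout(np.zeros(3), np.array([0.4, 0.1, 0.2])).shape)  # (11, 3)
```

Keeping the per-step model small and feeding its own predictions back in is one plausible reason such an approach could need far fewer demonstrations than a large end-to-end policy.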
This combination of GSS and ARGN helps robots complete tasks without needing thousands of demonstrations. In testing, robots using the framework achieved an impressive 87 percent success rate on novel tasks they had no prior experience completing.
The team also found that the framework is around five times more data-efficient than diffusion-policy-based models, which are currently the state of the art. That kind of efficiency could prove very important in the future.
If robots can reliably manipulate objects without being retrained for each new situation, they could actually be put to work around the home to clean, tidy, and perhaps even cook.
