RoboDuet: A Framework Affording Mobile-Manipulation and Cross-Embodiment


Guoping Pan* 1         Qingwei Ben* 1,3         Zhecheng Yuan1,2         Guangqi Jiang2,4        
Yandong Ji5         Jiangmiao Pang3         Houde Liu1         Huazhe Xu1,2,3        
1 Tsinghua University      2 Shanghai Qi Zhi Institute      3 Shanghai AI Lab      4 Sichuan University      5 UC San Diego
* Equal contribution
arXiv Paper Code (coming soon)

Abstract

Combining the mobility of legged robots with the manipulation skills of arms has the potential to significantly expand the operational range and enhance the capabilities of robotic systems in performing various mobile manipulation tasks. Existing approaches are confined to imprecise 6-DoF manipulation and possess a limited arm workspace. In this paper, we propose RoboDuet, a novel framework that employs two collaborative policies to realize locomotion and manipulation simultaneously, achieving whole-body control through the interaction between the two policies. Surprisingly, going beyond large-range pose tracking, we find that the two-policy framework may enable cross-embodiment deployment, such as using different quadrupedal robots or other arms. Our experiments demonstrate that the policies trained through RoboDuet can accomplish stable gaits, agile 6D end-effector pose tracking, and zero-shot exchange of legged robots, and can be deployed in the real world to perform various mobile manipulation tasks.

Method

Cooperative policy for whole-body control. RoboDuet consists of a loco policy for locomotion and an arm policy for manipulation. The two policies are harmonized into a single whole-body controller: the loco policy adjusts its actions by following instructions from the arm policy. The goal of the loco policy \( \pi_{loco} \) is to follow a target command \( \mathbf{c}_t \). The goal of the arm policy \( \pi_{arm} \) is to accurately track the 6-DoF end-effector pose. The action of the arm policy consists of two parts. The first six components, \( a^{arm^J}_t \in \mathbb{R}^6 \), are the target joint position offsets for the six arm joint actuators. The remaining components, \( a_t^{arm^G} \), replace the orientation commands, providing additional degrees of freedom for end-effector tracking in cooperation with the loco policy.
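
The action split described above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the `LocoCommand` container, its field layout, the helper names, and the two-element orientation used in the usage note are all assumptions, since the text does not specify the size of \( a_t^{arm^G} \) or the exact contents of \( \mathbf{c}_t \).

```python
from dataclasses import dataclass, replace
from typing import List, Sequence, Tuple

NUM_ARM_JOINTS = 6  # from the text: six arm joint actuators


@dataclass(frozen=True)
class LocoCommand:
    """Hypothetical container for the loco policy's target command c_t."""
    vel_x: float
    omega_z: float
    orientation: Tuple[float, ...]  # assumed layout of the orientation commands


def split_arm_action(arm_action: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Split the arm policy output into joint offsets (a^{arm^J}) and guidance (a^{arm^G})."""
    if len(arm_action) < NUM_ARM_JOINTS:
        raise ValueError("arm action must contain at least six joint offsets")
    joint_offsets = list(arm_action[:NUM_ARM_JOINTS])
    guidance = list(arm_action[NUM_ARM_JOINTS:])
    return joint_offsets, guidance


def apply_arm_guidance(command: LocoCommand, guidance: Sequence[float]) -> LocoCommand:
    """Return a copy of the command whose orientation part is replaced by the guidance."""
    if len(guidance) != len(command.orientation):
        raise ValueError("guidance must match the size of the orientation command")
    return replace(command, orientation=tuple(guidance))
```

For example, with an assumed two-element orientation command, `split_arm_action([0.1] * 6 + [0.2, -0.3])` yields six joint offsets plus the guidance `[0.2, -0.3]`, which `apply_arm_guidance` then writes into the command passed to the loco policy while leaving the velocity targets untouched.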

Two-stage training. To achieve both robust locomotion and flexible manipulation, we adopt a two-stage training strategy. Stage 1 focuses on obtaining robust locomotion capability; its design is inspired by a powerful blind locomotion algorithm. Stage 2 coordinates locomotion and manipulation to achieve large-range, whole-body mobile manipulation; in this stage, the arm policy is activated together with all of the robotic arm joints.
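
As a rough illustration of the schedule, the sketch below maps a training iteration to the components that are active in each stage. The names `StageConfig` and `stage_for_iteration`, the `stage1_iterations` parameter, and the choice to keep training the loco policy during Stage 2 are assumptions made for this example; the actual switch criterion and training details are given in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageConfig:
    """Hypothetical description of what is active in a training stage."""
    stage: int
    train_loco_policy: bool
    arm_policy_active: bool
    arm_joints_active: bool


def stage_for_iteration(iteration: int, stage1_iterations: int) -> StageConfig:
    """Select the stage for a given iteration (stage1_iterations is an assumed budget)."""
    if iteration < 0 or stage1_iterations < 0:
        raise ValueError("iteration counts must be non-negative")
    if iteration < stage1_iterations:
        # Stage 1: locomotion only; the arm policy and arm joints are not yet activated.
        return StageConfig(1, True, False, False)
    # Stage 2: the arm policy is activated together with all arm joints.
    # Assumption: the loco policy continues to be trained in this stage.
    return StageConfig(2, True, True, True)
```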

An overview of RoboDuet

Experiments

Baselines and Metrics

To validate the significance of the two-stage training and the cooperative policy, which are the key components of RoboDuet, we establish a Baseline algorithm that trains a unified policy in a single stage. The Two-Stage algorithm modifies this baseline by switching from one-stage to two-stage training, while the Cooperated algorithm builds on the baseline by replacing the unified policy with a cooperative policy. RoboDuet itself incorporates both two-stage training and the cooperative policy. Training details for these algorithms can be found in our paper. Metrics of the various training methods are shown below (scaled by \( 10^{-2} \)). The first three categories measure mean errors in robot velocity, end-effector position, and end-effector orientation. The fourth assesses the robot's survival rate against external forces, and the fifth evaluates the robot workspace. "Still" tests are run with \( vel_x \) and \( \omega_z \) set to zero, while "Move" tests use values within the command range limits.

Baselines and Metrics
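
The four compared algorithms form a 2×2 ablation over the two design choices. The sketch below encodes that grid; the `Variant` class and `describe` helper are hypothetical names introduced only for illustration, while the variant names and their settings follow the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One ablation variant: which of the two key components it uses."""
    name: str
    two_stage: bool     # two-stage training instead of one-stage
    cooperative: bool   # cooperative (loco + arm) policy instead of a unified policy


VARIANTS = (
    Variant("Baseline", two_stage=False, cooperative=False),
    Variant("Two-Stage", two_stage=True, cooperative=False),
    Variant("Cooperated", two_stage=False, cooperative=True),
    Variant("RoboDuet", two_stage=True, cooperative=True),
)


def describe(variant: Variant) -> str:
    """Render a variant as a short human-readable summary."""
    training = "two-stage" if variant.two_stage else "one-stage"
    policy = "cooperative" if variant.cooperative else "unified"
    return f"{variant.name}: {training} training, {policy} policy"
```

Laying the variants out this way makes it explicit that each component is toggled independently, so differences between rows of the metrics table can be attributed to one component at a time.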

Whole-body control


Discrete Commands Following


More videos are coming soon...




BibTeX

@inproceedings{pan2024roboduet,
    title={RoboDuet: A Framework Affording Mobile-Manipulation and Cross-Embodiment}, 
    author={Guoping Pan and Qingwei Ben and Zhecheng Yuan and Guangqi Jiang and Yandong Ji and Jiangmiao Pang and Houde Liu and Huazhe Xu},
    year={2024},
    eprint={2403.17367},
    archivePrefix={arXiv},
    primaryClass={cs.RO}
}