Research

Exploring the frontiers of robotics and artificial intelligence.

Recent Publications

  • CVPR 2025

    PartRM: Modeling Part-Level Dynamics with Large Cross-State Reconstruction Model

    Mingju Gao*, Yike Pan*, Huan-ang Gao*, Zongzheng Zhang, Wenyi Li, Hao Dong, Hao Tang, Li Yi, Hao Zhao
    Corresponding Author * Equal Contribution

    PartRM is a novel 4D reconstruction framework that can serve as a general dynamics model for articulated objects. From multi-view images of a static object, it models appearance, geometry, and part-level motion together, avoiding the limitations of 2D video representations and slow processing times. Key features include: (1) 3D-aware output using Gaussian splatting instead of 2D videos, (2) a feed-forward architecture that leverages pretrained large reconstruction models without iterative diffusion denoising, and (3) generalizability, with strong performance on in-the-wild objects.

  • CVPR 2025 Highlight & ECCV WS Best Paper

    RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins

    Yao Mu*, Tianxing Chen*, Zanxin Chen*, Shijia Peng*, Zeyu Gao, Zhixuan Liang, Qiaojun Yu, Yude Zou, Mingkun Xu, Lunkai Lin, Zhiqiang Xie, Mingyu Ding, Ping Luo
    Corresponding Author * Equal Contribution

    Using the COBOT Magic platform, we collected diverse data on tool usage, human-robot interaction, and mobile manipulation. We present a cost-effective approach to creating digital twins with AI-generated content, transforming 2D images into detailed 3D models. We also use large language models to generate expert-level training data and task-specific, functionality-oriented pose sequences.

  • CVPR 2025

    G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation

    Tianxing Chen*, Yao Mu*, Zhixuan Liang*, Zanxin Chen, Shijia Peng, Qiangyu Chen, Mingkun Xu, Ruizhen Hu, Hongyuan Zhang, Xuelong Li, Ping Luo
    Corresponding Author * Equal Contribution

    We present G3Flow, a novel approach that leverages foundation models to generate and maintain 3D semantic flow for enhanced robotic manipulation.