Lumina: Embodied AI Community

CVPR 2025

PartRM: Modeling Part-Level Dynamics with Large Cross-State Reconstruction Model

Mingju Gao*, Yike Pan*, Huan-ang Gao*, Zongzheng Zhang, Wenyi Li, Hao Dong, Hao Tang, Li Yi, Hao Zhao†

† Corresponding Author, * Equal Contribution

PartRM is a novel 4D reconstruction framework that can serve as a general dynamics model for articulated objects. It models appearance, geometry, and part-level motion from multi-view images of a static object, overcoming the limitations of 2D video representations and slow processing times. Key features include: (1) 3D-aware output, represented as 3D Gaussian splats rather than 2D videos; (2) a feed-forward architecture that leverages pretrained large reconstruction models and needs no iterative diffusion denoising; and (3) generalizability, with strong performance on in-the-wild objects.

Paper | Code | arXiv | Project Page

CVPR 2025 Highlight & ECCV Workshop Best Paper

RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins

Yao Mu*, Tianxing Chen*, Zanxin Chen*, Shijia Peng*, Zeyu Gao, Zhixuan Liang, Qiaojun Yu, Yude Zou, Mingkun Xu, Lunkai Lin, Zhiqiang Xie, Mingyu Ding, Ping Luo†

† Corresponding Author, * Equal Contribution

Using the COBOT Magic platform, we have collected diverse data on tool usage, human-robot interaction, and mobile manipulation. We present a cost-effective approach to creating digital twins with AI-generated content, transforming 2D images into detailed 3D models. We also use large language models to generate expert-level training data and task-specific, functionality-oriented pose sequences.

Paper | Code | arXiv | Project Page

CVPR 2025

G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation

Tianxing Chen*, Yao Mu*, Zhixuan Liang*, Zanxin Chen, Shijia Peng, Qiangyu Chen, Mingkun Xu, Ruizhen Hu, Hongyuan Zhang, Xuelong Li, Ping Luo†

† Corresponding Author, * Equal Contribution

We present G3Flow, a novel approach that leverages foundation models to generate and maintain 3D semantic flow for enhanced robotic manipulation.

Paper | Code | arXiv | Project Page
