Publications

[SmartBot26] T-800: An 800 Hz Data Glove for Precise Hand Gesture Tracking

Human dexterity relies on rapid, sub-second motor adjustments, yet capturing these high-frequency dynamics remains an enduring …

Haoyang Luo, Zihang Zhao, Leiyao Cui, Saiyao Zhang, Liu Yang, Zhi Han, Xiyuan Tang, Yixin Zhu

[IJCV26] AlphaChimp: Tracking and Behavior Recognition of Chimpanzees

Understanding non-human primate behavior is essential for advancing animal welfare and uncovering the roots of human sociality. …

Xiaoxuan Ma, Yutang Lin, Yuan Xu, Stephan Kaufhold, Jack Terwilliger, Andres Meza, Yixin Zhu, Federico Rossano, Yizhou Wang

[CogSci26] Timing is Everything: Temporal Scaffolding of Semantic Surprise in Humor

Humor is a fundamental cognitive phenomenon in which humans derive pleasure from the expectation violations and their resolution, …

Yuxi Ma (Yuki), Yongqian Peng, Junchen Lyu, Chi Zhang, Yixin Zhu

[CogSci26] Rational Communication Shapes Morphological Composition

Human languages expand vocabularies by combining existing morphemes rather than inventing arbitrary forms. Communicative efficiency …

Fengyuan Yang, Yongqian Peng, Yuxi Ma (Yuki), Chenheng Xu, Yixin Zhu

[CogSci26] Multi-Level Narrative Evaluation Outperforms Lexical Features for Mental Health

How people narrate their experiences offers a window into how the mind organizes them. Computational approaches to therapeutic writing …

Yuxi Ma (Yuki), Jieming Cui, Muyang Li, Ye Zhao, Yu Li, Chi Zhang, Yinyin Zang, Yixin Zhu

[CogSci26] Overhang Tower: Resource-Rational Adaptation in Sequential Physical Planning

Humans effortlessly navigate the physical world by predicting how objects behave under gravity and contact forces, yet how such …

Ruihong Shen, Shiqian Li, Yixin Zhu

[CogSci26] Grounding Before Generalizing: How AI Differs from Humans in Causal Transfer

Extracting abstract causal structures and applying them to novel situations is a hallmark of human intelligence (Griffiths & …

Liangru Xiang, Yuxi Ma (Yuki), Zhihao Cao, Yixin Zhu, Song-Chun Zhu

[T-ASE26] TacMan-Turbo: Proactive Tactile Control for Robust and Efficient Articulated Object Manipulation

Adept manipulation of articulated objects is essential for robots to operate successfully in human environments. Such manipulation …

Zihang Zhao, Zhenghao Qi, Yuyang Li, Leiyao Cui, Zhi Han, Lecheng Ruan, Yixin Zhu

[ACL-Findings26] Code over Words: Overcoming Semantic Inertia via Code-Grounded Reasoning

LLMs struggle with Semantic Inertia: the inability to inhibit pre-trained priors (e.g., “Lava is Dangerous”) when dynamic, in-context …

Manjie Xu, Isabella Yin, Xinyi Tu, Chi Zhang, Yixin Zhu

[CVPR26] MotionMaster: Generalizable Text-Driven Motion Generation and Editing

Synthesizing realistic human motion from natural language holds transformative potential for animation, robotics, and virtual reality. …

Nan Jiang, Yunhao Li, Lexi Pang, Zimo He, Siyuan Huang, Yixin Zhu

[CVPR26] Scalable Trajectory Generation for Whole-Body Mobile Manipulation

Robots deployed in unstructured environments must coordinate whole-body motion—simultaneously moving a mobile base and arm—to interact …

Yida Niu, Xinhai Chang, Xin Liu, Ziyuan Jiao, Yixin Zhu

[ICRA26] Vi-TacMan: Articulated Object Manipulation via Vision and Touch

Autonomous manipulation of articulated objects remains a fundamental challenge for robots in human environments. Vision-based methods …

Leiyao Cui, Zihang Zhao, Sirui Xie, Wenhuan Zhang, Zhi Han, Yixin Zhu

[ICLR26] Learning physics-grounded 4D dynamics with neural gaussian force fields

Predicting physical dynamics from raw visual data remains a major challenge in AI. While recent video generation models have achieved …

Shiqian Li, Ruihong Shen, Junfeng Ni, Chang Pan, Chi Zhang, Yixin Zhu

[ICLR26] Neural Force Field: Few-shot Learning of Generalized Physical Reasoning

Physical reasoning is a remarkable human ability that enables rapid learning and generalization from limited experience. Current AI …

Shiqian Li, Ruihong Shen, Yaoyu Tao, Chi Zhang, Yixin Zhu

[CHI26] NarrativeLoom: Enhancing Creative Storytelling through Multi-Persona Collaborative Improvisation

Large Language Models show promise for AI-assisted storytelling, yet current tools often generate predictable, unoriginal narratives. …

Yuxi Ma (Yuki), Yongqian Peng, Fengyuan Yang, Siyu Zha, Chi Zhang, Zixia Jia, Zilong Zheng, Yixin Zhu

[NeurIPS25] Heterogeneous Adversarial Play in Interactive Environments

Self-play constitutes a fundamental paradigm for autonomous skill acquisition, whereby agents iteratively enhance their capabilities …

Manjie Xu, Xinyi Yang, Jiayu Zhan, Wei Liang, Chi Zhang, Yixin Zhu

[NeurIPS25] DrivAerStar: An Industrial-Grade CFD Dataset for Vehicle Aerodynamic Optimization

Vehicle aerodynamics optimization has become critical for automotive electrification, where drag reduction directly determines electric …

Jiyan Qiu, Lyulin Kuang, Guan Wang, Yichen Xu, Leiyao Cui, Shaotong Fu, Yixin Zhu, Ruihua Zhang

[T-RO25] Integration of Robot and Scene Kinematics for Sequential Mobile Manipulation Planning

We present a Sequential Mobile Manipulation Planning (SMMP) framework that can solve long-horizon multi-step mobile manipulation tasks …

Ziyuan Jiao, Yida Niu, Zeyu Zhang, Yangyang Wu, Yao Su, Yixin Zhu, Hangxin Liu, Song-Chun Zhu

[RA-L25] B*: Efficient and Optimal Base Placement for Fixed-Base Manipulators

Proper base placement is crucial for task execution feasibility and performance of fixed-base manipulators, the dominant solution in …

Zihang Zhao, Leiyao Cui, Sirui Xie, Saiyao Zhang, Zhi Han, Lecheng Ruan, Yixin Zhu

[CoRL25] CLONE: Closed-Loop Whole-Body Humanoid Teleoperation for Long-Horizon Tasks

Humanoid robot teleoperation plays a vital role in demonstrating and collecting data for complex interactions. Current methods suffer …

Yixuan Li, Yutang Lin, Jieming Cui, Tengyu Liu, Wei Liang, Yixin Zhu, Siyuan Huang

[IROS25] Ag2x2: Robust Agent-Agnostic Visual Representations for Zero-Shot Bimanual Manipulation

Bimanual manipulation, fundamental to human daily activities, remains a challenging task due to its inherent complexity of coordinated …

Ziyin Xiong, Yinghan Chen, Puhao Li, Yixin Zhu, Tengyu Liu, Siyuan Huang

[NatureMachineIntelligence25] Embedding high-resolution touch across robotic hands enables adaptive human-like grasping

Developing robotic hands that adapt to real-world dynamics remains a fundamental challenge in robotics and machine intelligence. …

Zihang Zhao, Wanlin Li, Yuyang Li, Tengyu Liu, Boren Li, Meng Wang, Kai Du, Hangxin Liu, Yixin Zhu, Qining Wang, Kaspar Althoefer, Song-Chun Zhu

[CogSci25] Probing and Inducing Combinational Creativity in Vision-Language Models

The ability to combine existing concepts into novel ideas stands as a fundamental hallmark of human intelligence. Recent advances in …

Yongqian Peng, Yuxi Ma (Yuki), Mengmeng Wang, Yuxuan Wang, Yizhou Wang, Chi Zhang, Yixin Zhu, Zilong Zheng

[CogSci25] Word Embeddings Track Social Group Changes Across 70 Years in China

Language encodes societal beliefs about social groups through word patterns. While computational methods like word embeddings enable …

Yuxi Ma (Yuki), Yongqian Peng, Yixin Zhu

[CogSci25] A simulation-heuristics dual-process model for intuitive physics

The role of mental simulation in human physical reasoning is widely acknowledged, but whether it is employed across scenarios with …

Shiqian Li, Yuxi Ma (Yuki), Jiajun Yan, Bo Dai, Yujia Peng, Chi Zhang, Yixin Zhu

[T-RO25] Tac-Man: Tactile-Informed Prior-Free Manipulation of Articulated Objects

Integrating robotics into human-centric environments such as homes, necessitates advanced manipulation skills as robotic devices will …

Zihang Zhao, Yuyang Li, Wanlin Li, Zhenghao Qi, Lecheng Ruan, Yixin Zhu, Kaspar Althoefer

[RA-L24] MiniTac: An Ultra-Compact 8 mm Vision-Based Tactile Sensor for Enhanced Palpation in Robot-Assisted Minimally Invasive Surgery

Robot-assisted minimally invasive surgery (RAMIS) provides substantial benefits over traditional open and laparoscopic methods. …

Wanlin Li, Zihang Zhao, Leiyao Cui, Weiyi Zhang, Hangxin Liu, Li-an Li, Yixin Zhu

[NeurIPS24] PhyRecon: Physically Plausible Neural Scene Reconstruction

Neural implicit representations have gained popularity in multi-view 3D reconstruction. However, most previous work struggles to yield …

Junfeng Ni, Yixin Chen, Bohan Jing, Nan Jiang, Bin Wang, Bo Dai, Puhao Li, Yixin Zhu, Song-Chun Zhu, Siyuan Huang

[SIGGRAPHAsia24] Autonomous Character-Scene Interaction Synthesis from Text Instruction

Synthesizing human motions in 3D environments, particularly those with complex activities such as locomotion, hand-reaching, and …

Nan Jiang, Zimo He, Zi Wang, Hongjie Li, Yixin Chen, Siyuan Huang, Yixin Zhu

[IROS24] Ag2Manip: Learning Novel Manipulation Skills with Agent-Agnostic Visual and Action Representations

Autonomous robotic systems capable of learning novel manipulation tasks are poised to transform industries from manufacturing to …

Puhao Li, Tengyu Liu, Yuyang Li, Muzhi Han, Haoran Geng, Shu Wang, Yixin Zhu, Song-Chun Zhu, Siyuan Huang

[ECCV24] Zero-Shot Image Feature Consensus with Deep Functional Maps

Correspondences emerge from large-scale vision models trained for generative and discriminative tasks. This has been revealed and …

Xinle Cheng, Congyue Deng, Adam Harley, Yixin Zhu, Leonidas Guibas

[CogSci24] Evaluating and Modeling Social Intelligence: A Comparative Study of Human and AI Capabilities

Facing the current debate on whether Large Language Models (LLMs) attain near-human intelligence levels (Mitchell & Krakauer, 2023; …

Junqi Wang, Chunhui Zhang, Jiapeng Li, Yuxi Ma (Yuki), Lixing Niu, Jiaheng Han, Yujia Peng, Yixin Zhu, Lifeng Fan

[ScienceAdvances24] Human-level few-shot concept induction through minimax entropy learning

Humans learn concepts both from labeled supervision and by unsupervised observation of patterns, a process machines are being taught to …

Chi Zhang, Baoxiong Jia, Yixin Zhu, Song-Chun Zhu

[CVPR24] AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents

Traditional approaches in physics-based motion generation, centered around imitation learning and reward shaping, often struggle to …

Jieming Cui, Tengyu Liu, Nian Liu, Yaodong Yang, Yixin Zhu, Siyuan Huang

[RA-L24] Grasp Multiple Objects with One Hand

The intricate kinematics of the human hand enable simultaneous grasping and manipulation of multiple objects, essential for tasks such …

Yuyang Li, Bo Liu, Yiran Geng, Puhao Li, Yaodong Yang, Yixin Zhu, Tengyu Liu, Siyuan Huang

[3DV24] Single-view 3D Scene Reconstruction with High-fidelity Shape and Texture

Reconstructing detailed 3D scenes from single-view images remains a challenging task due to limitations in existing approaches, which …

Yixin Chen, Junfeng Ni, Nan Jiang, Yaowei Zhang, Yixin Zhu, Siyuan Huang

[ICLR24] I-PHYRE: Interactive Physical Reasoning

Current evaluation protocols predominantly assess physical reasoning in stationary scenes, creating a gap in evaluating agents’ …

Shiqian Li, Kewen Wu, Chi Zhang, Yixin Zhu

[ICLR24] SparseDFF: Sparse-View Feature Distillation for One-Shot Dexterous Manipulation

Humans demonstrate remarkable skill in transferring manipulation abilities across objects of varying shapes, poses, and appearances, a …

Qianxu Wang, Haotong Zhang, Congyue Deng, Yang You, Hao Dong, Yixin Zhu, Leonidas Guibas

[ICLR24] Neural-Symbolic Recursive Machine for Systematic Generalization

Current learning models often struggle with human-like systematic generalization, particularly in learning compositional rules from …

Qing Li, Yixin Zhu, Yitao Liang, Ying Nian Wu, Song-Chun Zhu, Siyuan Huang

[NeurIPS23] ProBio: A Protocol-guided Multimodal Dataset for Molecular Biology Lab

The challenge of replicating research results has posed a significant impediment to the field of molecular biology. The advent of …

Jieming Cui, Ziren Gong, Baoxiong Jia, Siyuan Huang, Zilong Zheng, Jianzhu Ma, Yixin Zhu

[NeurIPS23] Active Reasoning in an Open-World Environment

Recent advances in vision-language learning have achieved notable success on complete-information question-answering datasets through …

Manjie Xu, Guangyuan Jiang, Wei Liang, Chi Zhang, Yixin Zhu

[NeurIPS23] Evaluating and Inducing Personality in Pre-trained Language Models

Standardized and quantified evaluation of machine behaviors is a crux of understanding LLMs. In this study, we draw inspiration from …

Guangyuan Jiang, Manjie Xu, Song-Chun Zhu, Wenjuan Han, Chi Zhang, Yixin Zhu

[NeurIPS23] Interactive Visual Reasoning under Uncertainty

One of the fundamental cognitive abilities of humans is to quickly resolve uncertainty by generating hypotheses and testing them via …

Manjie Xu, Guangyuan Jiang, Wei Liang, Chi Zhang, Yixin Zhu

[IROS23] Learning a Causal Transition Model for Object Cutting

Cutting objects into desired fragments is challenging for robots due to the spatially unstructured nature of fragments and the complex …

Zeyu Zhang, Muzhi Han, Baoxiong Jia, Ziyuan Jiao, Yixin Zhu, Song-Chun Zhu, Hangxin Liu

[IROS23] Part-level Scene Reconstruction Affords Robot Interaction

Existing methods for reconstructing interactive scenes primarily focus on replacing reconstructed objects with CAD models retrieved …

Zeyu Zhang, Lexing Zhang, Zaijin Wang, Ziyuan Jiao, Muzhi Han, Yixin Zhu, Song-Chun Zhu, Hangxin Liu

[IROS23] Sequential Manipulation Planning for Over-actuated Unmanned Aerial Manipulators

We investigate the sequential manipulation planning problem for unmanned aerial manipulators (UAMs). Unlike prior work that primarily …

Yao Su, Jiarui Li, Ziyuan Jiao, Meng Wang, Chi Chu, Hang Li, Yixin Zhu, Hangxin Liu

[ICML23] On the Complexity of Bayesian Generalization

We examine concept generalization at a large scale in the natural visual spectrum. Established computational modes (i.e., rule-based or …

Yu-Zhe Shi, Manjie Xu, John E. Hopcroft, Kun He, Joshua B. Tenenbaum, Song-Chun Zhu, Ying Nian Wu, Wenjuan Han, Yixin Zhu

[AIR22] Artificial Social Intelligence: A Comparative and Holistic View

In addition to a physical comprehension of the world, humans possess a high social intelligence–the intelligence that senses …

Lifeng Fan, Manjie Xu, Zhihao Cao, Yixin Zhu, Song-Chun Zhu

[ICLR23] A Minimalist Dataset for Systematic Generalization of Perception, Syntax, and Semantics

Inspired by humans’ exceptional ability to master arithmetic and generalize to new problems, we present a new dataset, …

Qing Li, Siyuan Huang, Yining Hong, Yixin Zhu, Ying Nian Wu, Song-Chun Zhu

[ICRA23] GenDexGrasp: Generalizable Dexterous Grasping

Generating dexterous grasping has been a long-standing and challenging robotic task. Despite recent progress, existing methods …

Puhao Li, Tengyu Liu, Yuyang Li, Yiran Geng, Yixin Zhu, Yaodong Yang, Siyuan Huang

[ICRA23] Rearrange Indoor Scenes for Human-Robot Co-Activity

We present an optimization-based framework for rearranging indoor furniture to accommodate human-robot co-activities better. The …

Weiqi Wang, Zihang Zhao, Ziyuan Jiao, Yixin Zhu, Song-Chun Zhu, Hangxin Liu

[Engineering23] A Reconfigurable Data Glove for Reconstructing Physical and Virtual Grasps

In this work, we present a reconfigurable data glove design to capture different modes of human hand-object interactions, which are …

Hangxin Liu, Zeyu Zhang, Ziyuan Jiao, Zhenliang Zhang, Minchen Li, Chenfanfu Jiang, Yixin Zhu, Song-Chun Zhu

[IJCV22] Scene Reconstruction with Functional Objects for Robot Autonomy

In this paper, we rethink the problem of scene reconstruction from an embodied agent’s perspective: While the classic view focuses on …

Muzhi Han, Zeyu Zhang, Ziyuan Jiao, Xu Xie, Yixin Zhu, Song-Chun Zhu, Hangxin Liu

[ECCV22] Learning Algebraic Representation for Systematic Generalization in Abstract Reasoning

Is intelligence realized by connectionist or classicist? While connectionist approaches have achieved superhuman performance, there has …

Chi Zhang, Sirui Xie, Baoxiong Jia, Ying Nian Wu, Song-Chun Zhu, Yixin Zhu

[ECCV22Workshop] PartAfford: Part-level Affordance Discovery from 3D Objects

Understanding what objects could furnish for humans-namely, learning object affordance-is the crux to bridge perception and action. In …

Chao Xu, Yixin Chen, He Wang, Song-Chun Zhu, Yixin Zhu, Siyuan Huang

[IROS22] Sequential Manipulation Planning on Scene Graph

We devise a 3D scene graph representation, contact graph+ (cg+), for efficient sequential manipulation planning. Augmented with …

Ziyuan Jiao, Yida Niu, Zeyu Zhang, Song-Chun Zhu, Yixin Zhu, Hangxin Liu

[IROS22] Downwash-aware Control Allocation for Over-actuated UAV Platforms

Tracking position and orientation independently affords more agile maneuver for over-actuated multirotor Unmanned Aerial Vehicles …

Yao Su, Chi Chu, Meng Wang, Jiarui Li, Liu Yang, Yixin Zhu, Hangxin Liu

[RA-L/IROS22] Understanding Physical Effects for Effective Tool-use

We present a robot learning and planning framework that produces an effective tool-use strategy with the least joint efforts, capable …

Zeyu Zhang, Ziyuan Jiao, Weiqi Wang, Yixin Zhu, Song-Chun Zhu, Hangxin Liu

[ICML22] Latent Diffusion Energy-Based Model for Interpretable Text Modeling

Latent space Energy-Based Models (EBMs), also known as energy-based priors, have drawn growing interests in generative modeling. Fueled …

Peiyu Yu, Sirui Xie, Xiaojian Ma, Baoxiong Jia, Bo Pang, Ruiqi Gao, Yixin Zhu, Ying Nian Wu, Song-Chun Zhu

[JCB2022] Sharing Rewards Undermines Coordinated Hunting

Coordinated hunting is widely observed in animals, and sharing rewards is often considered a major incentive for its success. While …

Minglu Zhao, Ning Tang, Annya Dahmani, Yixin Zhu, Federico Rossano, Tao Gao

[CogSci22] What Is the Point? A Theory of Mind Model of Relevance

Although pointing is sparse, overloaded, and indirect, it allows humans to effectively decode shared information, (ex)change their …

Kaiwen Jiang, Annya Dahmani, Stephanie Stacy, Boxuan Jiang, Federico Rossano, Yixin Zhu, Tao Gao

[RA-L/ICRA22] Object Gathering with a Tethered Robot Duo

We devise a cooperative planning framework to generate optimal trajectories for a tethered robot duo, who is tasked to gather scattered …

Yao Su, Yuhong Jiang, Yixin Zhu, Hangxin Liu

[AAIL21] Patching interpretable And-Or-Graph knowledge representation using augmented reality

We present a novel augmented reality (AR) interface to provide effective means to diagnose a robot’s erroneous behaviors, endow …

Hangxin Liu, Yixin Zhu, Song-Chun Zhu

[NeurIPS21] Unsupervised Foreground Extraction via Deep Region Competition

We present Deep Region Competition (DRC), an algorithm designed to extract foreground objects from images in a fully unsupervised …

Peiyu Yu, Sirui Xie, Xiaojian Ma, Yixin Zhu, Ying Nian Wu, Song-Chun Zhu

[RA-L21] Synthesizing Diverse and Physically Stable Grasps with Arbitrary Hand Structures using Differentiable Force Closure Estimator

Existing grasp synthesis methods are either analytical or data-driven. The former one is oftentimes limited to specific application …

Tengyu Liu, Zeyu Liu, Ziyuan Jiao, Yixin Zhu, Song-Chun Zhu

[ICCV21] YouRefIt: Embodied Reference Understanding with Language and Gesture

We study the machine’s understanding of embodied reference: One agent uses both language and gesture to refer to an object to …

Yixin Chen, Qing Li, Deqian Kong, Yik Lun Kei, Song-Chun Zhu, Tao Gao, Yixin Zhu, Siyuan Huang

[ICCV21] Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds

To date, various 3D scene understanding tasks still lack practical and generalizable pre-trained models, primarily due to the intricate …

Siyuan Huang, Yichen Xie, Song-Chun Zhu, Yixin Zhu

[IROS21] Consolidating Kinematic Models to Promote Coordinated Mobile Manipulations

We construct a Virtual Kinematic Chain (VKC) that readily consolidates the kinematics of the mobile base, the arm, and the object to be …

Ziyuan Jiao, Zeyu Zhang, Xin Jiang, David Han, Song-Chun Zhu, Yixin Zhu, Hangxin Liu

[IROS21] Efficient Task Planning for Mobile Manipulation: a Virtual Kinematic Chain Perspective

We present a Virtual Kinematic Chain (VKC) perspective, a simple yet effective method, to improve task planning efficacy for mobile …

Ziyuan Jiao, Zeyu Zhang, Weiqi Wang, David Han, Song-Chun Zhu, Yixin Zhu, Hangxin Liu

[IROS21] Communicative Learning with Natural Gestures for Embodied Navigation Agents with Human-in-the-Scene

Human-robot collaboration is an essential research topic in artificial intelligence (AI), enabling researchers to devise cognitive AI …

Qi Wu, Cheng-Ju Wu, Yixin Zhu, Jungseock Joo

[CogSci21] Individual vs. Joint Perception: a Pragmatic Model of Pointing as Communicative Smithian Helping

The simple gesture of pointing can greatly augment one’s ability to comprehend states of the world based on observations. It …

Kaiwen Jiang, Stephanie Stacy, Chuyu Wei, Adelpha Chan, Federico Rossano, Yixin Zhu, Tao Gao

[ICLR21Workshop] HALMA: Humanlike Abstraction Learning Meets Affordance in Rapid Problem Solving

Humans learn compositional and causal abstraction, i.e., knowledge, in response to the structure of naturalistic tasks. When presented …

Sirui Xie, Xiaojian Ma, Peiyu Yu, Yixin Zhu, Ying Nian Wu, Song-Chun Zhu

[ACL-Findings21] GRICE: A Grammar-based Dataset for Recovering Implicature and Conversational rEasoning

Understanding what we genuinely mean instead of what we literally say in conversations is challenging for both humans and machines; …

Zilong Zheng, Shuwen Qiu, Lifeng Fan, Yixin Zhu, Song-Chun Zhu

[CVPR21] Learning Triadic Belief Dynamics in Nonverbal Communication from Videos

Humans possess a unique social cognition capability; nonverbal communication can convey rich social information among agents. In …

Lifeng Fan, Shuwen Qiu, Zilong Zheng, Tao Gao, Song-Chun Zhu, Yixin Zhu

[CVPR21] ACRE: Abstract Causal Reasoning Beyond Covariation

Causal induction, i.e., identifying unobservable mechanisms that lead to the observable relations among variables, has played a pivotal …

Chi Zhang, Baoxiong Jia, Mark Edmonds, Song-Chun Zhu, Yixin Zhu

[CVPR21] Abstract Spatial-Temporal Reasoning via Probabilistic Abduction and Execution

Spatial-temporal reasoning is a challenging task in Artificial Intelligence (AI) due to its demanding but unique nature: a theoretic …

Chi Zhang, Baoxiong Jia, Song-Chun Zhu, Yixin Zhu

[ICRA21] Reconstructing Interactive 3D Scenes by Panoptic Mapping and CAD Model Alignments

In this paper, we rethink the problem of scene reconstruction from an embodied agent’s perspective: While the classic view …

Muzhi Han, Zeyu Zhang, Ziyuan Jiao, Xu Xie, Yixin Zhu, Song-Chun Zhu, Hangxin Liu

[ICRA21] Congestion-aware Multi-agent Trajectory Prediction for Collision Avoidance

Predicting agents’ future trajectories plays a crucial role in modern AI systems, yet it is challenging due to intricate …

Xu Xie, Chi Zhang, Yixin Zhu, Ying Nian Wu, Song-Chun Zhu

[IJNME21] Lagrangian‐Eulerian Multi‐Density Topology Optimization with the Material Point Method

In this paper, a hybrid Lagrangian‐Eulerian topology optimization (LETO) method is proposed to solve the elastic force equilibrium with …

Yue Li, Xuan Li, Minchen Li, Yixin Zhu, Bo Zhu, Chenfanfu Jiang

[ECCV20] LEMMA: A Multi-view Dataset for Learning Multi-agent Multi-task Activities

Understanding and interpreting human actions is a long-standing challenge and a critical indicator of perception in artificial …

Baoxiong Jia, Yixin Chen, Siyuan Huang, Yixin Zhu, Song-Chun Zhu

[IROS20] Human-Robot Interaction in a Shared Augmented Reality Workspace

We design and develop a new shared Augmented Reality (AR) workspace for Human-Robot Interaction (HRI), which establishes a …

Shuwen Qiu, Hangxin Liu, Zeyu Zhang, Yixin Zhu, Song-Chun Zhu

[IROS20] Graph-based Hierarchical Knowledge Representation for Robot Task Transfer from Virtual to Physical World

We study the hierarchical knowledge transfer problem using a cloth-folding task, wherein the agent is first given a set of human …

Zhenliang Zhang, Yixin Zhu, Song-Chun Zhu

[Engineering20] Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike Common Sense

Recent progress in deep learning is essentially based on a “big data for small tasks” paradigm, under which massive amounts …

Yixin Zhu, Tao Gao, Lifeng Fan, Siyuan Huang, Mark Edmonds, Hangxin Liu, Feng Gao, Chi Zhang, Siyuan Qi, Ying Nian Wu, Joshua B. Tenenbaum, Song-Chun Zhu

[SIGGRAPH20] IQ-MPM: An Interface Quadrature Material Point Method for Non-sticky Strongly Two-Way Coupled Nonlinear Solids and Fluids

We propose a novel scheme for simulating two-way coupled interactions between nonlinear elastic solids and incompressible fluids. The …

Yu Fang, Ziyin Qu, Minchen Li, Xinxin Zhang, Yixin Zhu, Mridul Aanjaneya, Chenfanfu Jiang

[ICRA20] Joint Inference of States, Robot Knowledge, and Human (False-)Beliefs

Aiming to understand how human (false-)belief—a core socio-cognitive ability—would affect human interactions with robots, …

Tao Yuan, Hangxin Liu, Lifeng Fan, Zilong Zheng, Tao Gao, Yixin Zhu, Song-Chun Zhu

[ICRA20] Congestion-aware Evacuation Routing using Augmented Reality Devices

We present a congestion-aware routing solution for indoor evacuation, which produces real-time individual-customized evacuation routes …

Zeyu Zhang, Hangxin Liu, Ziyuan Jiao, Yixin Zhu, Song-Chun Zhu

[AAAI20] Theory-based Causal Transfer: Integrating Instance-level Induction and Abstract-level Structure Learning

Learning transferable knowledge across similar but different settings is a fundamental component of generalized intelligence. In this …

Mark Edmonds, Xiaojian Ma, Siyuan Qi, Yixin Zhu, Hongjing Lu, Song-Chun Zhu

[AAAI20] Machine Number Sense: A Dataset of Visual Arithmetic Problems for Abstract and Relational Reasoning

As a comprehensive indicator of mathematical thinking and intelligence, the number sense (Dehaene 2011) bridges the induction of …

Wenhe Zhang, Chi Zhang, Yixin Zhu, Song-Chun Zhu

[NeurIPS19] PerspectiveNet: 3D Object Detection from a Single RGB Image via Perspective Points

Detecting 3D objects from a single RGB image is intrinsically ambiguous, thus requiring appropriate prior knowledge and intermediate …

Siyuan Huang, Yixin Chen, Tao Yuan, Siyuan Qi, Yixin Zhu, Song-Chun Zhu

[NeurIPS19] Learning Perceptual Inference by Contrasting

‘Thinking in pictures,’ [1] i.e., spatial-temporal reasoning, effortless and instantaneous for humans, is believed to be a …

Chi Zhang, Baoxiong Jia, Feng Gao, Yixin Zhu, Hongjing Lu, Song-Chun Zhu

[ICCV19] Holistic++ Scene Understanding: Single-view 3D Holistic Scene Parsing and Human Pose Estimation with Human-Object Interaction and Physical Commonsense

We propose a new 3D holistic++ scene understanding problem, which jointly tackles two tasks from a single-view image: (i) holistic …

Yixin Chen, Siyuan Huang, Tao Yuan, Yixin Zhu, Siyuan Qi, Song-Chun Zhu

[IROS19] Learning Virtual Grasp with Failed Demonstrations via Bayesian Inverse Reinforcement Learning

We propose Bayesian Inverse Reinforcement Learning with Failure (BIRLF), which makes use of failed demonstrations that were often …

Xu Xie, Changyang Li, Chi Zhang, Yixin Zhu, Song-Chun Zhu

[CogSci19] Decomposing Human Causal Learning: Bottom-up Associative Learning and Top-down Schema Reasoning

Transfer learning is fundamental for intelligence; agents expected to operate in novel and unfamiliar environments must be able to …

Mark Edmonds, Siyuan Qi, Yixin Zhu, James Kubricht, Song-Chun Zhu, Hongjing Lu

[TURC19] VRGym: A Virtual Testbed for Physical and Interactive AI

We propose VRGym, a virtual reality testbed for realistic human-robot interaction. Different from existing toolkits and virtual reality …

Xu Xie, Hangxin Liu, Zhenliang Zhang, Yuxing Qiu, Feng Gao, Siyuan Qi, Yixin Zhu, Song-Chun Zhu

[CVPR19] RAVEN: A Dataset for Relational and Analogical Visual Reasoning

Dramatic progress has been witnessed in basic vision tasks involving low-level perception, such as object recognition, detection, and …

Chi Zhang, Feng Gao, Baoxiong Jia, Yixin Zhu, Song-Chun Zhu

[ICRA19] Self-Supervised Incremental Learning for Sound Source Localization in Complex Indoor Environment

This paper presents an incremental learning framework for mobile robots localizing the human sound source using a microphone array in a …

Hangxin Liu, Zeyu Zhang, Yixin Zhu, Song-Chun Zhu

[ICRA19] High-Fidelity Grasping in Virtual Reality using a Glove-based System

This paper presents a design that jointly provides hand pose sensing, hand localization, and haptic feedback to facilitate real-time …

Hangxin Liu, Zhenliang Zhang, Xu Xie, Yixin Zhu, Yue Liu, Yongtian Wang, Song-Chun Zhu

[AAAI19] Mirroring without Overimitation: Learning Functionally Equivalent Manipulation Actions

This paper presents a mirroring approach, inspired by the neuroscience discovery of the mirror neurons, to transfer demonstrated …

Hangxin Liu, Chi Zhang, Yixin Zhu, Chenfanfu Jiang, Song-Chun Zhu

[AAAI19] MetaStyle: Three-Way Trade-Off Among Speed, Flexibility and Quality in Neural Style Transfer

An unprecedented boom has been witnessed in the research area of artistic style transfer ever since Gatys et al. introduced the …

Chi Zhang, Yixin Zhu, Song-Chun Zhu

[NeurIPS18] Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation

Holistic 3D indoor scene understanding refers to jointly recovering the i) object bounding boxes, ii) room layout, and iii) camera …

Siyuan Huang, Siyuan Qi, Yinxue Xiao, Yixin Zhu, Ying Nian Wu, Song-Chun Zhu

[ECCV18] Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image

We propose a computational framework to jointly parse a single RGB image and reconstruct a holistic 3D configuration composed of a set …

Siyuan Huang, Siyuan Qi, Yixin Zhu, Yinxue Xiao, Yuanlu Xu, Song-Chun Zhu

[IJCV18] Configurable 3D Scene Synthesis and 2D Image Rendering with Per-Pixel Ground Truth using Stochastic Grammars

We propose a systematic learning-based approach to the generation of massive quantities of synthetic 3D scenes and arbitrary numbers of …

Chenfanfu Jiang, Siyuan Qi, Yixin Zhu, Siyuan Huang, Jenny Lin, Lap-Fai Yu, Demetri Terzopoulos, Song-Chun Zhu

[CogSci18] Human Causal Transfer: Challenges for Deep Reinforcement Learning

Discovery and application of causal knowledge in novel problem contexts is a prime example of human intelligence. As new information is …

Mark Edmonds, James Kubricht, Colin Summers, Yixin Zhu, Brandon Rothrock, Song-Chun Zhu, Hongjing Lu

[SIGGRAPH18] A Moving Least Squares Material Point Method with Displacement Discontinuity and Two-Way Rigid Body Coupling

In this paper, we introduce the Moving Least Squares Material Point Method (MLS-MPM). MLS-MPM naturally leads to the formulation of …

Yuanming Hu, Yu Fang, Ziheng Ge, Ziyin Qu, Yixin Zhu, Andre Pradhana, Chenfanfu Jiang

[CVPR18] Human-centric Indoor Scene Synthesis Using Stochastic Grammar

We present a human-centric method to sample and synthesize 3D room layouts and 2D images thereof, for the purpose of obtaining …

Siyuan Qi, Yixin Zhu, Siyuan Huang, Chenfanfu Jiang, Song-Chun Zhu

[ICRA18] Unsupervised Learning of Hierarchical Models for Hand-Object Interactions

Contact forces of the hand are visually unobservable, but play a crucial role in understanding hand-object interactions. In this paper, …

Xu Xie, Hangxin Liu, Mark Edmonds, Feng Gao, Siyuan Qi, Yixin Zhu, Brandon Rothrock, Song-Chun Zhu

[ICRA18] Interactive Robot Knowledge Patching using Augmented Reality

We present a novel Augmented Reality (AR) approach, through Microsoft HoloLens, to address the challenging problems of diagnosing, …

Hangxin Liu, Yaofang Zhang, Wenwen Si, Xu Xie, Yixin Zhu, Song-Chun Zhu

[AAAI18] Tracking Occluded Objects and Recovering Incomplete Trajectories by Reasoning about Containment Relations and Human Actions

This paper studies a challenging problem of tracking severely occluded objects in long video sequences. The proposed method reasons …

Wei Liang, Yixin Zhu, Song-Chun Zhu

[IROS17] Feeling the Force: Integrating Force and Pose for Fluent Discovery through Imitation Learning to Open Medicine Bottles

Learning complex robot manipulation policies for real-world objects is challenging, often requiring significant tuning within …

Mark Edmonds, Feng Gao, Xu Xie, Hangxin Liu, Siyuan Qi, Yixin Zhu, Brandon Rothrock, Song-Chun Zhu

[IROS17] A Glove-based System for Studying Hand-Object Manipulation via Joint Pose and Force Sensing

We present a design of an easy-to-replicate glove-based system that can reliably perform simultaneous hand pose and force sensing in …

Hangxin Liu, Xu Xie, Mark Edmonds, Feng Gao, Yixin Zhu, Veronica Santos, Brandon Rothrock, Song-Chun Zhu

[CogSci17] Consistent Probabilistic Simulation Underlying Human Judgment in Substance Dynamics

A growing body of evidence supports the hypothesis that humans infer future states of perceived physical situations by propagating …

James Kubricht, Yixin Zhu, Chenfanfu Jiang, Demetri Terzopoulos, Song-Chun Zhu, Hongjing Lu

[TVCG17] The Martian: Examining Human Physical Judgments Across Virtual Gravity Fields

This paper examines how humans adapt to novel physical situations with unknown gravitational acceleration in immersive virtual …

Tian Ye, Siyuan Qi, James Kubricht, Yixin Zhu, Hongjing Lu, Song-Chun Zhu

[SIGGRAPHAsia16Workshop] A Virtual Reality Platform for Dynamic Human-Scene Interaction

Both synthetic static and simulated dynamic 3D scene data is highly useful in the fields of computer vision and robot task planning. …

Jenny Lin, Xingwen Guo, Jingyu Shao, Chenfanfu Jiang, Yixin Zhu, Song-Chun Zhu

[IJCAI16] What is Where: Inferring Containment Relations from Videos

In this paper, we present a probabilistic approach to explicitly infer containment relations between objects in 3D scenes. Given an …

Wei Liang, Yibiao Zhao, Yixin Zhu, Song-Chun Zhu

[CVPR16] Inferring Forces and Learning Human Utilities From Videos

We propose a notion of affordance that takes into account physical quantities generated when the human body interacts with real-world …

Yixin Zhu, Chenfanfu Jiang, Yibiao Zhao, Demetri Terzopoulos, Song-Chun Zhu

[CogSci16] Probabilistic Simulation Predicts Human Performance on Viscous Fluid-Pouring Problem

The physical behavior of moving fluids is highly complex, yet people are able to interact with them in their everyday lives with …

James Kubricht, Chenfanfu Jiang, Yixin Zhu, Song-Chun Zhu, Demetri Terzopoulos, Hongjing Lu

[CVPR15] Understanding Tools: Task-Oriented Object Modeling, Learning and Recognition

In this paper, we present a new framework for task-oriented object modeling, learning and recognition. The framework includes: i) …

Yixin Zhu, Yibiao Zhao, Song-Chun Zhu

[CogSci15] Evaluating Human Cognition of Containing Relations with Physical Simulation

Containers are ubiquitous in daily life. By container, we consider any physical object that can contain other objects, such as bowls, …

Wei Liang, Yibiao Zhao, Yixin Zhu, Song-Chun Zhu
