Research

Our Research Interests

Our research is driven by advanced Computer Vision, Machine Learning, and Robot Vision, with the goal of equipping machines to understand complex real-world environments. It spans 3D Reconstruction for precise geometric scene understanding, Social Kinematics for modeling human-agent dynamics, and LLMs & Creative AI for generating physics-based content. Our group's research achievements in each area are listed below:

Autonomous Driving

  • Trajectory Prediction & Reasoning
  • Motion & Behavior Generation
  • Robotics & Physical Simulation

Related Works:
AAAI'21, CVPR'22, ECCV'22, AAAI'23 (Oral), ICCV'23, CVPR'24(1), CVPR'24(2), CVPR'25, IEEE TPAMI'26

Generative AI

  • Video & Image Content Creation
  • Sign Language Generation
  • TEM / Medical Image Analysis

Related Works:
CVPR'24(2), ECCV'24, ECCV'26 (Under Review)

3D Reconstruction

  • 2D, 3D, 4D Scene Reconstruction
  • Depth Estimation & Completion

Related Works:
ICML'23, ICLR'24, NeurIPS'24, ICCV'25 (Highlight), ECCV'26 (Under Review)

Large Language Models

  • Vision-Language Alignment
  • Reasoning via VLMs
  • Vision-Language-Action

Related Works:
CVPR'24(2), IEEE TPAMI'26, ECCV'26 (Under Review)

Machine Learning for CV

  • Machine Learning Theory
  • Feature Encoding & Latent Mapping
  • Mathematical Formulation

Related Works:
CVPR'22, ICML'23, ICCV'23

  • Generative AI and Diffusion Model-based Dynamic Video Generation and Editing
  • 3D Scene Understanding and Geometry-based Multi-view Video Synthesis (Gaussian Splatting)
  • Multi-agent Dynamic Behavior and Interaction Reasoning based on LLMs and VLA Models
  • Physics-based Generative Models and Embodied AI