Robotics & Multimodal Autonomy (RoMA) Lab at University College London

At UCL RoMA Lab, we scale foundation models into Visual–Language–Action (VLA) systems for robotics, transforming multimodal perception into intelligent, goal-directed behavior. Building on vision–language and world models, these systems enable perception, reasoning, and control in embodied settings. We advance embodied AI by tackling generalization across different sensors and tasks, computational efficiency on resource-constrained hardware, and trustworthy human–robot interaction, with the goal of enabling autonomous systems that operate reliably in complex, dynamic environments.

Our Research

We develop advanced foundation models for robotic systems, focusing on multimodal perception, planning, and control. Our research spans computer vision, machine learning, and robotic manipulation.

Our Team

We are building a dynamic team of researchers passionate about robotics and AI, especially Visual–Language–Action models for robotic applications. We welcome diverse expertise and perspectives.