I am excited about research that advances the perception and control of mobile robots. In particular, I am currently working on improving the robustness and accuracy of computer vision algorithms, leveraging geometry for self-supervised learning, and developing end-to-end systems that can reason from perception to control.

Reasoning from Perception to Action

The computer vision systems I develop are primarily motivated by extracting representations that can be used to make decisions or take actions. I'm interested in learning representations to understand scenes and control the behaviour of real-world robots. For example, we designed a deep reinforcement learning agent that learned to drive a car. We also showed, for the first time, a self-driving car that learned to drive in simulation.
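To make the reward-driven framing concrete, here is a minimal sketch of one driving episode, where the agent's return is the distance it covers before a safety intervention ends the episode. Everything here (`toy_policy`, `toy_env`, the one-metre-per-step dynamics) is a hypothetical stand-in, not our actual system:

```python
import random

def drive_episode(policy, env_step, max_steps=1000, seed=0):
    """One episode of reward-driven driving: act until a
    (hypothetical) safety intervention, and return the distance
    covered before that happens."""
    rng = random.Random(seed)
    distance = 0.0
    state = 0.0
    for _ in range(max_steps):
        action = policy(state)
        state, metres, intervened = env_step(state, action, rng)
        distance += metres
        if intervened:
            break
    return distance

# Toy stand-ins: the policy steers back toward the lane centre
# (state 0), and the environment adds a little noise per step.
def toy_policy(state):
    return -state

def toy_env(state, action, rng):
    new_state = state + action + rng.uniform(-0.1, 0.1)
    intervened = abs(new_state) > 1.0   # safety driver takes over
    return new_state, 1.0, intervened   # one metre per step

d = drive_episode(toy_policy, toy_env, max_steps=50)
```

A real agent would learn `policy` from this reward signal; the sketch only shows how the episode structure turns driving into a reinforcement learning problem.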


Scene Understanding

Scene understanding is the computer vision task of generating a representation of a scene that can be used to evaluate decisions or actions. This typically requires understanding information such as the scene's semantics, geometry and motion. Initially, I worked on a semantic segmentation algorithm called SegNet. More recently, I have been interested in learning a representation from a multitask deep learning architecture.
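One way to train a multitask architecture is to weigh each task's loss by a learned uncertainty, so noisier tasks are automatically down-weighted. A minimal sketch of that idea, assuming the simple formulation L = Σᵢ exp(−sᵢ)·Lᵢ + sᵢ with sᵢ = log σᵢ² per task (the exact form varies by loss type):

```python
import math

def multitask_loss(task_losses, log_vars):
    """Combine per-task losses weighted by learned homoscedastic
    uncertainty: exp(-s) down-weights the task, while the +s term
    stops the model from setting every uncertainty to infinity."""
    total = 0.0
    for loss, s in zip(task_losses, log_vars):
        total += math.exp(-s) * loss + s
    return total

# Two hypothetical tasks: segmentation loss 0.8, depth loss 2.0.
# The larger log-variance on depth reduces its influence.
loss = multitask_loss([0.8, 2.0], [0.0, 1.0])
```

In a real network the `log_vars` would be learnable parameters optimised jointly with the weights.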


Bayesian Deep Learning

Deep learning is great for achieving state-of-the-art results; however, these models cannot understand what they don't know. Bayesian deep learning (BDL) is a promising framework for understanding our model's uncertainty. This paper is an introduction to Bayesian deep learning for computer vision. I have also found BDL useful for localisation, scene understanding and autonomous driving.
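A common practical approximation to this uncertainty is Monte Carlo dropout: keep dropout active at test time, run several stochastic forward passes, and read uncertainty off the spread of the predictions. A minimal sketch with a toy model (the model and all names are hypothetical):

```python
import random
import statistics

def mc_dropout_predict(forward, x, passes=50, seed=0):
    """Run `passes` stochastic forward passes of a model whose
    dropout stays on at test time; the mean is the prediction and
    the variance approximates the model's epistemic uncertainty."""
    rng = random.Random(seed)
    samples = [forward(x, rng) for _ in range(passes)]
    return statistics.fmean(samples), statistics.pvariance(samples)

# A toy "network": a linear map whose hidden unit is dropped
# with probability 0.5, standing in for a real deep model.
def toy_model(x, rng, p_drop=0.5):
    hidden = 2.0 * x
    if rng.random() < p_drop:
        hidden = 0.0          # dropout mask applied at test time
    return hidden + 1.0

mean, var = mc_dropout_predict(toy_model, 3.0, passes=200)
```

Here a large `var` signals that the sampled sub-networks disagree, which is exactly the "knowing what the model doesn't know" behaviour plain deep learning lacks.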

Figure: Bayesian deep learning for semantic segmentation. From left to right: input image, semantic segmentation and model uncertainty.


PoseNet was the first end-to-end deep learning algorithm for relocalisation: estimating the position and orientation of the camera from an image taken within a previously explored area. It works across large outdoor urban environments and inside buildings, and it takes only 5ms to do this from a single colour image; here is a demo.
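Relocalisation of this kind can be trained as a regression problem: a Euclidean loss on the camera position plus a weighted loss on the orientation quaternion, with a scale factor balancing the two units. A minimal sketch of such a loss (the value `beta=500` is illustrative, not a recommendation):

```python
import math

def pose_loss(pred_xyz, pred_q, true_xyz, true_q, beta=500.0):
    """Joint position/orientation regression loss: Euclidean error
    on the 3D position plus beta times the Euclidean error between
    unit quaternions representing orientation."""
    pos_err = math.dist(pred_xyz, true_xyz)
    # Normalise the predicted quaternion before comparing, since
    # the network's raw output need not have unit length.
    norm = math.sqrt(sum(c * c for c in pred_q))
    q = [c / norm for c in pred_q]
    quat_err = math.sqrt(sum((a - b) ** 2 for a, b in zip(q, true_q)))
    return pos_err + beta * quat_err

# Hypothetical example: 5 m position error, perfect orientation.
loss = pose_loss((0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0),
                 (3.0, 4.0, 0.0), (1.0, 0.0, 0.0, 0.0))
```

Training a convolutional network against a loss of this shape is what lets a single forward pass produce a full 6-DoF pose.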

Check out some 3D reconstructions of King’s College and central Cambridge in your web browser.


Other Projects

Some more details of other projects, including an autonomous drone and augmented reality, can be found here.