MIT develops more efficient lidar sensing for self-driving cars
June 11, 2021
If you see a self-driving car out in the wild, you might notice a giant spinning
cylinder on top of its roof. That’s a lidar sensor, and it works by sending out
pulses of infrared light and measuring the time it takes for them to bounce off
objects. This creates a map of 3D points that serves as a snapshot of the car’s surroundings.
One downside of lidar is that its 3D data is immense and computationally
intensive. A typical 64-channel sensor, for example, produces more than 2
million points per second. Due to the additional spatial dimension, the
state-of-the-art 3D models require 14x more computation at inference time than their 2D image counterparts. This means that, in order to navigate effectively, engineers typically first have to collapse the data into 2D, a step that introduces significant information loss.
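To make that trade-off concrete, here is a minimal sketch (not the team’s pipeline; the function name and grid parameters are purely illustrative) of how a 3D point cloud is commonly flattened into a 2D bird’s-eye-view grid, and where the height information disappears:

    # Illustrative only: flatten a lidar point cloud (an N x 3 array of x, y, z
    # coordinates in meters) into a 2D bird's-eye-view occupancy grid.
    import numpy as np

    def to_birds_eye_view(points, grid_size=200, cell_m=0.5):
        """Project 3D points onto a 2D occupancy grid centered on the car."""
        half = grid_size * cell_m / 2.0
        # Keep only points inside the square region around the vehicle.
        mask = (np.abs(points[:, 0]) < half) & (np.abs(points[:, 1]) < half)
        xy = points[mask, :2]
        # Convert metric coordinates to integer cell indices.
        cols = ((xy[:, 0] + half) / cell_m).astype(int)
        rows = ((xy[:, 1] + half) / cell_m).astype(int)
        grid = np.zeros((grid_size, grid_size), dtype=np.uint8)
        grid[rows, cols] = 1  # mark the cell occupied; all height detail is lost here
        return grid

    # Example: a single sweep with ~100,000 synthetic points.
    sweep = np.random.uniform(-50.0, 50.0, size=(100_000, 3))
    bev = to_birds_eye_view(sweep)

Downstream 2D models can process a grid like this cheaply, but anything encoded along the vertical axis, such as overhanging structures or object height, is gone.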
But a team from MIT has been working on a self-driving system that uses machine
learning so that custom hand-tuning isn’t needed. Their new end-to-end framework
can navigate autonomously using only raw 3D point cloud data and low-resolution
GPS maps, similar to those available on smartphones today.
End-to-end learning from raw lidar data is a computationally intensive process,
since it involves giving the computer huge amounts of rich sensory information
for learning how to steer. Because of this, the team had to design new deep learning components that leverage modern GPU hardware more efficiently in order to control the vehicle in real time.
“We’ve optimized our solution from both algorithm and system perspectives,
achieving a cumulative speedup of roughly 9x compared to existing 3D lidar
approaches,” says PhD student Zhijian Liu, who was the co-lead author on this
paper alongside Alexander Amini.
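One common way such speedups are obtained in 3D deep learning (a generic illustration, not the team’s actual kernels) is to exploit the sparsity of lidar data: points are quantized into voxels and only the occupied voxels are processed, so later layers scale with occupied voxels rather than raw points. A rough sketch:

    # Illustrative only: quantize points into voxels and keep the occupied ones,
    # so downstream processing scales with occupied voxels rather than raw points.
    import numpy as np

    def sparse_voxelize(points, voxel_m=0.2):
        """Return unique occupied voxel coordinates and per-voxel point counts."""
        coords = np.floor(points / voxel_m).astype(np.int32)
        voxels, counts = np.unique(coords, axis=0, return_counts=True)
        return voxels, counts

    # Real lidar returns cluster on surfaces, so simulate points on a ground plane.
    n = 2_000_000  # roughly one second of data from a 64-channel sensor
    pts = np.column_stack([np.random.uniform(-50, 50, size=(n, 2)), np.zeros(n)])
    voxels, counts = sparse_voxelize(pts)
    print(f"{n} points -> {len(voxels)} occupied voxels")

Sparse-convolution libraries operate directly on such lists of occupied voxels, avoiding the cost of a dense 3D grid.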
In tests, the researchers showed that their system reduced how often a human driver had to take over control from the machine, and that it could even withstand severe sensor failures.
For example, picture yourself driving through a tunnel and then emerging into the sunlight: for a split second, your eyes will likely have trouble seeing because of the glare. A similar problem arises with the cameras in self-driving cars, as well as with the systems’ lidar sensors when weather conditions are poor.
To handle this, the MIT team’s system can estimate how certain it is about any
given prediction, and can therefore give more or less weight to that prediction
in making its decisions. (In the case of emerging from a tunnel, it would
essentially disregard any prediction that should not be trusted due to
inaccurate sensor data.)
The team calls their approach “hybrid evidential fusion,” because it fuses the
different control predictions together to arrive at its motion-planning choices.
“By fusing the control predictions according to the model’s uncertainty, the
system can adapt to unexpected events,” says MIT professor Daniela Rus, one of
the senior authors on the paper.
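As a rough sketch of the weighting idea only (a generic inverse-variance scheme with made-up numbers, not the paper’s exact evidential formulation), each prediction can be weighted by the inverse of its estimated uncertainty before the results are combined:

    # Illustrative only: fuse several steering predictions by weighting each one
    # by the inverse of its estimated variance, so uncertain sources count less.
    import numpy as np

    def fuse_controls(predictions, variances, eps=1e-6):
        """Combine per-source steering angles using inverse-variance weights."""
        w = 1.0 / (np.asarray(variances, dtype=float) + eps)
        w /= w.sum()
        return float(np.dot(w, predictions))

    # Hypothetical outputs as (steering angle in radians, estimated variance).
    lidar_pred, lidar_var = 0.05, 0.01    # confident prediction
    camera_pred, camera_var = 0.40, 5.0   # blinded by glare, very uncertain
    fused = fuse_controls([lidar_pred, camera_pred], [lidar_var, camera_var])
    # fused stays close to 0.05: the glare-affected source is largely ignored.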
In many respects, the system itself is a fusion of three previous MIT projects:
- MapLite, a hand-tuned framework for driving without high-definition 3D maps
- “variational end-to-end navigation,” a machine learning system trained on human driving data to learn how to navigate from scratch
- SPVNAS, an efficient 3D deep learning solution that optimizes the neural architecture and the inference library
“We’ve taken the benefits of a mapless driving approach and combined it with end-to-end machine learning so that we don’t need expert programmers to tune the system by hand,” says Amini.
As a next step, the team plans to continue scaling the system to handle increasing real-world complexity, including adverse weather conditions and dynamic interactions with other vehicles.
Liu and Amini co-wrote the new paper with MIT professors Song Han and Daniela
Rus. Their other co-authors include research assistant Sibo Zhu and associate
professor Sertac Karaman. The paper will be presented later this month at the
International Conference on Robotics and Automation (ICRA).