Inside Plus: Anurag Paul, Staff Research Engineer
Anurag Paul joined Plus as a Sr. Research Engineer in 2020 after interning with the company during graduate school, and was promoted to Staff Research Engineer in 2023. He studied Industrial Engineering at the Indian Institute of Technology Roorkee and worked as a Sr. Data Scientist before completing his master’s degree in Machine Learning at UC San Diego.
What drew you to the autonomous vehicle industry?
I have always been driven to solve complex problems that can have a significant impact. Working in the autonomous vehicle industry gives one a chance to be at the forefront of technological innovation. It presents the unique challenge of building efficient and safe driving systems that can improve the lives of people all over the world.
Tell us about your role at Plus.
My role is focused on solving the perception problems of autonomous driving with deep learning models. This involves designing data pipelines, model architectures, and training processes, and finally optimizing the model for deployment on edge devices. It also means keeping abreast of the tremendous amount of research being produced by both academia and industry. I really enjoy digging into the details, figuring out why the model behaves the way it does, and tweaking it for better performance, accuracy, and reliability.
What challenges keep you passionate about your work with deep learning and its applications? What do you find most rewarding about solving these challenges?
One of the unique features of AI and deep learning is that the technological landscape changes every few months. Techniques that are popular today can become obsolete within a year or two. To stay ahead of the competition and maintain our technology leadership, we have to keep evolving and innovating. As a result, we have been working on several challenging problems.
One of the key projects that I have been working on is the 360-degree surround vision problem. We are using a multi-camera Bird’s-Eye-View (BEV) model for this, which is the centerpiece of our PlusVision product. This model is highly challenging to work on: it processes images from several cameras using convolutional neural networks (CNNs), projects the resulting features into BEV space, and processes them with transformers to detect lanes, track obstacles, and predict 3D occupancy for holistic scene understanding.
The complexity of this project ramps up because we are dealing with real-time systems, so keeping the model’s latency as low as possible is crucial. This involves pushing the boundaries of deep learning systems to strike a fine balance between high performance and low latency.
I think working on these cutting-edge problems is highly rewarding and motivates me to keep doing better.
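For readers unfamiliar with the step of projecting camera features into BEV space mentioned above, the sketch below shows one common geometric ingredient: mapping the centre of a ground-plane BEV grid cell to a pixel in a camera image, so that image features at that pixel can be associated with the cell. This is only an illustration under simplifying assumptions (an ideal pinhole camera, facing forward, level with the road, no lens distortion). The interview does not say how PlusVision performs this projection, and the function names and camera parameters here are hypothetical.

```python
def project_to_pixel(point_cam, fx, fy, cx, cy):
    """Pinhole projection of a 3D point given in the camera frame.

    Camera frame convention (an assumption of this sketch):
    x points right, y points down, z points forward, in metres.
    Returns (u, v) in pixels, or None if the point is behind the camera.
    """
    x, y, z = point_cam
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def bev_cell_to_pixel(forward_m, left_m, cam_height_m, fx, fy, cx, cy):
    """Map the centre of a ground-plane BEV cell to an image pixel.

    forward_m / left_m locate the cell relative to the camera;
    cam_height_m is the camera's height above the road surface.
    """
    # A point on the road is `cam_height_m` below the camera (positive y),
    # and "left" in BEV is negative x in the camera frame.
    return project_to_pixel((-left_m, cam_height_m, forward_m), fx, fy, cx, cy)


if __name__ == "__main__":
    # Hypothetical intrinsics for a 1280x720 image.
    intrinsics = dict(fx=1000.0, fy=1000.0, cx=640.0, cy=360.0)
    for forward, left in [(10.0, 0.0), (10.0, 1.0), (40.0, 0.0)]:
        pixel = bev_cell_to_pixel(forward, left, 2.0, **intrinsics)
        print(f"cell {forward} m ahead, {left} m left -> pixel {pixel}")
```

Cells farther ahead land closer to the horizon row of the image, which is one reason distant objects are harder to resolve in BEV.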
What has been your proudest moment at Plus?
Developing a unified system for multi-task training of obstacle models. In deep learning, we generally build a model to solve a single task, such as image classification or object detection. With the new framework, however, I was able to train one model to handle multiple tasks across multiple datasets. This allowed us to merge the capabilities of tens of models into a single model, making our operations more efficient.
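As a rough illustration of the general idea of training one model on multiple tasks and datasets, the sketch below shows two typical pieces: cycling through datasets so each contributes batches, and summing only the losses for tasks a given dataset actually labels. It is not the framework described above, whose details the interview does not give. The dataset names, task names, loss values, and weights are all hypothetical placeholders.

```python
from itertools import cycle

# Hypothetical datasets; each one has labels for only a subset of tasks.
DATASET_TASKS = {
    "highway_boxes": {"detection"},
    "lane_markings": {"lanes"},
    "mixed_urban": {"detection", "lanes"},
}

# Hypothetical weights that balance the tasks against each other.
TASK_WEIGHTS = {"detection": 1.0, "lanes": 0.5}


def combined_loss(per_task_loss, labeled_tasks, weights=TASK_WEIGHTS):
    """Weighted sum of the losses for tasks this batch has labels for.

    Heads for unlabeled tasks contribute nothing, so a batch from a
    lanes-only dataset does not penalise the detection head.
    """
    return sum(weights[task] * per_task_loss[task] for task in sorted(labeled_tasks))


def round_robin(dataset_names, steps):
    """Return the dataset to draw from at each step, taking turns."""
    source = cycle(dataset_names)
    return [next(source) for _ in range(steps)]


if __name__ == "__main__":
    # Stand-in for the per-head losses a shared model would produce.
    head_losses = {"detection": 2.0, "lanes": 4.0}
    for name in round_robin(list(DATASET_TASKS), steps=4):
        loss = combined_loss(head_losses, DATASET_TASKS[name])
        print(f"{name}: loss = {loss}")
```

In a real training loop the combined loss would be backpropagated through a shared backbone, which is what lets a single model replace many single-task ones.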
What has kept you at Plus for over four years?
There is a remarkable degree of freedom to innovate. If you can develop solutions that enhance the system’s performance, you get ample opportunities at Plus to explore further. Not every idea can succeed, so continuous, rapid experimentation is essential. Fortunately, we have a very good team to collaborate with. Innovation does not thrive in isolation.
How do you spend your downtime?
My wife and I like to travel and hike. I also play board games such as Catan and Splendor. At the office, I love to play table tennis. In fact, we hold an annual table tennis competition at the office, and I finished in second place last year.