TY - GEN
T1 - LAPTNet: LiDAR-Aided Perspective Transform Network
T2 - 17th International Conference on Control, Automation, Robotics and Vision, ICARCV 2022
AU - Diaz-Zapata, Manuel
AU - Erkent, Ozgur
AU - Laugier, Christian
AU - Dibangoye, Jilles
AU - Sierra-Gonzalez, David
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
AB - Semantic grids are a useful representation of the environment around a robot. They can be used in autonomous vehicles to concisely represent the scene around the car, capturing vital information for downstream tasks like navigation or collision assessment. Information from different sensors can be used to generate these grids. Some methods rely only on RGB images, whereas others choose to incorporate information from other sensors, such as radar or LiDAR. In this paper, we present an architecture that fuses LiDAR and camera information to generate semantic grids. By using the 3D information from a LiDAR point cloud, the LiDAR-Aided Perspective Transform Network (LAPTNet) is able to associate features in the camera plane to the bird's eye view without having to predict any depth information about the scene. Compared to state-of-the-art camera-only methods, LAPTNet achieves an improvement of up to 8.8 points (or 38.13%) for the classes proposed in the nuScenes dataset validation split.
UR - https://www.scopus.com/pages/publications/85146749058
DO - 10.1109/ICARCV57592.2022.10004254
M3 - Conference contribution
AN - SCOPUS:85146749058
T3 - 2022 17th International Conference on Control, Automation, Robotics and Vision, ICARCV 2022
SP - 281
EP - 286
BT - 2022 17th International Conference on Control, Automation, Robotics and Vision, ICARCV 2022
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 11 December 2022 through 13 December 2022
ER -