Integrating state-of-the-art CNNs for multi-sensor 3D vehicle detection in real autonomous driving environments
Authors: Barea Navarro, Rafael; Bergasa Pascual, Luis Miguel; Romera Carmena, Eduardo; López Guillén, María Elena; Pérez Gil, Óscar; [et al.]
Identifiers: Permanent link (URI): http://hdl.handle.net/10017/45108
Ministerio de Economía y Competitividad
Comunidad de Madrid
Barea, R., Bergasa, L. M., Romera, E., López Guillén, E., Pérez, O., Tradacete, M. & López, J. 2019, "Integrating state-of-the-art CNNs for multi-sensor 3D vehicle detection in real autonomous driving environments", in 2019 IEEE Intelligent Transportation Systems Conference (ITSC), Auckland, New Zealand, 2019, pp. 1425-1431
Description / Notes
2019 IEEE Intelligent Transportation Systems Conference (ITSC), Auckland, New Zealand, 27-30 Oct. 2019.
info:eu-repo/grantAgreement/MINECO//TRA2015-70501-C2-1-R/ES/VEHICULO INTELIGENTE PARA PERSONAS MAYORES/
info:eu-repo/grantAgreement/MINECO//TRA2015-70501-C2-2-R/ES/SMARTELDERLYCAR. CONTROL Y PLANIFICACION DE RUTAS/
info:eu-repo/grantAgreement/CAM//P2018%2FNMT-4331/ES/Madrid Robotics Digital Innovation Hub/RoboCity2030-DIH-CM
Attribution-NonCommercial-NoDerivatives 4.0 International
© 2019 IEEE
This paper presents two new approaches to detect surrounding vehicles in 3D urban driving scenes and their corresponding Bird’s Eye View (BEV). The proposals integrate two state-of-the-art Convolutional Neural Networks (CNNs), YOLOv3 and Mask-RCNN, into a framework previously presented by the authors for 3D vehicle detection that fuses semantic image segmentation and the LIDAR point cloud. Our proposals take advantage of multimodal fusion, geometrical constraints, and pre-trained modules inside our framework. The methods have been tested using the KITTI object detection benchmark and a comparison is presented. Experiments show that the new approaches improve results with respect to the baseline and are on par with other competitive state-of-the-art proposals, while being the only ones that do not apply an end-to-end learning process. In this way, they remove the need to train on a specific dataset and show a good capability of generalization to any domain, a key point for self-driving systems. Finally, we have tested our best proposal from KITTI in our own driving environment, without any adaptation, obtaining results suitable for our autonomous driving application.
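The abstract describes fusing 2D CNN detections (YOLOv3 or Mask-RCNN) with the LIDAR point cloud under geometrical constraints to obtain 3D/BEV vehicle boxes; the exact pipeline is detailed in the paper itself. The snippet below is only a minimal illustrative sketch of that general camera-LIDAR fusion idea, assuming KITTI-style calibration matrices; the function names, thresholds, and the axis-aligned BEV footprint are hypothetical simplifications, not the authors' implementation.

```python
import numpy as np

def project_lidar_to_image(points_xyz, P, Tr_velo_to_cam):
    """Project LIDAR points (N x 3, LIDAR frame) into image pixels.

    P is the 3x4 camera projection matrix and Tr_velo_to_cam the 4x4
    LIDAR-to-camera transform (KITTI-style calibration assumed).
    Returns pixel coordinates (N x 2) and a mask of points in front of the camera.
    """
    pts_h = np.hstack([points_xyz, np.ones((points_xyz.shape[0], 1))])   # homogeneous coords
    cam = (Tr_velo_to_cam @ pts_h.T).T                                   # points in camera frame
    in_front = cam[:, 2] > 0.1                                           # discard points behind the camera
    img = (P @ np.hstack([cam[:, :3], np.ones((cam.shape[0], 1))]).T).T
    uv = img[:, :2] / img[:, 2:3]                                        # perspective division
    return uv, in_front

def vehicle_bev_footprint(points_xyz, uv, in_front, box2d, min_points=10):
    """Keep LIDAR points that project inside one 2D vehicle detection
    (u1, v1, u2, v2) and fit a simple axis-aligned footprint on the ground
    plane as a BEV proxy (hypothetical simplification of the 3D box fitting)."""
    u1, v1, u2, v2 = box2d
    inside = (in_front
              & (uv[:, 0] >= u1) & (uv[:, 0] <= u2)
              & (uv[:, 1] >= v1) & (uv[:, 1] <= v2))
    obj = points_xyz[inside]
    if obj.shape[0] < min_points:          # not enough 3D evidence for this detection
        return None
    x_min, y_min = obj[:, :2].min(axis=0)
    x_max, y_max = obj[:, :2].max(axis=0)
    return (x_min, y_min, x_max, y_max)    # footprint in the LIDAR ground plane
```

In this sketch the 2D box would come from a pre-trained YOLOv3 or Mask-RCNN model, which is why no dataset-specific 3D training is needed; the paper's actual method additionally exploits segmentation masks and geometrical constraints when fitting the 3D boxes.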