ERFNet: efficient residual factorized ConvNet for real-time semantic segmentation
Authors
Romera Carmena, Eduardo; Álvarez López, José Mª; Bergasa Pascual, Luis Miguel; Arroyo Contera, Roberto
Identifiers
Permanent link (URI): http://hdl.handle.net/10017/43227
DOI: 10.1109/TITS.2017.2750080
ISSN: 1524-9050
Publisher
IEEE
Publication date
2018-01
Sponsors
Ministerio de Economía y Competitividad
Comunidad de Madrid
Bibliographic citation
Romera, E., Álvarez, J.M., Bergasa, L.M. & Arroyo, R. 2018, "ERFNet: efficient residual factorized convNet for real-time semantic segmentation", IEEE Transactions on Intelligent Transportation Systems, vol. 19, no. 1, pp. 263-272
Keywords
Intelligent vehicles
Scene understanding
Realtime
Semantic segmentation
Deep Learning
Residual layers
Projects
info:eu-repo/grantAgreement/MINECO//TRA2015-70501-C2-1-R/ES/VEHICULO INTELIGENTE PARA PERSONAS MAYORES/
info:eu-repo/grantAgreement/CAM//S2013%2FMIT-2748/ES/ROBOTICA APLICADA A LA MEJORA DE LA CALIDAD DE VIDA DE LOS CIUDADANOS, FASE III/RoboCity2030-III-CM
Document type
info:eu-repo/semantics/article
Version
info:eu-repo/semantics/acceptedVersion
Publisher's version
https://doi.org/10.1109/TITS.2017.2750080
Rights
Attribution-NonCommercial-NoDerivatives 4.0 International
© 2018 IEEE
Access rights
info:eu-repo/semantics/openAccess
Abstract
Semantic segmentation is a challenging task that addresses most of the perception needs of intelligent vehicles (IVs) in a unified way. Deep neural networks excel at this task, as they can be trained end-to-end to accurately classify multiple object categories in an image at pixel level. However, a good tradeoff between high quality and computational resources is not yet present in the state-of-the-art semantic segmentation approaches, limiting their application in real vehicles. In this paper, we propose a deep architecture that is able to run in real time while providing accurate semantic segmentation. The core of our architecture is a novel layer that uses residual connections and factorized convolutions in order to remain efficient while retaining remarkable accuracy. Our approach is able to run at over 83 FPS on a single Titan X, and 7 FPS on a Jetson TX1 (embedded device). A comprehensive set of experiments on the publicly available Cityscapes data set demonstrates that our system achieves an accuracy that is similar to the state of the art, while being orders of magnitude faster to compute than other architectures that achieve top precision. The resulting tradeoff makes our model an ideal approach for scene understanding in IV applications. The code is publicly available at: https://github.com/Eromera/erfnet.
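The abstract does not spell out how the layer is built, so the following is only a generic sketch of why factorizing a convolution saves computation, not the authors' implementation. It assumes a 3x3 convolution is replaced by a 3x1 convolution followed by a 1x3 convolution with the same number of input and output channels; the helper `conv_weights` and the channel count of 64 are hypothetical and chosen purely for illustration.

```python
def conv_weights(kernel_h, kernel_w, channels):
    """Weights in one 2D convolution with equal input and output channels (bias ignored)."""
    return kernel_h * kernel_w * channels * channels


channels = 64  # hypothetical layer width, not taken from the paper

# One full 3x3 convolution.
full_3x3 = conv_weights(3, 3, channels)

# Assumed factorization: a 3x1 convolution followed by a 1x3 convolution.
factorized = conv_weights(3, 1, channels) + conv_weights(1, 3, channels)

# Fraction of weights saved by the factorized pair.
saving = 1 - factorized / full_3x3
print(f"3x3: {full_3x3}, 3x1 + 1x3: {factorized}, saving: {saving:.0%}")
```

Under these assumptions the factorized pair needs 6 weights per channel pair instead of 9, a reduction of one third that does not depend on the channel count.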
Files in this item
Files | Size | Format |
---|---|---|
ERFNet_Romera_IEEE_T_Intell_Tr ... | 2.806Mb | |
Collections
- ELECTRON - Articles [242]
- ROBESAFE - Articles [37]