Controlador MPC adaptativo por DQN orientado a la conducción autónoma

Oliveros Fernández, Jesús Ángel

dc.contributor.advisor	García Daza, Iván
dc.contributor.author	Oliveros Fernández, Jesús Ángel
dc.date.accessioned	2021-09-13T09:14:02Z
dc.date.available	2021-09-13T09:14:02Z
dc.date.issued	2021
dc.identifier.uri	http://hdl.handle.net/10017/49368
dc.description.abstract	El empleo de técnicas de aprendizaje por refuerzo para optimizar sistemas de conducción autónoma está en constante crecimiento en la actualidad. Los objetivos de este proyecto están basados en mejorar el comportamiento de un controlador MPC de trayectorias de un vehículo autónomo mediante el uso de técnicas de aprendizaje por refuerzo para estimar el valor óptimo de los pesos de la función de costes del MPC asociados al error de distancia lateral y de yaw. El proyecto se inicia desde una versión del controlador MPC en la que los pesos de error de distancia lateral y de yaw (wd y wy respectivamente) tomaban ambos un valor de 10000. Estos valores obtenidos experimentalmente no presentaban errores altos, aunque aun así se propuso encontrar los valores de pesos óptimos que consiguiesen mejorar el comportamiento del vehículo reduciendo en mayor parte ambos errores a lo largo de la trayectoria. Tras un estudio teórico de las técnicas de aprendizaje por refuerzo, se escogió el algoritmo Actor - Critic por sus similitudes de funcionamiento con las características del problema a resolver. Inicialmente se realizaron una serie de ensayos sobre el entorno de simulación CartPole para entender su funcionamiento y posteriormente implementarlo sobre el sistema del controlador MPC y analizar su comportamiento en el entorno de simulación de vehículos CARLA. El peso asociado al error de distancia lateral fue el primero en ser optimizado buscando así unos valores de este peso que consiguieran reducir lo máximo posible dicho error. Después, fue modificada la arquitectura del MPC y hubo que adaptar el algoritmo Actor - Critic a esta nueva versión para estimar unos nuevos valores de pesos óptimos de error de distancia lateral. 
Por último, de forma similar, en la versión final del controlador MPC se procedió a optimizar el error de yaw del vehículo de nuevo con el algoritmo Actor - Critic obteniendo un rango de valores de peso de error de yaw que consiguiesen reducir su error respecto a la versión original del MPC. En ambos casos se obtienen unos resultados bastante favorables que consiguen reducir significativamente los errores originales de distancia lateral y de yaw del MPC, además de ofrecer una respuesta mucho más suave del sistema, resultando así en una conducción más agradable y segura para los ocupantes del vehículo.	es_ES
dc.description.abstract	The use of reinforcement learning techniques to optimize autonomous driving systems is growing steadily. This project aims to improve the behavior of an MPC trajectory controller for an autonomous vehicle by using reinforcement learning to estimate the optimal values of the MPC cost-function weights associated with the lateral distance error and the yaw error. The project starts from a version of the MPC controller in which the lateral distance and yaw error weights (wd and wy, respectively) were both set to 10000. These experimentally obtained values did not produce large errors, but the goal was to find optimal weights that would improve the vehicle's behavior by further reducing both errors along the whole trajectory. After a theoretical study of reinforcement learning techniques, the Actor-Critic algorithm was chosen because the way it operates matches the characteristics of the problem to be solved. Initially, a series of trials was run on the CartPole simulation environment to understand how the algorithm works; it was then implemented on the MPC controller system and its behavior was analyzed in the CARLA vehicle simulation environment. The weight associated with the lateral distance error was optimized first, searching for values that reduce that error as much as possible. Afterwards, the architecture of the MPC controller was modified, and the Actor-Critic algorithm had to be adapted to this new version in order to estimate new optimal weights for the lateral distance error. Finally, in a similar way, the yaw error weight was optimized on the final version of the MPC controller, again with the Actor-Critic algorithm, yielding a range of yaw error weight values that reduce the yaw error relative to the original version of the MPC. 
In both cases the results are very favorable: they significantly reduce the original lateral distance and yaw errors of the MPC and give a much smoother system response, leading to a more comfortable and safer ride for the vehicle's occupants.	en
dc.format.mimetype	application/pdf	en
dc.language.iso	spa	en
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 Internacional	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	*
dc.subject	Controlador	es_ES
dc.subject	Optimización	es_ES
dc.subject	Error	es_ES
dc.subject	Lateral	es_ES
dc.subject	Controller	en
dc.subject	Optimization	en
dc.subject	Yaw	en
dc.title	Controlador MPC adaptativo por DQN orientado a la conducción autónoma	es_ES
dc.type	info:eu-repo/semantics/bachelorThesis	en
dc.subject.eciencia	Electrónica	es_ES
dc.subject.eciencia	Ingeniería industrial	es_ES
dc.subject.eciencia	Electronics	en
dc.subject.eciencia	Industrial engineering	en
dc.contributor.affiliation	Universidad de Alcalá. Escuela Politécnica Superior	es_ES
dc.type.version	info:eu-repo/semantics/acceptedVersion	en
dc.description.degree	Grado en Ingeniería en Electrónica y Automática Industrial	es_ES
dc.rights.accessRights	info:eu-repo/semantics/openAccess	en

Files in this item

Name: TFG_Oliveros_Fernandez_2021.pdf
Size: 6.777Mb
Format: PDF

This item appears in the following Collection(s)

TFG - Grado en Ingeniería en Electrónica y Automática Industrial [196]

Attribution-NonCommercial-NoDerivatives 4.0 International

This item is licensed under a Creative Commons license.