Design, implementation and evaluation of an acoustic source localization system using Deep Learning techniques

Vera Díaz, Juan Manuel

dc.contributor.advisor	Pizarro Pérez, Daniel
dc.contributor.author	Vera Díaz, Juan Manuel
dc.date.accessioned	2019-07-24T10:53:14Z
dc.date.available	2019-07-24T10:53:14Z
dc.date.issued	2019
dc.identifier.uri	http://hdl.handle.net/10017/38642
dc.description.abstract	This Master’s Thesis presents a novel approach to indoor acoustic source localization using microphone arrays, based on a Convolutional Neural Network (CNN) that we call ASLNet. It directly estimates the three-dimensional position of a single acoustic source, taking as input the raw audio signals from a set of microphones. We use supervised learning to train the network end-to-end. However, the amount of labeled training data available for this problem is small. This Thesis presents a two-step training strategy that mitigates this problem. We first train the network on semi-synthetic data generated from close-talk speech recordings and a mathematical model of signal propagation from the source to the microphones. The amount of semi-synthetic data can be virtually as large as needed. We then fine-tune the resulting network using a small amount of real data. Our experimental results, evaluated on a publicly available dataset recorded in a real room, show that this approach improves on existing localization methods based on SRP-PHAT strategies and also on those presented in very recent proposals based on Convolutional Recurrent Neural Networks (CRNN). In addition, our experiments show that the performance of ASLNet does not depend significantly on the speaker’s gender or on the size of the signal window used. This work also investigates methods to improve the generalization of the network using only semi-synthetic data for training. This is an important objective given the cost of labelling localization data. We proceed by adding specific effects to the input signals to force the network to be insensitive to the multipath, high noise and distortion likely to be present in real scenarios. We obtain promising results with this strategy, although they still lag behind those of strategies based on fine-tuning.	en
dc.format.mimetype	application/pdf	en
dc.language.iso	eng	en
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	en
dc.subject	Acoustic source localization	en
dc.subject	Microphone arrays	en
dc.subject	Deep Learning	en
dc.subject	CNN (Convolutional Neural Network)	en
dc.title	Design, implementation and evaluation of an acoustic source localization system using Deep Learning techniques	en
dc.type	info:eu-repo/semantics/masterThesis	en
dc.subject.eciencia	Telecomunicaciones	es_ES
dc.subject.eciencia	Telecommunication	en
dc.contributor.affiliation	Universidad de Alcalá. Escuela Politécnica Superior	es_ES
dc.type.version	info:eu-repo/semantics/acceptedVersion	en
dc.description.degree	Máster Universitario en Ingeniería de Telecomunicación (M125)	es_ES
dc.rights.accessRights	info:eu-repo/semantics/openAccess	en

Files in this item

Name: TFM_Vera_Diaz_2019.pdf
Size: 2.763 MB
Format: PDF
Description: Master’s Thesis (Trabajo Fin de Máster, TFM)

This item appears in the following Collection(s)

TFM - Máster Universitario en Ingeniería de Telecomunicación [40]

Attribution-NonCommercial-NoDerivatives 4.0 International

This item is subject to a Creative Commons license.