Publication

Gaussian Vector Quantization: Extracting Effective Speech Representations from Audio Data

Name: TCDMAA4356.pdf
Size: 5.64 MB
Format: Adobe PDF

Abstract

Differentiable vector quantization has become a prerequisite for the development of deep encoding models. Standard vector quantization methods are either only approximately differentiable or unable to fully capture the local distribution of their inputs. In this work we introduce the Gaussian vector quantization method, the first fully differentiable local vector quantization method. At its core, this method is composed of a Gaussian mixture modeling layer, which is able to learn a Gaussian mixture distribution over its input data. The proposed implementation has $\mathcal{O}(MN^2)$ computational complexity for the forward and backward passes, as opposed to the $\mathcal{O}(MN^3)$ time complexity associated with the naive implementation. We apply Gaussian vector quantization to audio encoding, and verify that this technique is able to generate more effective contextual representations of speech data than the standard vector quantization methods.
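
The contrast drawn in the abstract between hard codebook lookup and a Gaussian mixture over the inputs can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes diagonal covariances for brevity, and the function names (`log_gaussian_diag`, `responsibilities`, `soft_quantize`, `hard_quantize`) are hypothetical. The nearest-codeword lookup is piecewise constant in its input, whereas the mixture posterior varies smoothly with it.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian at point x (assumed form)."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def responsibilities(x, weights, means, variances):
    """Posterior probability of each mixture component given x (soft assignment)."""
    log_joint = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(log_joint)  # subtract the maximum for numerical stability
    unnorm = [math.exp(lj - peak) for lj in log_joint]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def soft_quantize(x, weights, means, variances):
    """Responsibility-weighted average of the component means."""
    r = responsibilities(x, weights, means, variances)
    return [sum(rk * m[d] for rk, m in zip(r, means)) for d in range(len(x))]

def hard_quantize(x, means):
    """Nearest-codeword lookup, the piecewise-constant step of standard VQ."""
    return min(means, key=lambda m: sum((xi - mi) ** 2 for xi, mi in zip(x, m)))

# Illustrative two-component mixture in two dimensions (made-up values).
weights = [0.5, 0.5]
means = [[0.0, 0.0], [2.0, 2.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

print(responsibilities([1.0, 1.0], weights, means, variances))
print(soft_quantize([1.0, 1.0], weights, means, variances))
print(hard_quantize([0.1, 0.0], means))
```

In this toy setting, a point midway between the two means is assigned equally to both components, and small changes to the input shift the responsibilities gradually instead of flipping a discrete index.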

Description

Dissertation presented as a partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science

Keywords

audio encoding; speech encoding; vector quantization; Gaussian mixture modeling layer; Gaussian vector quantization; SDG 9 - Industry, innovation and infrastructure
