| Name | Description | Size | Format |
|---|---|---|---|
| | | 5.64 MB | Adobe PDF |
Authors
Advisor(s)
Abstract(s)
Differentiable vector quantization has become a prerequisite for the development
of deep encoding models. Standard vector quantization methods are either only
approximately differentiable or unable to fully capture the local distribution of their
inputs. In this work we introduce the Gaussian vector quantization method, the
first fully differentiable local vector quantization method. At its core, this method is
composed of a Gaussian mixture modeling layer, which learns a Gaussian
mixture distribution over its input data. The proposed implementation has $O(MN^2)$
computational complexity for the forward and backward passes, as opposed to the
$O(MN^3)$ time complexity of the naive implementation. We apply Gaussian
vector quantization to audio encoding and verify that this technique generates
more effective contextual representations of speech data than standard
vector quantization methods.
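
As a rough illustration of how a Gaussian mixture layer can avoid cubic cost, the sketch below computes soft (differentiable) component assignments for one input vector. It is not the dissertation's implementation: it assumes each component stores a lower-triangular Cholesky factor of its precision matrix, a common parameterization that removes the need for a matrix inverse or determinant. The names `gaussian_log_density` and `soft_assignments`, and the symbols K (components) and D (input dimension), are hypothetical; the abstract does not define its own symbols here.

```python
import math


def gaussian_log_density(x, mean, chol_precision):
    """Log-density of N(x | mean, (L L^T)^-1), with L lower triangular.

    Assumption for illustration: the component is parameterized by the
    Cholesky factor L of its precision matrix, so the cost is O(D^2).
    """
    d = len(x)
    diff = [xi - mi for xi, mi in zip(x, mean)]
    # z = L^T diff, using only the lower triangle of L (O(D^2) work).
    z = [sum(chol_precision[i][j] * diff[i] for i in range(j, d))
         for j in range(d)]
    # 0.5 * log det(L L^T) equals the sum of the log-diagonal of L.
    half_log_det = sum(math.log(chol_precision[i][i]) for i in range(d))
    return (-0.5 * d * math.log(2.0 * math.pi)
            + half_log_det
            - 0.5 * sum(v * v for v in z))


def soft_assignments(x, log_weights, means, chol_precisions):
    """Posterior probability of each of the K components given x."""
    scores = [lw + gaussian_log_density(x, m, chol)
              for lw, m, chol in zip(log_weights, means, chol_precisions)]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Two 2-D components with identity precision and equal mixture weights.
identity = [[1.0, 0.0], [0.0, 1.0]]
probs = soft_assignments(
    x=[0.1, -0.2],
    log_weights=[math.log(0.5), math.log(0.5)],
    means=[[0.0, 0.0], [4.0, 4.0]],
    chol_precisions=[identity, identity],
)
```

Here `probs` puts almost all of its mass on the first component, whose mean is closest to the input. Because every step is a smooth function of the means, factors, and weights, an automatic differentiation framework could backpropagate through the same computation at a cost quadratic in D per component.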
Description
Dissertation presented as a partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science
Keywords
audio encoding; speech encoding; vector quantization; Gaussian mixture modeling layer; Gaussian vector quantization; SDG 9 - Industry, innovation and infrastructure
