| Name | Description | Size | Format |
|---|---|---|---|
| | | 5.18 MB | Adobe PDF |
Authors
Advisor(s)
Abstract(s)
The computational complexity of Convolutional Neural Networks has increased enormously; hence numerous algorithmic optimization techniques have been proposed.
However, in such a complex design space, it is challenging to determine which optimization
benefits which type of hardware platform. This is why QuTiBench, a benchmarking
methodology, was recently proposed to bring clarity to the design space. With
measurements resulting in more than nine thousand data points, it became difficult to
extract useful information quickly and intuitively from the collected data.
This work therefore describes the creation of a web portal where all the data is exposed
and can be adequately visualized. All the code developed in this project resides in a
public GitHub repository, allowing contributions.
Visualizations that capture interest and keep attention on the message are an
effective way to understand the data and spot trends. Thus, several types of plots were
used: rooflines, heatmaps, line plots, bar plots and box-and-whisker plots.
Furthermore, since level-0 of QuTiBench performs a theoretical analysis of the data,
with no measurements required, its performance predictions were evaluated. We concluded
that the predictions successfully captured performance trends, although they are somewhat
optimistic and become inaccurate with increased pruning and quantization. The theoretical analysis could be improved by a better awareness of what
data is stored in on-chip and off-chip memory. Moreover, for the FPGAs, performance
predictions can be further enhanced by taking the actual resource utilization and the
achieved clock frequency of the FPGA circuit into account. With these improvements to
level-0 of QuTiBench, this benchmarking methodology can become more accurate in
future measurements, and therefore more reliable and useful to designers.
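As an illustration of the kind of theoretical, measurement-free prediction discussed above, the classic roofline model bounds attainable throughput by the lower of the peak compute rate and the product of memory bandwidth and arithmetic intensity. The sketch below is a minimal example of that general model, not the QuTiBench implementation; the function names and all hardware and workload numbers are hypothetical and chosen only for illustration.

```python
def arithmetic_intensity(total_ops, total_bytes_moved):
    """Operations performed per byte transferred to or from memory."""
    return total_ops / total_bytes_moved


def roofline_attainable(peak_ops_per_s, mem_bandwidth_bytes_per_s, intensity):
    """Upper bound on throughput (ops/s) under the classic roofline model."""
    return min(peak_ops_per_s, mem_bandwidth_bytes_per_s * intensity)


# Hypothetical platform: 4 TOp/s peak compute, 25 GB/s memory bandwidth.
PEAK_OPS = 4e12
MEM_BW = 25e9

# Hypothetical network needing 2 GOp per inference. Assumption: quantizing
# from 32-bit to 8-bit values cuts the bytes moved by a factor of four.
ai_fp32 = arithmetic_intensity(total_ops=2e9, total_bytes_moved=100e6)
ai_int8 = arithmetic_intensity(total_ops=2e9, total_bytes_moved=25e6)

pred_fp32 = roofline_attainable(PEAK_OPS, MEM_BW, ai_fp32)  # memory-bound
pred_int8 = roofline_attainable(PEAK_OPS, MEM_BW, ai_int8)  # memory-bound
pred_high = roofline_attainable(PEAK_OPS, MEM_BW, 200.0)    # compute-bound

print(f"fp32 bound: {pred_fp32:.3e} ops/s")
print(f"int8 bound: {pred_int8:.3e} ops/s")
print(f"high-intensity bound: {pred_high:.3e} ops/s")
```

Because such a bound ignores where data actually resides and, on FPGAs, the achieved clock frequency and resource utilization, it tends to be optimistic, which is consistent with the limitations noted above.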
Moreover, additional measurements were taken: in particular, power, performance and
accuracy measurements for Google's USB Accelerator benchmarking EfficientNet S, EfficientNet M and EfficientNet L. In general, performance measurements were
reproduced; however, it was not possible to reproduce the accuracy measurements.
Description
Keywords
Deep Learning, Field Programmable Gate Arrays, Graphics Processing Unit, Benchmarks, QuTiBench
